A Comprehensive Fault Model for Concurrent Error Detection in MOS Circuits by Halperin, Daniel Lee
REPORT CSG-25 DECEMBER 1983
COORDINATED SCIENCE LABORATORY
COMPUTER SYSTEMS GROUP
A COMPREHENSIVE FAULT MODEL FOR CONCURRENT ERROR DETECTION IN MOS CIRCUITS
DANIEL LEE HALPERIN
APPROVED FOR PUBLIC RELEASE. DISTRIBUTION UNLIMITED.
UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
Unclassified
SECURITY CLASSIFICATION OF THIS PAGE
REPORT DOCUMENTATION PAGE
1a. Report Security Classification: Unclassified
1b. Restrictive Markings: None
2a. Security Classification Authority: N/A
2b. Declassification/Downgrading Schedule: N/A
3. Distribution/Availability of Report: Approved for public release, distribution unlimited
4. Performing Organization Report Number(s): CSG-25
5. Monitoring Organization Report Number(s): N/A
6a. Name of Performing Organization: Coordinated Science Lab., University of Illinois
6b. Office Symbol (if applicable): N/A
6c. Address (City, State and ZIP Code): 1101 W. Springfield Avenue, Urbana, Illinois 61801
7a. Name of Monitoring Organization: Office of Naval Research
7b. Address (City, State and ZIP Code): 2511 Jefferson Davis Highway, Arlington, Virginia 22202
8a. Name of Funding/Sponsoring Organization: Naval Electronics Syst. Comm.
8b. Office Symbol (if applicable): N/A
8c. Address (City, State and ZIP Code): 2511 Jefferson Davis Highway, Arlington, Virginia 22202
9. Procurement Instrument Identification Number: N00039-80-C-0556
10. Source of Funding Nos.: Program Element No. N/A; Project No. N/A; Task No. N/A; Work Unit No. N/A
11. Title (Include Security Classification): A COMPREHENSIVE FAULT MODEL FOR CONCURRENT ERROR DECT. IN MOS CIRCUITS
12. Personal Author(s): Daniel Lee Halperin
13a. Type of Report: Technical
13b. Time Covered: From (blank) To (blank)
14. Date of Report (Yr., Mo., Day): December 10, 1983
15. Page Count: 206
16. Supplementary Notation: N/A
17. COSATI Codes (Field, Group, Sub. Gr.): (blank)
18. Subject Terms (Continue on reverse if necessary and identify by block number): Concurrent Error Detection, Fault Models, Indeterminate Faults, MOS Circuits, Physical Failure Modes, Separable Codes, Ternary Algebra, Totally Self-Checking Circuits
19. Abstract (Continue on reverse if necessary and identify by block number):
A comprehensive fault model is developed for concurrent error detection in MOS
integrated circuits. This fault model is based on a thorough examination of physical
failures in MOS integrated circuits. Models of MOS circuits are also developed which
are used to determine the behavior of these circuits under failure. It is found from
this analysis that many types of physical failures may result in logic signals that are
not well-defined. In particular, it is shown that physical failures may lead to constant
values that are neither logic 0 nor logic 1, timing failures, or oscillation. The concept
of indeterminate faults is developed to describe the behavior of such failures. It is
shown that most traditional fault models are unable to model the behavior of a circuit
with an indeterminate fault correctly.
Ternary algebra is used to facilitate the analysis of circuits which receive
indeterminate value inputs. Using ternary algebra, necessary conditions are developed
for the propagation of indeterminate values through a circuit. It is shown that in (over)
20. Distribution/Availability of Abstract: Unclassified/Unlimited [X]  Same as Rpt. [ ]  DTIC Users [ ]
21. Abstract Security Classification: Unclassified
22a. Name of Responsible Individual: (blank)
22b. Telephone Number (Include Area Code): (blank)
22c. Office Symbol: NONE
FORM 1473, 82 APR. EDITION OF 1 JAN 73 IS OBSOLETE.
Unclassified
Unclassified
SECURITY CLASSIFICATION OF THIS PAGE
many cases, an indeterminate value can propagate through a circuit even when a Boolean value
cannot propagate.
The methodology of totally self-checking systems is used to provide concurrent error
detection. It is shown that the traditional definitions of the totally self-checking
property are inappropriate for failures which include indeterminate faults. A new definition
of the totally self-checking property is developed which is compatible with indeterminate
faults. It is shown that under our fault models, duplication may be used to provide a
totally self-checking implementation for any function. Procedures are developed to determine
if a function has an implementation using a separable code which may provide concurrent
error detection at a lower cost than duplication. Issues involved in the interconnection
of several totally self-checking circuits are considered, as well as the requirements for
checkers in systems which may experience indeterminate failures.
A COMPREHENSIVE FAULT MODEL FOR
CONCURRENT ERROR DETECTION IN MOS CIRCUITS
BY
DANIEL LEE HALPERIN
B.S., University of Tennessee, 1978
M.S., University of Illinois, 1981
THESIS
Submitted in partial fulfillment of the requirements
for the degree of Doctor of Philosophy in Electrical Engineering
in the Graduate College of the
University of Illinois at Urbana-Champaign, 1984
Urbana, Illinois
A COMPREHENSIVE FAULT MODEL FOR
CONCURRENT ERROR DETECTION IN MOS CIRCUITS
Daniel Lee Halperin, Ph.D.
Department of Electrical Engineering 
University of Illinois at Urbana-Champaign, 1984
A comprehensive fault model is developed for concurrent error 
detection in MOS integrated circuits. This fault model is based on a 
thorough examination of physical failures in MOS integrated circuits. 
Models of MOS circuits are also developed which are used to determine 
the behavior of these circuits under failure. It is found from this 
analysis that many types of physical failures may result in logic signals
that are not well-defined. In particular, it is shown that physical
failures may lead to constant values that are neither logic 0 nor
logic 1, timing failures, or oscillation. The concept of indeterminate
faults is developed to describe the behavior of such failures. It is 
shown that most traditional fault models are unable to model the 
behavior of a circuit with an indeterminate fault correctly.
Ternary algebra is used to facilitate the analysis of circuits 
which receive indeterminate value inputs. Using ternary algebra, necessary
conditions are developed for the propagation of indeterminate
values through a circuit. It is shown that in many cases, an indeterminate
value can propagate through a circuit even when a Boolean value
cannot propagate.
The methodology of totally self-checking systems is used to provide
concurrent error detection. It is shown that the traditional definitions
of the totally self-checking property are inappropriate for
failures which include indeterminate faults. A new definition of the
totally self-checking property is developed which is compatible with
indeterminate faults. It is shown that under our fault models, duplication
may be used to provide a totally self-checking implementation for
any function. Procedures are developed to determine if a function has
an implementation using a separable code which may provide concurrent
error detection at a lower cost than duplication. Issues involved in
the interconnection of several totally self-checking circuits are considered,
as well as the requirements for checkers in systems which may
experience indeterminate failures.
Acknowledgment
The author wishes to express deepest thanks to his advisor, Dr. Ed
Davidson, for his support, encouragement, and advice throughout this
thesis project. He is deeply appreciative of the many long hours Dr.
Davidson spent assisting with this thesis. The author would also like 
to thank Dr. Janak Patel, Dr. Jacob Abraham, and Dr. Dan Gajski for 
serving on his final committee. Finally, the author would like to thank 
his colleagues in the Computer Systems Group for their support, ideas, 
and friendship.
The author would like to dedicate this thesis to his parents, 
Joseph and Sita Halperin, and to his wife, Beverly. The support, sacrifices,
and understanding provided by the author's family have made his
education in general and this thesis in particular possible.
 
TABLE OF CONTENTS
Chapter
1. Introduction
  1.1. Error Detection Strategies
  1.2. Fault Models
    1.2.1. Single Stuck-At Fault Model
    1.2.2. Unidirectional Fault Model
    1.2.3. Bridging Fault Model
    1.2.4. Stuck-Open Fault Model
  1.3. Overview of Research
2. Physical Failure Modes for MOS Integrated Circuits
  2.1. Interconnect Failures
    2.1.1. Metal Interconnect Failures
    2.1.2. Polysilicon Interconnect Failures
    2.1.3. Diffusion Interconnect Failures
    2.1.4. Dielectric Failures
  2.2. Transistor Failures
    2.2.1. Parameter Shift Failures
    2.2.2. Breakdown Failures
  2.3. Radiation-Induced Soft Failures
3. Behavior of Failed Circuits
  3.1. Summary of Failure Mechanisms
  3.2. Circuit Models
    3.2.1. Static NMOS Inverter Model
    3.2.2. Static CMOS Inverter Model
    3.2.3. Dynamic NMOS Inverter Model
  3.3. Response of Failed Circuits
    3.3.1. Response of Circuits with Shorts
    3.3.2. Response of Circuits with Opens
    3.3.3. Response of Circuits to Noise
  3.4. Response of Good Circuits to the Output of a Failed Circuit
    3.4.1. Metastable Operation
    3.4.2. Response of Combinational Logic
4. Concurrent Error Detection of Physical Failures
  4.1. Indeterminate Faults
    4.1.1. Ternary Algebra
    4.1.2. The Effects of Hazards on Sensitization
  4.2. Concurrent Error Detection
    4.2.1. Totally Self-Checking Circuits
    4.2.2. Checker Strategy
  4.3. CED under a Simplified Indeterminate Fault Model
    4.3.1. Fault Model Assumptions
    4.3.2. Separable Codes
    4.3.3. Finding Economical Totally Self-Checking Implementations
    4.3.4. Check Vector Generation
  4.4. CED Under a General Single Failure Indeterminate Fault Model
    4.4.1. Fault Model Assumptions and Properties
    4.4.2. Economical Implementations for the General Indeterminate Fault Model
    4.4.3. Check Vector Generation
  4.5. Checker Requirements
5. Conclusion
  5.1. Evaluation of Fault Model
    5.1.1. Fault Model Accuracy
    5.1.2. Ease of Analysis
    5.1.3. Cost of Fault Tolerance
  5.2. Summary
  5.3. Suggestions for Future Research
References
Vita
LIST OF FIGURES
Figure
1.1. An Example of a Wired-AND Bridging Fault
1.2. A CMOS NAND Gate with a Stuck-Open Fault
1.3. An NMOS Circuit with a Stuck-Open Fault
2.1. Summary of Scaling Factors
3.1. MOS Transistor Symbols
3.2. Definitions of Voltages and Polarities
3.3. NMOS Inverter Circuit
3.4. Resistive Model of an Inverter
3.5. Voltage Limits vs. Z
3.6. Average Switching Time vs. Z Ratio
3.7. CMOS Inverter Circuit
3.8. Dynamic Shift Register
3.9. Resistive Model of Coupling Transistor
3.10. NMOS and CMOS Inverters
3.11. Output Node to Input Node Short
3.12. Resistive Model of Failed Inverter
3.13. Resistive Model of Two Outputs Shorted Together
3.14. Model of Inverter String
3.15. Probability of |y| ≥ a vs. Number of Inverters for Small Noise
3.16. Probability of |y| ≥ a vs. Number of Inverters for Large Noise
4.1. Ternary Algebra Truth Tables
4.2. Example of a Static Hazard
4.3. Totally Self-Checking Module
4.4. Metastable Detection Circuit
4.5. Two Types of Bridging Faults
4.6. Possible Circuit Implementation
4.7. Circuit Implementation with Shared Logic
4.8. Full Adder Example
4.9. Fault Behavior of Full Adder
4.10. Merger Diagram for Full Adder Example
4.11. Vector AND Example
4.12. Three Methods of Check Vector Generation
4.13. Two-Bit Adder Example
4.14. Behavior of Input-Output Faults in Two-Bit Adder
CHAPTER 1
Introduction
1.1. Error Detection Strategies
As integration levels increase and more and more devices are placed 
on an integrated circuit, it becomes increasingly difficult to ensure
that a circuit and the system it is part of are operating properly. 
There are two basic approaches to this problem: off-line testing and 
concurrent error detection.
In off-line testing, the system is stopped periodically and a test 
procedure is performed. This test may be performed by the system 
itself, or an external tester may be used to stimulate the circuit and 
check its results. If the system successfully completes the test, then 
the assumption is made that the system is operating correctly. If the 
system fails the test, then the system is faulty. In this approach, 
since it is unknown exactly when the system failed, all computations 
performed since the last successful test procedure must be presumed 
erroneous.
The main advantage of using off-line testing is its simplicity. In 
most cases, only a very modest amount of additional on-chip hardware is 
required. Unfortunately, there are also many disadvantages. Because of 
the poor observability and controllability of VLSI circuits, it is very 
difficult to derive a test procedure that will completely test an entire
 
integrated circuit. Often it is necessary to add additional logic on
the integrated circuit to increase its controllability and/or observability
[1]. In addition, during the time the system is off-line for
testing, it cannot perform any useful computation and thus system
throughput is degraded. Since there is no way of pinpointing exactly
when a failure has occurred, all results produced since the last successful
testing procedure must be discarded. Alternatively, some type of
check-pointing scheme can be used. This approach involves saving enough
of the system state and data so that all computations performed since
the last successful test procedure can be repeated. The most serious
drawback of off-line testing, however, is its inability to protect
against intermittent errors. It has been reported [2] that between 90
and 98 percent of failures in computers are nonpermanent in nature.
Off-line testing gives little if any protection against nonpermanent
failures. Therefore, for any system in which we must immediately know
when a failure has occurred (i.e., any type of real-time system) or for
any system in which we expect a major fraction of errors to be intermittent,
off-line testing is inadequate.
The second approach to this problem is concurrent error detection.
In this approach, the system is divided up into one or more blocks
called modules. The inputs (including both data and control vectors)
and outputs of each module must be encoded with an appropriate code.
Obviously, such encoding requires additional logic. These codes are
selected so that when most failures occur, the result of a computation
will either be correct or a non-codeword. Checkers are placed at the
 
output of each module. These checkers are used to detect non-codewords
and thus indicate an error.
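The module-and-checker arrangement described above can be sketched in a few lines. The following is only an illustration under stated assumptions: the even-parity code, the pass-through module, the single stuck output line used as the failure, and all function names are hypothetical choices made for this sketch and are not taken from the thesis.

```python
from itertools import product


def encode(data_bits):
    """Append an even-parity check bit (an assumed, illustrative code)."""
    return list(data_bits) + [sum(data_bits) % 2]


def is_codeword(word):
    """Checker: a word is a codeword when its overall parity is even."""
    return sum(word) % 2 == 0


def module_output(word, stuck_line=None, stuck_value=None):
    """Model a module that passes its encoded input through unchanged.

    A single failure is modeled as one output line stuck at 0 or 1.
    """
    out = list(word)
    if stuck_line is not None:
        out[stuck_line] = stuck_value
    return out


def classify(word, stuck_line, stuck_value):
    """Label the faulty output as correct, detected, or undetected."""
    out = module_output(word, stuck_line, stuck_value)
    if out == word:
        return "correct"
    return "undetected" if is_codeword(out) else "detected"


def outcomes(n_bits):
    """Collect the outcome labels over all data words and single stuck lines."""
    labels = set()
    for data in product((0, 1), repeat=n_bits):
        word = encode(data)
        for line in range(len(word)):
            for value in (0, 1):
                labels.add(classify(word, line, value))
    return labels
```

Under these assumptions, `outcomes(4)` contains only the labels "correct" and "detected": a single stuck line either leaves the word unchanged or flips exactly one bit, and a one-bit change always breaks even parity.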
Concurrent error detection has several advantages. When an error 
occurs, the checkers immediately provide an error indication. With 
off-line testing, an error indication is only given after the off-line 
test procedure is performed. The lack of information concerning the 
precise time at which the failure occurred requires computations to be 
repeated. An immediate error indication eliminates the need to repeat 
computations. Protection is also provided against intermittent
failures. If an intermittent failure results in an error, it will be 
detected. Therefore, concurrent error detection is well suited for real-time
systems and any system in which intermittent failures are a significant
percentage of total failures.
The presence of checkers can greatly increase the observability of 
the circuit. If enough checkers are used, it is possible to completely 
or very nearly completely test a circuit simply by normal operation. 
Complete testing during normal operation prevents a buildup of 
undetected failures (the so-called "latent faults" problem). Since any 
concurrent error detection technique can only handle a limited number of 
failures, a buildup of latent faults can result in an error not being 
detected. If the checkers do not provide enough observability to detect 
all possible faults during normal operation, periodic testing must be 
used to detect any latent faults.
The major disadvantage of concurrent error detection is the additional
logic required. The codes used for data and control vectors
 
require redundant bits. Extra logic is needed to process these bits. 
Additional logic is also needed for checkers. The logic which must be 
added to implement concurrent error detection can be significant. 
Depending on exactly which concurrent error detection scheme is used,
the additional logic required may be more than 100 percent of the original
system. Whether this type of extra cost is justified is obviously
an engineering judgment. It is possible to use concurrent error
detection only for those parts of the system which are either judged most
likely to fail or whose failure would be most serious. Depending on
what portions of the circuit are protected, significant savings of
hardware are possible. A technique has been developed recently for
various arithmetic computations [3]. This technique employs time redundancy
rather than logic redundancy. Although it is not applicable to
all functions, time redundancy, where it is applicable, can provide concurrent
error detection with only a very modest amount of additional
logic but at the cost of additional time.
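As a rough illustration of trading time for logic, the sketch below assumes one particular form of time redundancy: the same adder is used twice, the second time with both operands shifted left by one bit, and the two results are compared. The text does not say which scheme [3] describes, so this choice, the data width, the function names, and the stuck output bit used as the failure are all assumptions made for the example.

```python
WIDTH = 8  # assumed operand width; the adder output is two bits wider


def faulty_add(a, b, stuck_bit=None, stuck_value=0):
    """Add on an adder whose output bit `stuck_bit` may be stuck at a value."""
    total = (a + b) & ((1 << (WIDTH + 2)) - 1)
    if stuck_bit is not None:
        if stuck_value:
            total |= 1 << stuck_bit      # force the bit to 1
        else:
            total &= ~(1 << stuck_bit)   # force the bit to 0
    return total


def add_with_time_redundancy(a, b, stuck_bit=None, stuck_value=0):
    """Return (result, error_flag) from two passes through the same adder."""
    first = faulty_add(a, b, stuck_bit, stuck_value)
    # Second pass: shifted operands move each sum bit to the next bit slice,
    # so a stuck slice corrupts a different bit of the sum in each pass.
    shifted = faulty_add(a << 1, b << 1, stuck_bit, stuck_value)
    second = shifted >> 1                # undo the shift
    return first, first != second
```

In this model the only extra hardware is the shift and the comparison, but every addition takes two passes. A mismatch between the passes raises the error flag; the flag may also be raised when only the second pass is corrupted, which still indicates that a failure is present.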
1.2. Fault Models
The purpose of a fault model is to describe the behavior of a physical
failure in a manner that will allow us to predict the logical
behavior of the failed system. Since, in general, a physical failure
affects the analog behavior of a circuit (i.e., gain, time constants,
etc.), it may be very difficult to describe exactly how the failure will
alter the logical behavior of the system. A fault model has three 
important attributes: accuracy, ease of analysis, and cost of fault
tolerance.
If the fault model does not accurately describe the logical 
behavior of physical failures, then it is of little use. The quality of 
an error detection scheme is measured by the fraction of faults in the 
fault model which are detectable. Clearly, if the model does not accu­
rately describe the behavior of physical failure, this measure is of 
little use.
Two factors contribute to the ease of analysis of a fault model: 
the number of faults which must be considered, and the complexity of the 
fault behavior. Any system which contains many thousands of logic ele­
ments will also have a large number of possible faults. The behavior 
described by the fault model must be simple enough to allow analysis of 
the system. For off-line testing, we must determine whether the test 
procedure will detect each fault. For concurrent error detection, we 
must insure that the encoding used will allow detection of an incorrect 
result. If the fault model is too complex, this analysis will be too 
difficult to perform and the fault model will be impractical. One tech­
nique which can greatly reduce the number of faults is fault collapsing. 
Fault collapsing can be done when two or more faults are indistinguishable. 
Fault collapsing makes it possible to reduce significantly the 
number of faults which need to be considered.
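
As a small illustration of fault collapsing, the following sketch enumerates the six single stuck-at faults of a two-input AND gate and groups those whose faulty truth tables are identical. The gate, the node names, and the helper functions are assumptions chosen for this example only; they are not taken from the analysis in this report.

```python
from itertools import product

NODES = ("a", "b", "out")  # hypothetical node names for a 2-input AND gate


def and_gate(a, b, fault=None):
    """Evaluate a 2-input AND gate; fault is (node, stuck_value) or None."""
    if fault and fault[0] == "a":
        a = fault[1]
    if fault and fault[0] == "b":
        b = fault[1]
    out = a & b
    if fault and fault[0] == "out":
        out = fault[1]
    return out


def truth_table(fault=None):
    """Gate output for every input combination, in a fixed order."""
    return tuple(and_gate(a, b, fault) for a, b in product((0, 1), repeat=2))


def collapse_faults():
    """Group stuck-at faults that produce identical faulty truth tables."""
    classes = {}
    for node, value in product(NODES, (0, 1)):
        classes.setdefault(truth_table((node, value)), []).append((node, value))
    return list(classes.values())


fault_classes = collapse_faults()
```

In this example the six faults collapse into four classes, because a stuck-at 0 on either input or on the output forces the output to 0 for every input combination.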
Cost of fault tolerance is a very important consideration since it 
strongly affects system cost. For off-line testing, cost of fault 
tolerance determines how large the test procedure must be. It may also 
influence the complexity of the tester hardware. For concurrent error 
detection, cost of fault tolerance determines how much extra logic must
be added to the original system. Cost of fault tolerance is usually 
highly dependent on the exact nature of the error detection scheme and 
the target system.
The selection of a fault model requires a tradeoff between accuracy, 
ease of analysis, and cost of fault tolerance. Since these 
requirements are usually conflicting, the choice is never easy. In the 
past, a variety of fault models have been proposed.
1.2.1. Single Stuck-At Fault Model
The single stuck-at fault model assumes that any physical failure 
will cause one node (wire) of the circuit to become permanently either a 
logic 1 or a logic 0. This model is extremely easy to use and is by far 
the most common fault model in use. It was first proposed when logic 
elements were built from discrete devices and is generally accurate in 
describing the behavior of failures in such devices [4]. Unfortunately, 
its accuracy is much poorer for the highly integrated logic elements 
which make up most of today's systems.
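
To make the single stuck-at model concrete, the sketch below injects each single stuck-at fault into a small circuit and reports the fraction of faults that a given set of input vectors detects. The circuit F = (A AND B) OR C, the node names, and the function names are hypothetical choices for illustration.

```python
from itertools import product

NODES = ("a", "b", "c", "n1", "f")  # hypothetical node names


def circuit(a, b, c, fault=None):
    """F = (A AND B) OR C with an optional single stuck-at fault (node, value)."""
    def fix(name, value):
        # Replace the node value if this node is the faulty one.
        return fault[1] if fault and fault[0] == name else value

    a, b, c = fix("a", a), fix("b", b), fix("c", c)
    n1 = fix("n1", a & b)
    return fix("f", n1 | c)


def detected(fault, tests):
    """A fault is detected if some test makes the faulty output differ."""
    return any(circuit(*t) != circuit(*t, fault=fault) for t in tests)


def coverage(tests):
    """Fraction of all single stuck-at faults detected by the test set."""
    faults = list(product(NODES, (0, 1)))
    return sum(detected(f, tests) for f in faults) / len(faults)


exhaustive = list(product((0, 1), repeat=3))
```

With the exhaustive input set every one of the ten faults is detected, while the two vectors (1,1,0) and (0,0,0) alone detect seven of the ten.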
1.2.2. Unidirectional Fault Model
The unidirectional fault model assumes that a failure causes any 
number of nodes in the circuit to be either stuck-at 1, or alternatively 
any number stuck-at 0. Smith [5] has shown that a unidirectional fault 
model implies the use of an unordered code (i.e., no codeword covers any 
other codeword) for concurrent error detection. He also showed that in 
most cases, concurrent error detection of unidirectional faults requires 
an inverterless implementation. Since nearly all logic families are
inherently inverting, this restriction severely limits the usefulness of 
this fault model.
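
The unordered-code property mentioned above can be checked mechanically. The sketch below tests whether any codeword covers another and confirms, for a 2-out-of-4 code, that no unidirectional error turns one codeword into another. The 2-out-of-4 code and all function names are assumptions chosen for illustration; codewords are represented as small integers.

```python
from itertools import combinations


def covers(x, y):
    """True if x has a 1 in every bit position where y has a 1."""
    return x | y == x


def is_unordered(code):
    """True if no codeword covers a different codeword."""
    return not any(covers(x, y) or covers(y, x) for x, y in combinations(code, 2))


def unidirectional_errors(word, width):
    """All words reachable from `word` by only 0->1 or only 1->0 changes."""
    mask = (1 << width) - 1
    results = set()
    for e in range(1, mask + 1):
        up = word | e            # some 0s become 1s
        down = word & ~e & mask  # some 1s become 0s
        results.update(w for w in (up, down) if w != word)
    return results


# 2-out-of-4 code: every codeword has exactly two 1s.
two_of_four = [w for w in range(16) if bin(w).count("1") == 2]
```

Because a unidirectional error can only raise or only lower the number of 1s, every corrupted word falls outside the 2-out-of-4 code and is therefore detectable by a checker for that code.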
A related fault model which is quite popular assumes that any phy­
sical failure results in a unidirectional error at the module's output. 
In general, inverter-free implementations are not required to allow con­
current error detection for such a system. An unordered code, however, 
is still required. This fault model has been very popular for various 
structured elements such as memories and programmed logic arrays. We 
will refer to this fault model as the unidirectional error fault model.
1.2.3. Bridging Fault Model
The bridging fault model assumes that a short between any two or 
more lines results in some sort of wired logic function. For NMOS and 
CMOS logic families, the wired logic operation is usually taken to be 
the AND operation. It is assumed that if any of the lines which are 
shorted together are a logic 0, then all the shorted lines will take on 
the value of a logic 0. If all lines have a value of logic 1, then the 
lines will retain a value of logic 1. Figure 1.1 shows an example of a 
bridging fault between two input lines resulting in a wired AND opera­
tion.
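
The wired-AND behavior described above can be sketched as follows. The function F(A,B,C) = (A OR B) AND C used here is a hypothetical stand-in, since the function in Figure 1.1 is not specified, and the helper names are assumptions.

```python
def wired_and_bridge(values, bridged):
    """Apply a wired-AND short to the lines named in `bridged`."""
    shorted = min(values[name] for name in bridged)  # AND of 0/1 values
    return {name: (shorted if name in bridged else v) for name, v in values.items()}


def f(a, b, c):
    """Hypothetical fault-free function F(A, B, C) = (A OR B) AND C."""
    return (a | b) & c


def f_bridged(a, b, c):
    """F evaluated with input lines A and B shorted together (wired AND)."""
    v = wired_and_bridge({"a": a, "b": b, "c": c}, bridged=("a", "b"))
    return f(v["a"], v["b"], v["c"])
```

For this choice of F, the bridged circuit differs from the fault-free one only when C is 1 and exactly one of A and B is 1.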
The behavior of a circuit under failure is much more complicated 
with this model than with the stuck-at fault model. A bridging fault 
results in an additional gate being added to the circuit. More impor­
tantly, a bridging fault can transform combinational logic into sequen­
tial logic. The bridging fault model is only useful for modeling shorts
between lines. For this reason, it is usually combined with another 
fault model such as the stuck-at fault model.

Figure 1.1. An Example of a Wired-AND Bridging Fault. (Diagram not reproduced: a bridging failure between two input lines of F(A,B,C) is redrawn as a wired AND.)

1.2.4. Stuck-Open Fault Model
The stuck-open fault is peculiar to MOS logic families. A stuck-open 
fault results from a physical failure in which some node in the 
circuit is prevented from having a DC path to ground or power for cer­
tain input combinations.
Figure 1.2 shows an example of a stuck-open fault in a CMOS NAND 
circuit. Due to a physical failure, the pullup transistor corresponding 
to input A is permanently in the nonconducting state. Whenever input A 
is a logic 0 and input B is a logic 1, there is no DC path from the out­
put node to either power or ground. The output node therefore remains 
at its previous value until the inputs are changed to re-establish a DC 
path to power or ground or until the charge leaks off the output node. 
The time required for a significant amount of charge to leak off the 
output node is usually much longer than the system clock period.
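
The memory effect of this fault can be sketched with a simple behavioral model. The class below is an illustrative assumption: it treats the output node as an ideal charge store (leakage is ignored) and models only the failure described above, in which the pullup transistor for input A never conducts.

```python
def nand(a, b):
    """Fault-free 2-input NAND."""
    return 1 - (a & b)


class StuckOpenNand:
    """CMOS NAND whose pull-up transistor for input A never conducts."""

    def __init__(self, initial_output=1):
        self.out = initial_output  # charge stored on the output node

    def step(self, a, b):
        pull_down = a == 1 and b == 1  # series n-channel path to ground
        pull_up = b == 0               # only B's p-channel pull-up still works
        if pull_down:
            self.out = 0
        elif pull_up:
            self.out = 1
        # otherwise (a == 0, b == 1): no DC path, output keeps its old value
        return self.out
```

Applying (1,1) followed by (0,1) leaves the faulty output at 0 where the fault-free gate would give 1, while (0,0) followed by (0,1) gives the correct 1. The output for A = 0, B = 1 thus depends on the previous input.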
Static NMOS and PMOS gates are not subject to stuck-open faults. 
However, if pass transistors are utilized to implement certain logic 
functions, stuck-open faults can occur. Figure 1.3 shows an NMOS
inverter whose input is loaded by a multiplexer with two pass transis­
tors. The pass transistor corresponding to input A is permanently non­
conducting due to a physical failure. If the control input is a logic 
1, then there is no DC path from the gate of the inverter to power or 
ground (note that this path is normally provided by the gates that drive
inputs A and B). Once again, the output remains unchanged from its 
previous value.

Figure 1.2. A CMOS NAND Gate with a Stuck-Open Fault. (Diagram not reproduced.)

Figure 1.3. An NMOS Circuit with a Stuck-Open Fault. (Diagram not reproduced: inputs A and B feed pass transistors gated by the CONTROL signals; the pass transistor for input A is marked OPEN.)

For both CMOS and NMOS circuits, a stuck-open fault can transform a 
combinational circuit into a sequential circuit (under failure, the 
present output is a function of a previous input). Therefore, the 
stuck-open fault model suffers from the same deficiencies as the bridg­
ing fault model. It is difficult to use because of the possibility of 
sequential operation. It also needs to be combined with some other 
fault model since it can only model transistors that have permanently 
failed in a nonconducting state.
1.3. Overview of Research
The choice of a fault model is of crucial importance to any error 
detection scheme. Most of the fault models that have been proposed 
were proposed long before the advent of large scale MOS integrated circuits. 
It is important that any fault model reflect the technology 
with which it is used.
We begin the presentation of our research in Chapter 2, by 
thoroughly reviewing the types of physical failures which are possible 
in present day MOS integrated circuits. We also examine the effects of 
scaling of device dimensions, voltages, and doping levels on the proba­
bility that a failure occurs.
In Chapter 3, models of several types of MOS inverters are 
developed. These models are used to study the effects of physical 
failures on MOS logic circuits.
In Chapter 4, fault models are defined based on the results from 
Chapters 2 and 3. The techniques required to analyze a circuit's con­
current error detection capabilities are also developed in Chapter 4. 
Finally, the hardware requirements of implementing concurrent error 
detection for our fault model are examined.
The purpose of this research is to find fault models for MOS cir­
cuits which are better than the traditional fault models. We have 
already defined the criteria for judging a fault model: accuracy, ease 
of analysis, and cost of fault tolerance. The results of Chapters 2 and 
3 can be used to judge a fault model's accuracy for MOS logic circuits. 
The results of Chapter 4 are useful for judging the ease of analysis and 
the cost of fault tolerance for our fault model. The results of this 
research show that fault models that are much more accurate than the 
traditional fault models are possible to use without sacrificing ease of 
analysis and cost of fault tolerance.
CHAPTER 2
Physical Failure Modes for MOS Integrated Circuits
In this chapter, we examine the various physical failure modes for 
MOS integrated circuits. We restrict our study to MOS circuit technologies 
because of their wider use in VLSI circuits.
One important consideration in analyzing failure mechanisms is the 
effect of future changes in technology. One such change is called scaling. 
Scaling is the process of reducing integrated circuit dimensions 
and voltages while adjusting doping levels. Generally, scaling results in 
denser integrated circuits that operate at a higher speed and consume 
less power. Most of the improvements in MOS integrated circuits over the 
past 15 years are due to scaling. There is every reason to believe that 
in the future, devices will continue to be scaled even further. As a result, 
effects which were unimportant in the past will become of much greater concern.
The simplest scaling scheme is to reduce all dimensions, both horizontal 
and vertical, by a factor of K. Power supply voltages are also 
reduced by the same factor K while doping densities are increased by K. 
Because of this, the size of any device is reduced by a factor of K^2. 
Therefore, the number of devices which can be placed on an integrated 
circuit of a given size can be increased by a factor of K^2. The power 
consumed by a scaled device is also reduced by a factor of K^2 and the 
propagation delay is reduced by a factor of K. Current density in conductors, 
however, increases by a factor of K. This type of scaling is
referred to as constant field scaling since the magnitude of all elec­
tric fields remains approximately constant.
A majority of integrated circuits are designed to operate with a 
power supply of 5 volts. Since adding an additional power supply to a 
system is quite expensive, it is usually considered to be impractical to 
scale the power supply voltage. If all dimensions are reduced by a factor 
of K, but the power supply voltage is held constant, power per device 
increases by a factor of K while current density increases by a factor 
of K^3. If we take advantage of the fact that we can place K^2 times 
as many devices on an integrated circuit of the same size, total power 
increases by a factor of K^3 also. Clearly, as devices are scaled down, 
power and current density will be of concern. This type of scaling is 
referred to as constant voltage scaling. Figure 2.1 gives a summary of 
the scaling factors for both constant voltage and constant field scaling.
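
The arithmetic behind these two scaling rules can be sketched in a few lines. The function below simply encodes the factors stated in the text (dimensions, voltage, device count, power per device, and current density) and derives total power from them; the function and key names are assumptions for illustration.

```python
def scaling_factors(k, constant_voltage=False):
    """Multiplicative change in each quantity when dimensions shrink by 1/k."""
    voltage = 1.0 if constant_voltage else 1.0 / k
    power_per_device = k if constant_voltage else k ** -2
    devices = k ** 2  # same die area, smaller devices
    return {
        "dimensions": 1.0 / k,
        "voltage": voltage,
        "devices": devices,
        "power_per_device": power_per_device,
        "total_power": power_per_device * devices,
        "current_density": k ** 3 if constant_voltage else k,
    }
```

For K = 2, constant field scaling leaves total power unchanged, while constant voltage scaling multiplies both total power and current density by 8.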
2.1. Interconnect Failures
Interconnect is that part of the circuit which connects transistors 
to other transistors and the input or output pads. Most MOS integrated 
circuit processes provide one or more levels of metal, a layer of 
polysilicon, and a layer of diffusion. All of these layers may be used 
for interconnect although polysilicon cannot be allowed to cross diffu­
sion since an unwanted transistor will be formed. If, however, the pro­
cess also provides for an enhancement transistor with a low enough 
threshold voltage, then the unwanted transistor may be made into an 
enhancement transistor. This allows polysilicon interconnect to cross
diffusion interconnect, although the capacitance of the polysilicon line 
and the resistance of the diffusion line are significantly increased. 
This in turn increases the delay of a signal propagating on either line. 
The ability of polysilicon to cross diffusion is very important when 
only one layer of metal is available. As we will see in the next 
chapter, a sufficiently low enhancement transistor threshold voltage 
will have an impact on system performance.

                                  CONSTANT    CONSTANT
                                  FIELD       VOLTAGE
DIMENSIONS                        K^-1        K^-1
VOLTAGE                           K^-1        1
DOPING CONCENTRATION              K           K
NUMBER OF DEVICES                 K^2         K^2
POWER PER DEVICE                  K^-2        K
TOTAL POWER                       1           K^3
CURRENT DENSITY                   K           K^3
CONTACT RESISTANCE                K^2         K^2
NORMALIZED CONTACT VOLTAGE DROP   K           K^2

Figure 2.1. Summary of Scaling Factors.

2.1.1. Metal Interconnect Failures
It is well known that any metalization subjected to a high current 
density is susceptible to electromigration [6]. Electromigration typically 
occurs where there is a slight constriction in the conductor. The 
current density is highest at the constriction. The high current den­
sity causes metal ions to diffuse away from the constriction. This dif­
fusion further narrows the conductor which in turn raises the current 
density and thus continuously accelerates the process. Eventually, the 
conductor fails. Lines subjected to DC current are most susceptible 
while lines subjected to AC current are essentially immune to electromi­
gration. Nanosecond pulses of current (all pulses of the same polarity) 
two orders of magnitude higher than the DC case may be safely carried by 
metal conductors. CMOS and dynamic NMOS circuits which dissipate no 
static power are thus less likely to suffer electromigration failure.
A variety of factors affects the mean time to failure of a metal 
line. These include materials, grain size and orientation, and relative 
width and length of the conductor [7]. The most important factor, how­
ever, is current density. The mean time to failure for a line is given
by the formula:

    MTF = K_1 J^{-N} \exp(K_2 / T)

where K_1 and K_2 are constants, J is the current density, N is a 
material-dependent constant, and T is temperature. The value of N for aluminum
is generally considered to be 2 (there is some disagreement on this 
point, see [7]). Therefore, the mean time to failure is inversely pro­
portional to the current density squared and exponentially related to 
the reciprocal of temperature. For this reason, scaling will have an 
important (and unfortunately negative) impact on the reliability of 
metal conductors. If the power supply voltage is not scaled (constant 
voltage scaling), we have already stated that both power consumption and 
current density will increase by a factor of K^3 as the dimensions are 
scaled by a factor of K. Due to the current density alone, the mean 
time to failure for aluminum will scale down by a factor of K^6. Since 
we also increase the number of metal interconnects by a factor of K^2 by 
scaling, the mean time to failure for the entire integrated circuit 
will decrease by an additional factor of K^2. Therefore, ignoring the 
effects of temperature, we can expect the mean time to failure of an 
entire integrated circuit to decrease by a factor of K^8 if aluminum 
metalization is used.
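
The exponent bookkeeping in this argument can be written out as a short sketch. It assumes, as the paragraph above does, that the mean time to failure of a line varies as J^-N, that the circuit-level mean time to failure falls in proportion to the number of interconnects, and that temperature is ignored. The function name and parameters are hypothetical.

```python
def mtf_scale_exponent(current_density_exponent, n=2, device_count_exponent=2):
    """Exponent e such that chip-level MTF falls by K**e (temperature ignored).

    current_density_exponent: J grows as K**this (3 for constant voltage
    scaling, 1 for constant field scaling).
    n: material constant N (taken as 2 for aluminum).
    device_count_exponent: number of interconnects grows as K**this.
    """
    per_line = n * current_density_exponent   # MTF proportional to J**-N
    return per_line + device_count_exponent   # more lines, more chances to fail
```

Under constant voltage scaling this gives an exponent of 8. With current density growing only as K, as in constant field scaling, the same arithmetic gives an exponent of 4.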
The temperature that an integrated circuit operates at is highly 
dependent on power consumption, packaging, and external cooling. Reduc­
ing the temperature by packaging improvements or adding external cooling 
tends to be expensive. Therefore, if power consumption is increased by 
a factor of K^3 during the scaling process, it is reasonable to expect at
least some increase in integrated circuit temperature. Since the rela­
tionship between temperature and conductor lifetime is exponential, 
relatively small increases in temperature will drastically reduce mean 
time to failure. One should note that if the power supply voltage is 
scaled along with the dimensions, the integrated circuit mean time to 
failure only decreases by a factor of K^4. Also, since total power 
remains constant, the integrated circuit temperature should also remain 
constant.
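
The sensitivity to temperature follows directly from the exponential term in the mean time to failure formula. The sketch below evaluates the ratio of mean times to failure at two temperatures for a fixed current density. The numerical value of K_2 is an assumption made purely for illustration (it is written as an activation energy of 0.5 eV divided by Boltzmann's constant); it is not a value given in this report.

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K


def mtf_ratio(t_old, t_new, k2):
    """MTF(t_new) / MTF(t_old) for MTF proportional to exp(k2 / T), T in kelvin."""
    return math.exp(k2 / t_new - k2 / t_old)


# Assumed for illustration only: K_2 = Ea / k_B with Ea = 0.5 eV.
K2_ASSUMED = 0.5 / BOLTZMANN_EV

ratio_10k_rise = mtf_ratio(350.0, 360.0, K2_ASSUMED)
```

With this assumed constant, raising the operating temperature from 350 K to 360 K reduces the mean time to failure by roughly a third.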
Accumulation of metal from electromigration presents another prob­
lem. This metal can form hillocks or whiskers [6]. Whisker formation 
tends to occur where there is a high electric field between conductors. 
The formation of hillocks and whiskers can result in shorting between 
adjacent metalization and cracking of the passivation level.
Ohmic contacts are formed where metalization must provide an 
electrical connection to a diffused area. Ohmic contacts ideally should 
produce no rectification or other asymmetry in the response to positive 
and negative waveforms. In addition, the resistance of contacts should 
be as low as possible. Ohmic contacts are used extensively in 
integrated circuits. Unfortunately, they appear to be a major problem 
area for future integrated circuits. Since the resistance of a contact 
is proportional to its area, the contact resistance will increase by a 
factor of K^2 during scaling. If supply voltage is reduced by the process 
of scaling, then normalized contact voltage drop (i.e., signal voltage 
divided by supply voltage) increases by a factor of K^2. If the 
power supply voltage remains constant, then normalized contact voltage
increases by a factor of K (see Figure 2.1). Any mask misalignment during 
processing will aggravate this situation since the effective area of 
the contact will be further reduced.
In addition to these scaling problems, a variety of effects due to 
electromigration can also lead to failures [6]. At fairly high temperatures, 
it is possible for silicon to leave the substrate and form an 
alloy with the aluminum. This depletion of silicon decreases the effec­
tive junction depth and thus makes it easier for spikes of the 
aluminum-silicon alloy to extend through the junction and into the sub­
strate. This results in a short from the metal and diffusion to the 
substrate. It should be noted that junction depth is one of the dimen­
sions which is reduced in the scaling process. Thus, scaling makes it 
easier for a spike to penetrate past the junction. The metalization of 
the contact is also susceptible to electromigration resulting in open 
contacts [8].
For most integrated circuit processes, metal forms the top layer. 
For this reason, the metalization will tend to be three dimensional as 
it crosses over features on lower layers. Any time metal has to go up 
or down steps on the surface of an integrated circuit, there is the possibility 
of either a break in the line or a constriction. Obviously, 
such a constriction is a prime site for electromigration to occur. A 
defect in the metalization mask can cause either a short or open in the 
metal interconnect depending on the defect.
Many of the metals used in integrated circuits are subject to corrosion 
(particularly aluminum) and accelerated electromigration from any
moisture or other contaminants [9]. Ideally, packaging should provide 
an almost impervious barrier to such contaminants. If the packaging 
should fail to perform this task, all metal on the integrated circuit is 
subject to failure.
2.1.2. Polysilicon Interconnect Failures
Polysilicon also appears to be vulnerable to electromigration [10]. 
The physical mechanism, however, seems to be somewhat different than for 
metal. In polysilicon, a high current density usually causes the dopant 
atoms to migrate rather than the silicon atoms. This migration results 
in areas with a lower concentration of dopant atoms. The resistance of 
polysilicon is very sensitive to doping levels. Therefore, the resis­
tivity of the polysilicon increases in the areas where electromigration 
has left low concentrations of dopants. This leads to the formation of 
local hot spots which can further accelerate the electromigration pro­
cess. Eventually, thermal runaway causes the line to fail. It should 
be noted that at extremely high temperatures which can occur with ther­
mal runaway (temperatures greater than 1000 °C have been observed in 
polysilicon test structures [10]), silicon atoms start to migrate as 
well as dopant atoms. Usually silicon migration only occurs immediately 
before conductor failure.
It appears that electromigration becomes an important source of 
failures at current densities of 10^6 A/cm^2 [11] (approximately the same 
as for metal lines). Fortunately, when polysilicon is used as intercon­
nect, it will seldom be subjected to DC current densities of this magni­
tude. It must be kept in mind, however, that just as for metal, current
density scales by a factor of K or K^3, depending on which scaling rules 
are used. Contacts between metal and polysilicon are subject to the 
same type of difficulties as the metal-diffusion contacts we have 
already discussed. Once again, since the currents will tend to be lower 
than for metal-diffusion contacts, there should be fewer problems.
2.1.3. Diffusion Interconnect Failures
When diffusion areas are formed on an integrated circuit, we are 
depending on a reverse biased pn junction to prevent the diffused area 
from shorting to either the substrate or other diffused areas. With a 
properly designed and manufactured integrated circuit, the breakdown 
voltage of the pn junctions is well above any voltage difference which 
the circuit will be subjected to. It is possible for various anomalies 
to result in significantly lower breakdown voltages. Possible causes 
include local crystal defects, changes in doping concentration, exposure 
to radiation and excessively shallow diffusions. Radiation can also 
increase the leakage current of a pn junction. Leakage current also has 
an exponential dependence on temperature. Regardless of the cause, if 
the pn junction should break down or if leakage current should become 
excessive, the diffusion area will become shorted to the substrate.
It is also possible for two closely spaced diffusion areas to 
become shorted together. This occurs if the depletion regions of the 
reverse biased pn junctions should happen to overlap. In this case, 
charge carriers in one diffusion area will be swept by any potential 
difference across the overlapped depletion regions, to the other diffu­
sion area. The width of a depletion region is approximately
proportional to the reverse bias voltage. Therefore, the depletion 
regions of two adjacent diffusion areas are most likely to overlap when 
they are at their most positive voltage (most negative voltage for 
PMOS). In this case, however, both areas will be at the same potential. 
Such a short should have little effect except on circuits which are 
highly dependent on the relative capacitances of nodes (such as dynamic 
circuits). A more serious although less likely problem occurs if the 
two adjacent areas are at significantly different potentials. In this 
case, if the depletion regions overlap, it will be possible for signifi­
cant currents to flow between the two areas. It is becoming more common 
to use a recessed field oxide. Recessed field oxide has several advantages 
which include lower capacitance and improved surface planarity. 
In addition, since a pn junction is only formed at the bottom of a dif­
fusion area, it is virtually impossible for the depletion regions of two 
adjacent diffusion regions to overlap. If an insulating substrate is 
used, then isolation failures should not be an issue.
It is often the case that interconnect will run over the oxidized 
substrate between two diffusion areas. The result is a parasitic MOS 
transistor. The diffusion areas form the source and drain while the
interconnect forms the gate. If the parasitic MOS transistor is allowed 
to turn on, an unwanted current flows between the two diffusion regions. 
In other words, the diffusion regions are shorted together by the 
parasitic transistor. To prevent this from happening, the field oxide 
is made thick enough to prevent the parasitic transistor from turning 
on. Similarly, the substrate under the field oxide is often implanted 
to make a channel even harder to form. If enough charge (due to
radiation, mobile ions, etc.) becomes trapped in the field oxide, a 
channel may still form, especially when the interconnect is at its most 
positive voltage (most negative voltage for PMOS) [12].
2.1.4. Dielectric Failures
In MOS integrated circuits, silicon dioxide (SiO2) is the most common 
dielectric, although silicon nitride (Si3N4) is also used occasionally. 
The dielectric material is used for two important purposes: insulation 
and protection.
The dielectric must separate any two conducting layers from each 
other. One very important use of a dielectric is in the gate oxide 
which insulates a transistor's channel from its gate electrode. Almost 
all MOS circuits depend on the extremely high gate impedance of a MOS 
transistor. The smallest pinhole in the gate oxide can result in a 
short from the gate electrode to either the source diffusion, channel, 
or drain diffusion (depending on where the pinhole is). Gate electrodes 
which are connected to Input/Output pins are of particular concern. 
Simply handling an integrated circuit will subject the pins to 
electrostatic discharge. Three sources of electrostatic discharge as 
reported in [13] are:
(1) A charged person touches a device and discharges the stored 
charge to or through the device to ground.
(2) The device itself, acting as one plate of a capacitor, can 
store charge. Upon contact with an effective ground the 
discharge pulse can create damage.
(3) An electrostatic field is always associated with charged 
objects. Under particular circumstances, a device inserted in 
this field can have a potential induced across an oxide that 
creates breakdown.
Clearly, electrostatic discharge is not limited to situations where the 
device is being handled. It may also occur while the integrated circuit 
is in use. An electrostatic discharge can easily generate a potential 
difference of 1000 or more volts [13]. Due to the high input impedance 
of a MOS transistor, there is no way for the static charge to leave the 
gate electrode. Since the gate oxide typically has a breakdown voltage 
on the order of 100 volts or less, electrostatic discharge leads to 
breakdown of the gate oxide. Since the gate oxide thickness is typi­
cally reduced during the scaling process, it is reasonable to expect 
gate oxide breakdown to occur at even lower voltages. For silicon diox­
ide, this breakdown is permanent resulting in either a resistive short 
or a diode short between the gate and source, drain, or channel. The 
type of short is determined by whether the gate is of the same type or 
opposite type as the material it is shorted to [14]. If both materials 
are of the same type, the short will be resistive. If they are of oppo­
site types, the short will be a diode.
Due to the susceptibility of MOS to electrostatic discharge, it is 
standard practice to use protective circuits on all Input/Output pins. 
Many different circuits have been proposed, but they typically use two 
diodes (or the functional equivalent). These diodes are biased so that 
any time the pin voltage goes significantly outside the range of ground 
to power supply, one of the two diodes conducts providing a path for 
charge to leak off the gate. Even though such circuits lower the proba­
bility that electrostatic discharge will destroy a transistor, they do 
not provide complete protection.
 
Studies have examined the susceptibility of gate oxide, both with 
and without protection circuits, to electrostatic discharge [14,15]. In 
both cases, the failure mechanism appears to be cumulative. That is, 
the more stress the oxide has been exposed to in the past, the higher 
the failure rate.
As we have previously discussed, electromigration may result in the 
accumulation of metal which can crack dielectric layers. Another possi­
ble source of failure is due to differences in the coefficient of expan­
sion of the dielectric and substrate or interconnect.
Usually, one of the last steps in fabrication before dicing and 
packaging is covering the integrated circuit with a thick layer of 
dielectric material. This layer is called the passivation layer and 
along with the other packaging is responsible for protecting the 
integrated circuit both mechanically and chemically. It must protect 
the surface of the integrated circuit from scratches during the packag­
ing procedure and seal out any moisture or other chemicals which could 
cause corrosion of the metallization. In addition, it must prevent ions 
from diffusing close to the substrate. Any such ions can change the 
threshold of a transistor or allow the substrate under the field oxide 
to invert. If metal interconnect crosses any two diffusion areas, a 
parasitic MOSFET is formed. Normally this transistor will be off. If 
the substrate under the field oxide should invert, then the MOSFET is 
turned on and the two diffusion regions are now shorted together by the 
parasitic MOSFET.
2.2. Transistor Failures
Transistors are responsible for providing the switching action 
which allows a circuit to implement a Boolean function. There are a 
variety of parameters which control the operation of a transistor. Any 
change in these parameters affects the ability of circuits to perform a 
desired switching operation. If a transistor is allowed to break down, 
uncontrolled currents will flow through the transistor. This also leads 
to circuit failure.
2.2.1. Parameter Shift Failures
The two most important parameters of a MOS transistor are threshold 
voltage and transconductance. The threshold voltage is the gate to 
source voltage which causes an enhancement mode transistor to go from 
the nonconducting state to the conducting state. The transconductance 
is a measure of how much the transistor's conductance changes due to a 
change in the gate to source voltage. Transconductance is defined as 
the partial derivative of drain current with respect to gate to source 
voltage. Both of these parameters are of great importance to the tran­
sient and steady state responses of MOS logic circuits.
An important source of parameter shifts in a MOS transistor is hot 
electron injection. Electrons in a high electric field can be 
accelerated to a very high velocity. Because of the direction of the
electric field in the area of the channel pinch-off region, any hot 
electrons generated in this area will be directed toward the gate oxide. 
Some of these electrons will have a sufficient energy to overcome the 
potential barrier between the silicon and silicon dioxide. Of these
electrons, a fraction will be trapped in the oxide as the remaining 
electrons proceed to the gate electrode. Whether or not an electron 
will enter the oxide and the fraction of such electrons that become 
trapped depends on a variety of factors. Such factors include tempera­
ture, electrode potentials, doping levels, and device dimensions 
[16,17]. The buildup of negative charge in the gate oxide will eventu­
ally cause a shift in both threshold voltage and transconductance 
[18,19]. Scaling will increase the likelihood of hot electron failures. 
If constant voltage scaling is used, the higher electric fields will 
increase the number of electrons injected into the oxide. If constant 
field scaling is used, circuits will be more sensitive to parameter 
shifts.
Mobile ions can be introduced during processing or by a packaging 
failure. These ions will move in response to electric fields which will 
result in threshold voltage and transconductance varying with age. 
Moisture in the passivation layer has been found to cause similar 
results [20].
Another cause of parameter shifts is exposure to ionizing radiation. 
Such radiation can be of many different forms including X-rays, 
alpha particles, cosmic rays, and high energy sub-atomic particles such 
as electrons, protons, and neutrons. The effects of such radiation 
include damage to the crystal lattice, photo currents, and most 
importantly, the accumulation of static charge in the oxide [21,22]. This 
charge leads to threshold voltage shifts and decreases in transconduc­
tance. It has also been shown [23] that radiation can increase the
noise level in transistors long before any shift in threshold voltage or 
transconductance is observable.
2.2.2. Breakdown Failures
MOS integrated circuits are subject to a variety of breakdown 
mechanisms. The drain of a MOS transistor forms a reverse biased junc­
tion with the channel. One limit to maximum power supply voltage is the 
breakdown voltage of the drain channel junction. Another type of 
breakdown is punch-through. Punch-through occurs when the drain depletion
region extends all the way across the channel to the source depletion 
region. Punch-through results in a large uncontrolled current flowing 
between drain and source.
Another source of failures is due to parasitic bipolar transistors.
An NMOS transistor has a parasitic lateral npn bipolar transistor. The 
collector and emitter are formed by the source and drain areas while the 
base is made up of the channel. A substrate current caused by impact 
ionization will eventually lead to a voltage drop between the substrate 
and source. This drop forward-biases the emitter-base junction of the 
parasitic bipolar transistor, which turns on the bipolar transistor 
causing drain breakdown at a much lower voltage. Short channel devices 
aggravate the situation. The shorter the channel, the more efficient 
the bipolar transistor will be due to the thinner base. It has been 
reported [24] that trapped charge in the gate oxide can make the bipolar 
transistor easier to turn on. Since the current flow due to bipolar 
action increases hot electron injection, this is a regenerative process.
 
Latchup is a similar, although more serious, problem that can occur 
in bulk CMOS circuits. An n-tub CMOS process results in a lateral npn 
parasitic bipolar transistor (as in the NMOS case) and a vertical pnp 
parasitic bipolar transistor. Together, these two transistors form an 
npnp semiconductor controlled rectifier. If the product of the two 
parasitic bipolar transistors' current gains exceeds 1, then a transient 
pulse or exposure to radiation may result in the semiconductor 
controlled rectifier turning on. This results in a large current flowing 
from power to ground. If this current is large enough, the circuit may 
be damaged. A thorough discussion of the transient conditions necessary 
for latchup is given in [25].
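The turn-on condition described above can be expressed as a simple check. The following is only a minimal sketch: the function name and the sample gain values are hypothetical and are not taken from the text or from [25], which treats the transient conditions in detail.

```python
def latchup_possible(gain_npn: float, gain_pnp: float) -> bool:
    """Return True when the product of the two parasitic bipolar
    current gains exceeds 1, the condition under which the parasitic
    npnp structure can be triggered."""
    return gain_npn * gain_pnp > 1.0


# Illustrative (made-up) gains for the lateral npn and vertical pnp devices.
print(latchup_possible(5.0, 0.5))   # product 2.5
print(latchup_possible(2.0, 0.1))   # product 0.2
```

The check only states whether latchup can be sustained. Whether it is actually triggered depends on the transient pulse or radiation exposure.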
2.3. Radiation-Induced Soft Failures
Soft failures are random non-recurring errors. These errors are 
caused by radiation striking integrated circuits and generating 
electron-hole pairs. The failure rate will depend on the amount of 
radiation striking the integrated circuit at any given time. By con­
trast, the radiation failure modes discussed previously depend on the 
total dose the integrated circuit has received. The higher the dose, 
the more the circuit is damaged. Soft errors, however, are caused by 
excess carriers, not damage to the device. Since dynamic devices are 
non-restoring, they are most susceptible to soft errors, but static 
circuits may also be affected by high radiation environments.
The two major causes of soft errors are alpha particles and cosmic 
rays. The alpha particles are due to small amounts of radioactive 
material (usually uranium or thorium) in the packaging. The radioactive
material emits high energy alpha particles. If these particles are gen­
erated close enough to the surface of the integrated circuit, they will 
enter the substrate and generate electron-hole pairs which can then be 
collected by a reverse-biased junction. A thorough discussion of 
electron-hole pair generation and subsequent collection is given in 
[26]. Information in dynamic circuits is represented by charge stored 
on a node. Therefore, excess carriers generated by ionizing radiation 
can erase information stored in the circuit. In the scaling process, 
the amount of charge used to store information is reduced. An error 
only occurs if the amount of excess charge generated is at least of the 
same order as the amount of charge used to store information. There­
fore, the scaling process will make circuits more susceptible to soft 
errors. Steps can be taken to protect a circuit from alpha particles 
[27]. Unfortunately, it is very difficult to shield an integrated 
circuit from cosmic rays.
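The comparison between stored charge and collected excess charge can be illustrated with a short sketch. Everything numeric here is an assumption made for illustration: the node capacitances, voltages, collected charge, and the `margin` parameter (standing in for "at least of the same order") are hypothetical and are not measurements from the text or from [26].

```python
def stored_charge(capacitance_f: float, voltage_v: float) -> float:
    """Charge (coulombs) representing information on a dynamic node, Q = C * V."""
    return capacitance_f * voltage_v


def soft_error_possible(collected_c: float, capacitance_f: float,
                        voltage_v: float, margin: float = 1.0) -> bool:
    """True when the collected excess charge is at least `margin` times
    the charge used to store information on the node."""
    return collected_c >= margin * stored_charge(capacitance_f, voltage_v)


collected = 50e-15  # hypothetical charge collected from one particle strike

# Hypothetical node before scaling: 50 fF at 5 V stores 250 fC.
print(soft_error_possible(collected, 50e-15, 5.0))
# Hypothetical scaled node: 10 fF at 2.5 V stores 25 fC.
print(soft_error_possible(collected, 10e-15, 2.5))
```

With these made-up values the same strike is harmless to the larger node but can upset the scaled one. This is the trend the scaling argument describes.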
CHAPTER 3
Behavior of Failed Circuits
In order to develop an accurate fault model, it is necessary to 
have a good understanding of the behavior of circuits that have failed. 
In addition, if it is possible for a failed circuit to produce an output 
which is not a valid logic value, then we must also have an 
understanding of the behavior of a good circuit given such invalid logic 
values as inputs. In this chapter, we develop an understanding of both 
circumstances.
3.1. Summary of Failure Mechanisms
In Chapter 2, we arrived at the following list of possible failure 
mechanisms:
(1) Interconnect failures: Opens in metal and polysilicon lines 
due to electromigration. Shorts between metal lines due to 
electromigration. Shorts between diffusion lines due to 
junction failure and parasitic field transistors. Shorts between 
diffusion contacts and substrate due to spike formation. Open 
polysilicon contacts due to electromigration. Shorts between 
diffusion and substrate due to junction failure. Shorts between 
metal or polysilicon and other interconnect layers (including 
transistor channels) due to dielectric failure.
(2) Transistor failures: Parameter shifts due to hot electron 
injection, radiation exposure, and exposure to contaminants. 
Increased noise due to radiation exposure. Drain breakdown due 
to junction failure and parasitic bipolar transistors. Latchup 
due to parasitic bipolar transistors.
(3) Soft failures: Soft failures due to ionizing radiation and 
other environmental sources of interference.
 
Two prior research studies have evaluated the likelihood of partic­
ular failure mechanisms for integrated circuits. Galiay et al. studied 
the failures of a 4-bit microprocessor [28]. The microprocessor was 
fabricated using a metal gate PMOS process. Failed microprocessors were 
examined under an optical and scanning electron microscope. The 
microprocessors were also probed directly. The study found the follow­
ing distribution of failures:
Failure                                      Percentage
Short between metallization                  39%
Open metallization                           14%
Short between diffusions                     14%
Open diffusion                                6%
Short between metallization and substrate     2%
Inobservable [sic]                           10%
Insignificant                                15%
The failures labeled inobservable [sic] were those failures which 
resulted in incorrect behavior but for which no physical failure could 
be found. Insignificant failures were those failures which resulted 
from "large imperfections" such as a scratch across the entire
integrated circuit. Galiay et al. felt that such failures were insigni­
ficant since they should be easily detected by almost any test sequence.
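As a quick check on the distribution reported above, the percentages can be tallied by category. This is only an illustrative sketch. The dictionary restates the figures quoted from [28], while the grouping into "shorts" and "opens" by the first word of each label is a convenience chosen here, not part of the original study.

```python
# Failure distribution quoted from the Galiay et al. study [28], in percent.
galiay_distribution = {
    "Short between metallization": 39,
    "Open metallization": 14,
    "Short between diffusions": 14,
    "Open diffusion": 6,
    "Short between metallization and substrate": 2,
    "Inobservable": 10,
    "Insignificant": 15,
}

total = sum(galiay_distribution.values())
shorts = sum(v for k, v in galiay_distribution.items() if k.startswith("Short"))
opens = sum(v for k, v in galiay_distribution.items() if k.startswith("Open"))

print(f"total = {total}%, shorts = {shorts}%, opens = {opens}%")
```

The tally shows that every failure with an identified physical cause, apart from the "insignificant" large imperfections, is either a short or an open.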
Another study by Banerjee [4] was based on Texas Instruments' 
experience with MOS circuit failures. Failures are listed as either 
device failures or interconnect failures. The following failures were 
listed, divided into groups based on their likelihood of occurrence:
Most likely:
Device failures:
Gate to drain short 
Gate to source short 
Interconnect failures:
Short between diffusion lines
Moderately likely:
Device failures:
Drain contact open 
Source contact open 
Interconnect failures:
Aluminum-polysilicon crossover broken
Least likely:
Device failures:
Gate to substrate short 
Floating gate 
Interconnect failures:
Short between aluminum lines
From these two studies and the results of Chapter 2, it appears that 
interconnect failures will be a major failure mechanism. The Galiay et 
al. study attributed all significant observable faults to interconnect 
failures. If we enlarge the concept of interconnect failures to include 
all failures that result in an open or short, then all the failures men­
tioned in the Banerjee study are also interconnect failures. The idea 
of classifying transistor failures that result in opens or shorts as 
interconnect failures is quite reasonable for MOS circuits. MOS
transistors are formed by one level of interconnect (polysilicon) cross­
ing over another layer of interconnect (diffusion) [29]. For this rea­
son, the transistor itself may simply be considered another type of 
interconnect.
Reviewing our summary of physical failure mechanisms listed at the 
beginning of the chapter reveals that all interconnect and transistor 
failures with the exception of parameter shifts and noise result in 
either opens or shorts. It must be kept in mind, however, that many of 
the failures (especially transistor failures) result in resistive shorts 
whose impedance depends on the voltage of various nodes in the vicinity
of the failure. Radiation-induced soft errors have no correspondence to 
shorts or opens. Nevertheless, it is possible to model the effect of 
such an error as a transient short. The short creates a "wire" which 
carries the current that flows due to excess carriers generated by the 
radiation.
From the above discussion, it is possible to account for nearly all 
of the listed physical failures by considering only opens or shorts. A 
short results when a failure causes an anomalous impedance to occur 
between two nodes. This impedance may depend on the voltages of neigh­
boring nodes (as is the case for many transistor failures) and may also 
be time dependent (as is the case for radiation-induced soft errors). 
An open results when a failure causes an anomalous impedance to occur in 
series with an existing element between two nodes. This impedance may 
not be infinite since many failures (such as electromigration) tend to 
occur gradually. As was the case for shorts, the open impedance may be 
voltage and time dependent. The only failures which we haven't 
accounted for by our enlarged class of interconnect failures are parame­
ter shifts and noise. Parameter shifts of transistors will affect both 
the steady state and transient performance of a circuit. These effects 
are due to changes in the conductance of a transistor with a given bias. 
If the conductance of the channel increases, we may model this failure 
as an impedance placed in parallel with the channel. If the conductance 
of the channel decreases, we may model this as an impedance in series 
with the channel. These two situations correspond to our definition of 
a short and open, respectively.
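The mapping from a conductance shift to an equivalent short or open can be sketched numerically. The function below is an illustration under simplifying assumptions: the channel is treated as a single fixed conductance at a given bias, and the function name and sample values are hypothetical. An increase in conductance is represented by an impedance in parallel with the channel, and a decrease by an impedance in series with it.

```python
import math


def equivalent_fault_impedance(g_nominal: float, g_faulty: float):
    """Model a change in channel conductance (siemens) as an added impedance (ohms).

    Returns ("short", Z) for an impedance Z in parallel with the channel,
    ("open", Z) for an impedance Z in series with it, or ("none", None).
    """
    if g_faulty > g_nominal:
        # Parallel conductances add: g_faulty = g_nominal + 1/Z
        return "short", 1.0 / (g_faulty - g_nominal)
    if g_faulty < g_nominal:
        if g_faulty == 0.0:
            return "open", math.inf
        # Series resistances add: 1/g_faulty = 1/g_nominal + Z
        return "open", 1.0 / g_faulty - 1.0 / g_nominal
    return "none", None


print(equivalent_fault_impedance(1e-3, 2e-3))    # conductance doubled
print(equivalent_fault_impedance(1e-3, 0.5e-3))  # conductance halved
```

In both sample cases the added impedance is about 1000 ohms, which illustrates that the resulting "short" or "open" is generally resistive rather than ideal.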
Now that we have classified failures as being either interconnect 
failures or noise, we are ready to study the effects that physical 
failures have on the behavior of various circuits. We begin by modeling 
transistors and the basic circuits used to process digital signals. We 
then use these models to study the behavior of such circuits under phy­
sical failure.
3.2. Circuit Models
The basic building block for MOS circuits is the MOS transistor. 
Figure 3.1 shows the symbols we use for enhancement and depletion 
transistors. The MOS transistor is a four terminal device. The four 
terminals are drain, gate, source, and body. For proper operation, the 
body terminal of all n channel transistors must be connected to the most 
negative voltage in the integrated circuit. A p channel transistor must 
have its body connected to the most positive voltage. Unless the body 
terminal is pertinent to the discussion, it will be ignored.
The exact relationship between drain current i_ds and the voltages 
of the four terminals is quite complex. The MOS transistor has three 
regions of operation. In the off region, the drain current is approxi­
mately zero. In the nonsaturated region, the drain current increases as 
the drain to source voltage increases. Operation in the nonsaturated 
region is often approximated by replacing the channel of the transistor 
with a resistor. In the saturated region drain current is roughly 
independent of the drain to source voltage. Saturation is sometimes 
approximated as a current source between the drain and source terminals. 
A simplified model which is accurate enough for a variety of purposes
gives the following equations:

\[
i_{ds} =
\begin{cases}
0, & \pm V_{gs} < \pm V_{th} \quad \text{(off)} \\
\beta\left[(V_{gs} - V_{th})V_{ds} - \tfrac{1}{2}V_{ds}^{2}\right], & \pm V_{gs} \ge \pm V_{th},\ \pm V_{gd} \ge \pm V_{th} \quad \text{(nonsat.)} \\
\tfrac{\beta}{2}(V_{gs} - V_{th})^{2}, & \pm V_{gd} < \pm V_{th} \le \pm V_{gs} \quad \text{(sat.)}
\end{cases}
\]

Figure 3.1. MOS Transistor Symbols (depletion and enhancement; n channel and p channel).

The voltages are defined in Figure 3.2. β is a constant which depends 
on processing parameters and the geometry of the device. β is equal to
μ C_ox W/L, where μ is the mobility of the charge carriers, C_ox is the 
gate oxide capacitance per unit area, and W and L represent the width and
length of the channel, respectively. In the above equations, where the 
sign is ±, the plus signs are for n channel devices while the minus 
signs are for p channel devices. If the threshold voltage V_th is 
greater than zero, then an n channel transistor is operating in the 
enhancement mode while a p channel transistor is operating in the 
depletion mode. For a negative threshold, an n channel transistor is in the 
depletion mode while a p channel transistor is in the enhancement mode. 
The MOS transistor is symmetric with respect to its drain and source 
terminals. It is customary to assign the drain and source terminals by 
their voltages. For an n channel device, the drain voltage is greater 
than the source voltage. For a p channel device, the source voltage is 
greater than the drain voltage.
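The three-region model and its sign convention can be sketched as a small function. This is an illustration under stated assumptions, not a definitive implementation: the nonsaturated expression is the standard square-law form, the function returns the magnitude of the drain current, the drain and source are assumed to be assigned by their voltages as described above, and the parameter values in the example are hypothetical.

```python
def drain_current(v_gs, v_ds, v_th, beta, n_channel=True):
    """Magnitude of the drain current from the simplified three-region model.

    For a p channel device all voltages are negated first, which is how
    the +/- signs in the region conditions are applied here.
    """
    sign = 1.0 if n_channel else -1.0
    vgs, vds, vth = sign * v_gs, sign * v_ds, sign * v_th
    vgd = vgs - vds  # gate-to-drain voltage

    if vgs < vth:                      # off region
        return 0.0
    if vgd >= vth:                     # nonsaturated region
        return beta * ((vgs - vth) * vds - 0.5 * vds ** 2)
    return 0.5 * beta * (vgs - vth) ** 2   # saturated region


# Hypothetical n channel enhancement device: V_th = 1 V, beta = 2e-5 A/V^2.
print(drain_current(0.5, 5.0, 1.0, 2e-5))   # off
print(drain_current(3.0, 1.0, 1.0, 2e-5))   # nonsaturated
print(drain_current(3.0, 5.0, 1.0, 2e-5))   # saturated
```

The two conducting expressions agree at the boundary V_ds = V_gs - V_th, so the modeled current is continuous across the transition into saturation.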
This model fails to take into account several factors. In particular, 
if the transistor is saturated, then the model predicts that i_ds 
will be constant with respect to V_ds. This is approximately true for 
long channel devices. For shorter channel devices, an effect known as
channel length modulation occurs [30]. Channel length modulation causes
i_ds to increase slightly as V_ds increases. The shorter the channel, the 
more pronounced the effect. The simplified model also fails to account
for the influence of V_bs on drain current [30]. This is the so-called 
body effect. If the body to source voltage is relatively large, then 
the change in threshold voltage is approximately proportional to the 
square root of the body to source voltage. A change in threshold 
voltage causes drain current to vary.

Figure 3.2. Definitions of Voltages and Polarities (terminals: drain, gate, source, body; voltages: V_GD, V_DS, V_GS, V_BS).

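We use a small signal model of the saturated transistor in those 
situations where these effects are important. The model we use is 
basically the same as the model developed in [31]. Drain current is a 
function of V_gs, V_ds, and V_bs. We assume that the transistor is at some 
operating point represented by V̄_gs, V̄_ds, and V̄_bs. The drain current of 
the transistor at this operating point is defined to be I. We may now 
use the Taylor series to represent the drain current:

\[
i_{ds} = I
+ \left.\frac{\partial i_{ds}}{\partial V_{gs}}\right|_{\bar{V}_{ds},\,\bar{V}_{bs}} (V_{gs} - \bar{V}_{gs})
+ \left.\frac{\partial i_{ds}}{\partial V_{ds}}\right|_{\bar{V}_{gs},\,\bar{V}_{bs}} (V_{ds} - \bar{V}_{ds})
+ \left.\frac{\partial i_{ds}}{\partial V_{bs}}\right|_{\bar{V}_{gs},\,\bar{V}_{ds}} (V_{bs} - \bar{V}_{bs})
+ \cdots
\]

Following standard convention, we define g_m (transconductance), g_d, and
g_mb as follows:

\[
g_m = \left.\frac{\partial i_{ds}}{\partial V_{gs}}\right|_{\bar{V}_{ds},\,\bar{V}_{bs}} = \frac{2I}{\bar{V}_{gs} - V_{th}}
\]

\[
g_d = \left.\frac{\partial i_{ds}}{\partial V_{ds}}\right|_{\bar{V}_{gs},\,\bar{V}_{bs}} = \frac{\lambda I}{1 + \lambda \bar{V}_{ds}}
\]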
el e odulati r ]. annel e odulati  
i o g t n eases rt l
or t. pli odel t 
lue bs r t ( ]. i h l
t. l iv , h
l r i at l ort l
r t l . h l-
r t r . 
Y all l odel
tu io her t portant. odel si-
l a odel ] rai r t -
tio  f  
•• • 
 bs, Y su t i t o
r t i t   V v gs• ds• an bs• he raiu. rr t f 
ra t er t i t e a
aylor' t r t: 
ids  I + ~1 av slV g • 
 • •• 
l o i a venti . f m transc ct ce),  
g lo : 
gm = ~ 1  av 
-gslV4 bs V - th •
I  ::, ~1 
-
2,I 
av lV -  ). ds • Vbs 
41
d ids
gmb = av.bs
= ___________yi . ________
Vgs- Vds (Vgs - Vt h )(2<Sf  -  vb s ) 1 /2
X, is the channel length modulation parameter. Its value is given by the 
formula:
X = _!.
vLV f H"'sub ds qN v ds ■ (vgs ■ vth^
where represents the substrate doping concentration. In the 
expression for gm^ , 6 f is the Fermi level of the substrate and y is the 
bulk threshold parameter and is given by the formula,
_ ^esjq^sub
Y Cox
For more information on the derivation of gffl, g(J, and gmb, see [31]. 
The significance of the various device parameters is discussed in [30]. 
The Taylor series expansion of ids can now be rewTitten as *
*ds - I + SmfVgj - Vgs) + gd(Vds - Vds) + gnbf^bs ” ^bs^ + •"
For a very small change from the operating point, we can ignore the
higher order terms in the expansion giving us
*ds  ^+ ®m^gs Vgs) + Sd^ds ^ds^ + Smb^bs ^bs^
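As a minimal sketch of how the small signal parameters might be
evaluated numerically, the following Python fragment computes gm, gd,
and gmb from the closed form expressions and forms the first order
estimate of ids. The function names and the numerical operating point
are illustrative assumptions and are not taken from [30] or [31].

```python
import math


def small_signal_params(i_op, v_gs, v_ds, v_bs, v_th, lam, gamma, phi_f):
    """Return (gm, gd, gmb) at the operating point (Vgs, Vds, Vbs)."""
    gm = 2.0 * i_op / (v_gs - v_th)
    gd = lam * i_op / (1.0 + lam * v_ds)
    gmb = gamma * i_op / ((v_gs - v_th) * math.sqrt(2.0 * phi_f - v_bs))
    return gm, gd, gmb


def linearized_ids(i_op, op, g, v_gs, v_ds, v_bs):
    """First order Taylor estimate of ids about the operating point op."""
    gm, gd, gmb = g
    return (i_op + gm * (v_gs - op[0]) + gd * (v_ds - op[1])
            + gmb * (v_bs - op[2]))


# Hypothetical operating point: I = 100 uA, Vgs = Vds = 3 V, Vbs = -1 V.
I_OP = 100e-6
OP = (3.0, 3.0, -1.0)
G = small_signal_params(I_OP, *OP, v_th=1.0, lam=0.02, gamma=0.4, phi_f=0.3)
```

With these assumed values gm is the largest of the three parameters and
gd the smallest, which is the ordering relied on in the gain
discussion for long channel devices.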
3.2.1. Static NMOS Inverter Model
Figure 3.3 shows the circuit diagram for a standard NMOS inverter
using a depletion load transistor. One of the attributes of this
circuit we are interested in is its input-output transfer
characteristic. In particular, we are interested in the gain of the
inverter at its transition point. The transition point occurs when the
voltage at the output of the inverter, Vout, is equal to Vin, the
voltage at its input. The gain is the derivative of Vout with respect
to Vin.

Figure 3.3. NMOS Inverter Circuit. (Schematic with input VIN and output
VOUT.)
It is quite easy to find the steady state transfer characteristics
of an inverter by equating the drain to source current of the load
transistor with the drain to source current of the driver transistor.
If the inverter is at its transition point, then the voltage at the gate
of the driver transistor must be equal to the voltage at the drain of
the driver transistor. This equality implies that Vgd of the driver
transistor is zero. Since the driver transistor is an enhancement
transistor, the driver transistor is saturated. Depending on the
parameters and geometry of the transistors as well as the supply
voltage, Vdd, the load transistor may be either saturated or
nonsaturated. It can be shown that the load transistor is saturated at
the transition point when

\[
V_{thL} \ge \frac{V_{thD} - V_{dd}}{1 + \beta_r^{-1/2}}
\]

βr is the ratio of βD to βL. If the threshold voltage of the
enhancement transistor is low enough to allow polysilicon to cross
diffusion, then the load transistor must be nonsaturated at the
transition point. If the load is nonsaturated, then the equation for
Vout is

\[
V_{out} = (V_{thL} + V_{dd}) + [V_{thL}^2 - \beta_r (V_{in} - V_{thD})^2]^{1/2}
\]

By taking the derivative of Vout with respect to Vin, we find that the
gain A is:
\[
A = \frac{d(V_{out})}{d(V_{in})}
  = \frac{-\beta_r (V_{in} - V_{thD})}
         {[V_{thL}^2 - \beta_r (V_{in} - V_{thD})^2]^{1/2}}
\]

or

\[
A = \frac{-\beta_r (V_{in} - V_{thD})}{V_{out} - V_{thL} - V_{dd}}
\]

Recognizing that at the transition point Vin = Vout, we find that the
gain at the transition point A* is:

\[
A^* = \frac{-\beta_r (V_{in}^* - V_{thD})}{V_{in}^* - V_{thL} - V_{dd}}
\]

where V*in is the input voltage at the transition point. Its value may
be found from the following formula:

\[
V_{in}^* = \frac{1}{\beta_r + 1}\Big\{\beta_r V_{thD} + V_{dd} + V_{thL}
  + \big[V_{thL}^2 + 2\beta_r\big(V_{dd}(V_{thD} - V_{thL}) + V_{thD}V_{thL}\big)
  - \beta_r(V_{dd}^2 + V_{thD}^2)\big]^{1/2}\Big\}
\]

If we attempt to substitute V*in into the equation for A*, we find that
the relationship between A* and βr is quite complex. An approximation
for V*in given in [29] is:

\[
V_{in}^* \approx V_{thD} - \frac{V_{thL}}{\beta_r^{1/2}}
\]

Substituting this value into the expression for A*, we find

\[
A^* \approx \frac{V_{thL}\,\beta_r^{1/2}}
                 {V_{thD} - V_{thL}(1 + \beta_r^{-1/2}) - V_{dd}}
\]

From this expression, we can see that for large βr, A* is approximately
proportional to βr^(1/2), while for small βr, A* is approximately
proportional to βr. Therefore, to achieve a large value for A*, βr must
be as large as possible. Scaling will have little effect on A*.
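The nonsaturated load expressions can be checked numerically. The sketch
below evaluates the transition point from the closed form expression for
V*in, confirms that Vout equals Vin there, and compares the resulting
gain with the approximation based on [29]. The supply and threshold
values (Vdd = 5 V, VthD = 1 V, VthL = -3 V, βr = 4) and all function
names are assumptions chosen only for illustration.

```python
import math


def vout_nonsat_load(v_in, v_dd, v_thd, v_thl, beta_r):
    """Transfer characteristic with a nonsaturated depletion load."""
    return (v_thl + v_dd) + math.sqrt(v_thl ** 2 - beta_r * (v_in - v_thd) ** 2)


def transition_point(v_dd, v_thd, v_thl, beta_r):
    """Input voltage at which Vout equals Vin."""
    disc = (v_thl ** 2
            + 2.0 * beta_r * (v_dd * (v_thd - v_thl) + v_thd * v_thl)
            - beta_r * (v_dd ** 2 + v_thd ** 2))
    return (beta_r * v_thd + v_dd + v_thl + math.sqrt(disc)) / (beta_r + 1.0)


def gain_at_transition(v_star, v_dd, v_thd, v_thl, beta_r):
    """A* evaluated at the transition point voltage v_star."""
    return -beta_r * (v_star - v_thd) / (v_star - v_thl - v_dd)


def gain_approx(v_dd, v_thd, v_thl, beta_r):
    """A* using the approximate transition point."""
    return (v_thl * math.sqrt(beta_r)
            / (v_thd - v_thl * (1.0 + beta_r ** -0.5) - v_dd))


V_DD, V_THD, V_THL, BETA_R = 5.0, 1.0, -3.0, 4.0   # illustrative values
V_STAR = transition_point(V_DD, V_THD, V_THL, BETA_R)
A_EXACT = gain_at_transition(V_STAR, V_DD, V_THD, V_THL, BETA_R)
A_APPROX = gain_approx(V_DD, V_THD, V_THL, BETA_R)
```

For these assumed values the two gain estimates agree to within a few
percent, which suggests the approximation is adequate for judging how
A* varies with βr.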
If we assume the load transistor is saturated at the transition
point and equate the currents through the load and driver transistors,
we find

\[
V_{in} = V_{thD} - \frac{V_{thL}}{\beta_r^{1/2}}
\]

This equation implies that at the transition point, Vout is not
dependent on Vin. In other words, A* is infinite. This anomaly is due to
the fact that in our simplified transistor model, when a transistor is
saturated, its current is independent of Vds. The model also entirely
ignores the body effect. By using the simplified transistor model when
both transistors are saturated, we have implied that the currents
through the transistors are totally independent of Vout. The dependence
of a saturated transistor's drain current on Vout is fairly small. When
one of the two transistors is nonsaturated, ignoring the effect of Vout
on the saturated transistor's drain current only results in a small
error. When both transistors are saturated, however, the error becomes
quite large, and we are forced to use the small signal model of the
transistor.

Using the small signal model, if we equate the currents, we find
\[
I_L + g_{dL}(V_{dd} - V_{out} - V_{dsL}) + g_{mbL}(-V_{out} - V_{bsL})
  = I_D + g_{mD}(V_{in} - V_{gsD}) + g_{dD}(V_{out} - V_{dsD})
\]

Note that the load current lacks a gm term since VgsL = 0 and the driver
current lacks a gmb term since VbsD = 0. If we solve for Vout, noting
that the operating point currents IL and ID are equal and cancel, we get

\[
V_{out} = \frac{-1}{g_{dL} + g_{mbL} + g_{dD}}
  \big[g_{mbL}V_{bsL} - g_{dL}(V_{dd} - V_{dsL})
  - g_{dD}V_{dsD} + g_{mD}(V_{in} - V_{gsD})\big]
\]

To find A*, we can take the derivative of Vout with respect to Vin
giving us

\[
A^* = \frac{-g_{mD}}{g_{dL} + g_{mbL} + g_{dD}}
\]

For devices with moderately long channel lengths (L > 10^-5 m), one
typically finds that

\[
g_{mD} \gg g_{mbL} > g_{dD} > g_{dL}
\]

This relation allows one to build inverters of reasonably high gain
(gains between 5 and 20 are typical). Due to the complexity of the
expressions for gm, gd, and gmb, it is difficult to predict the precise
behavior of A* during the scaling process. A careful analysis shows
that depending on the scaling rules used, some of the terms in A*
increase and others decrease, all at varying rates. In general, it
appears that A* is nearly invariant to scaling, although it may
decrease slightly if constant voltage scaling is used. If it is
necessary to have an inverter with a very high value of A*, the best one
can do is to use very long channel devices. This strategy minimizes the
values of gdL and gdD. The gain will still be limited by gmbL, which is
not a function of channel length. Although making the channels longer
increases the gain, it also decreases the circuit's density (each
transistor requires more area) and in general decreases the circuit's
speed of operation. As we will later see, speed of operation is
severely limited if the value of βr is large. For this reason,
inverters with saturated loads are preferred over inverters with
nonsaturated loads when large values of A* are required.
Another parameter of interest is the propagation delay of an
inverter. It is shown in [6] that the propagation delay of an inverter,
τd, is approximately

\[
\tau_d \approx \frac{4C_L(V_m - V_{th})}{3 g_m V_m}
\]

where Vm is the voltage swing and CL is the capacitance of the load on
the output of the inverter. If we make the simplifying assumption that
Vm >> Vth then we can write

\[
\tau_d \approx \frac{4C_L}{3g_m}
\]

Actually, this equation is only valid for the output switching from a
logic 1 to a logic 0. It also ignores the fact that the driver
transistor must not only sink the current flowing from the discharging
load capacitance but also the current sourced by the load.

An alternate approach may be taken where an on transistor is
modeled as a resistor. Glasser [32] develops a Thevenin equivalent
circuit for an inverter. The Thevenin equivalent circuit is formed by two
resistors, two voltage sources, and two switches. The opening and
closing of the switches represent the transistors turning on and off.
Each resistor represents the resistance of one of the two transistors
which make up an inverter. Hoyte [33] implemented a simulator based on a
resistive model of the transistor. Hoyte claimed the simulator had an
accuracy in the range of 10 to 15 percent compared to an accuracy range
of 5 to 10 percent for the SPICE circuit simulator. Mead and Conway [29]
also used a resistive model for delay calculations of MOSFET circuits.
In the resistive model of a MOSFET, the channel resistance is
assumed to be proportional to the length to width ratio of the
transistor. This model in turn implies that the resistance of a
transistor is also inversely proportional to β. We are now able to
estimate the time for both rising and falling transitions. Let us
define Vhi to be the highest voltage the circuit output is able to
attain, and Vlo to be the lowest voltage the circuit output is able to
attain. V^0 is the highest voltage that other gates will reliably
interpret as a logic 0, while V^1 is the lowest voltage that other gates
will reliably interpret as a logic 1. Vm for a circuit is the
difference between Vhi and Vlo.
Finally, let us define the following:

\[
Z = \left.\frac{R_L}{R_D}\right|_{V_{in} = V_{hi}}
\qquad
Z' = \left.\frac{R_D}{R_L}\right|_{V_{in} = V_{lo}}
\]

\[
\alpha = \frac{Z + 1}{Z}
\qquad
\alpha' = \frac{Z' + 1}{Z'}
\]
Figure 3.4 shows the resistive model for an NMOS inverter. Writing
the node equation for Vout(t), we get

\[
\frac{V_{dd} - V_{out}(t)}{R_L} = C_L\frac{dV_{out}(t)}{dt} + \frac{V_{out}(t)}{R_D}
\]

We are interested in solving this differential equation for two sets of
initial conditions. One set is for a falling transition while the other
is for a rising transition. Solving for the falling transition, we get

\[
V_{out}(t) = V_{lo} + V_m e^{-\alpha t / R_D C_L}
\]

Solving for the rising transition, we get

\[
V_{out}(t) = V_{hi} - V_m e^{-\alpha' t / R_L C_L}
\]
Inspection of Figure 3.4 shows that the load and driver transistors form
a voltage divider. Therefore, the voltage limits are

\[
V_{lo} = \frac{V_{dd}}{Z + 1}
\]

and

\[
V_{hi} = \frac{V_{dd} Z'}{Z' + 1}
\]

This information may be used to solve for Vm:

\[
V_m = V_{dd}\left[\frac{Z'}{Z' + 1} - \frac{1}{Z + 1}\right]
\]
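One way to gain confidence in the resistive model is to integrate the
node equation directly and compare the result with the closed form
falling transition. The sketch below does this with a simple forward
Euler loop. It assumes Z is the ratio of load to driver resistance, so
that Vlo = Vdd/(Z + 1), and the component values and function names are
illustrative assumptions rather than values from the text.

```python
import math


def v_lo(v_dd, z):
    """Lowest attainable output voltage of the resistive divider."""
    return v_dd / (z + 1.0)


def falling_vout(t, v_dd, z, r_d, c_l, v_start):
    """Closed form falling transition, starting from v_start at t = 0."""
    alpha = (z + 1.0) / z
    lo = v_lo(v_dd, z)
    return lo + (v_start - lo) * math.exp(-alpha * t / (r_d * c_l))


def simulate_falling(t_end, v_dd, z, r_d, c_l, v_start, steps=50000):
    """Forward Euler integration of the node equation with the driver on."""
    r_l = z * r_d
    dt = t_end / steps
    v = v_start
    for _ in range(steps):
        # Current into the node through RL minus current out through RD.
        dv_dt = ((v_dd - v) / r_l - v / r_d) / c_l
        v += dv_dt * dt
    return v


# Illustrative values: Vdd = 5 V, Z = 4, RD = 10 kohm, CL = 0.1 pF.
PARAMS = dict(v_dd=5.0, z=4.0, r_d=10e3, c_l=0.1e-12, v_start=5.0)
V_CLOSED = falling_vout(2e-9, **PARAMS)
V_EULER = simulate_falling(2e-9, **PARAMS)
```

The two results should agree closely, and for long times both settle at
Vlo, as expected from the voltage divider argument.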
Figure 3.4. Resistive Model of an Inverter. (Load resistance RL and
driver resistance RD, with the load capacitance CL at the output node
VOUT.)

Figure 3.5. Voltage Limits vs. Z. (Vhi and Vlo, in units of Vdd, plotted
against Z or Z' from 0 to 20.)

Figure 3.5 is a graph of Vlo and Vhi versus Z and Z', respectively. As
Z and Z' are increased, Vlo decreases and Vhi increases until Vlo and
Vhi eventually approach the ground and power supply voltages. Since Vm
is defined as the difference between Vhi and Vlo, large values of Z and
Z' will maximize Vm. For the NMOS inverter, Z is proportional to βr.
In order to make Z large enough to provide proper separation between
logic levels, it is necessary to make βD greater than βL. This is
usually done by having the W/L ratio of the driver transistor much
larger than the W/L ratio of the load transistor. Unfortunately, this
restriction requires extra area. Z' is infinite for an NMOS inverter
since the driver transistor is off during the rising transition.
We are now in a position to calculate the switching time, τsw. The
switching time is the time taken to switch between V^0 and V^1. To
calculate the rising switching time, set the equation for Vout(t)
(rising transition) equal to V^1 and solve for t. This value of t is
τsw_r. The falling switching time, τsw_f, is found by setting the
equation for Vout(t) (falling transition) equal to V^0, and once again
solving for t. The following values for τsw_f and τsw_r are thus
obtained:

\[
\tau_{sw_f} = -\frac{R_D C_L}{\alpha}
  \ln\left[\alpha\frac{V^0}{V_{dd}} - \frac{1}{Z}\right]
\]

\[
\tau_{sw_r} = -Z R_D C_L \ln\left[\alpha\left(1 - \frac{V^1}{V_{dd}}\right)\right]
\]

The average switching time, τsw_ave, is the average of the rising and
falling switching times. It is given by the following formula:

\[
\tau_{sw_{ave}} = -\frac{R_D C_L}{2}\left\{\frac{1}{\alpha}
  \ln\left[\alpha\frac{V^0}{V_{dd}} - \frac{1}{Z}\right]
  + Z\ln\left[\alpha\left(1 - \frac{V^1}{V_{dd}}\right)\right]\right\}
\]

For small values of Z, the average switching time is dominated by the
falling transition. For large values of Z, the average switching time
is dominated by the rising transition. Figure 3.6 shows a graph of
average switching time (in units of RCL) as a function of Z. In the
graph, it is assumed that the ratio of V^0 to Vdd is 0.4, while the
ratio of V^1 to Vdd is assumed to be 0.6. Notice how the average
switching time rapidly approaches infinity as Vlo approaches V^0. For
this example, the minimum average switching time occurs when Z is
approximately 2. Although a value of Z = 2 may optimize the average
switching time, such a low value is usually unacceptable due to the
resulting inverter's low gain and low noise margin. Therefore, a larger
Z ratio is typically used.
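The location of the minimum can be reproduced with a short numerical
sweep. The sketch below evaluates the falling, rising, and average
switching times in units of RD CL for the same ratios used in the graph
(V^0/Vdd = 0.4 and V^1/Vdd = 0.6), assuming Z' is infinite as it is for
the NMOS inverter. The function name and the sweep range are
assumptions made for illustration.

```python
import math


def switching_times(z, v0_ratio, v1_ratio):
    """Return (fall, rise, average) switching times in units of RD*CL.

    v0_ratio is V^0/Vdd and v1_ratio is V^1/Vdd. Z must be large enough
    that Vlo = Vdd/(Z + 1) lies below V^0.
    """
    alpha = (z + 1.0) / z
    fall = -(1.0 / alpha) * math.log(alpha * v0_ratio - 1.0 / z)
    rise = -z * math.log(alpha * (1.0 - v1_ratio))
    return fall, rise, 0.5 * (fall + rise)


# Sweep Z from 1.6 to 10 in steps of 0.01. Below Z = 1.5 the output can
# never fall to V^0 for these ratios, so the falling time is undefined.
Z_VALUES = [1.6 + 0.01 * k for k in range(841)]
BEST_Z = min(Z_VALUES, key=lambda z: switching_times(z, 0.4, 0.6)[2])
```

The sweep places the minimum close to Z = 2, and the falling time grows
without bound as Z approaches 1.5 from above, which corresponds to Vlo
approaching V^0.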
In this section, we have dealt with an NMOS inverter. The analysis
of a PMOS inverter is identical. Equations for gain, voltage limits,
output voltage, and switching time are all the same except that the
signs of the supply and threshold voltages must be changed to be
appropriate for PMOS devices.

Figure 3.6. Average Switching Time vs. Z Ratio. (Average switching
time, in units of RC, plotted against Z.)

3.2.2. Static CMOS Inverter Model

Figure 3.7 shows the circuit model for a CMOS inverter. The load
transistor is a p channel MOS transistor while the driver transistor is
an n channel MOS transistor.

Figure 3.7. CMOS Inverter Circuit.

At the transition point, both transistors have a Vgd of 0 volts.
Therefore, since VthD is positive and VthL is negative, both
transistors are saturated. If we attempt to use the simplified
transistor model we would once again arrive at the result that A* is
infinite. For this reason, we immediately proceed to the small signal
model. Equating currents, we find that

\[
I_D + g_{mD}(V_{in} - V_{gsD}) + g_{dD}(V_{out} - V_{dsD})
  = -I_L - g_{mL}(V_{in} - V_{dd} - V_{gsL}) - g_{dL}(V_{out} - V_{dd} - V_{dsL})
\]
The load current terms are all negative since the drain to source
current of the load transistor flows in a direction opposite to the
drain to source current of the driver transistor. Also there is no body
effect term for either transistor since the body to source voltage is
always 0 for both transistors. This equation can be solved for Vout,
where the operating point currents cancel, giving us

\[
V_{out} = \frac{1}{g_{dD} + g_{dL}}
  \big[g_{dD}V_{dsD} + g_{dL}(V_{dd} + V_{dsL})
  - g_{mL}(V_{in} - V_{dd} - V_{gsL}) - g_{mD}(V_{in} - V_{gsD})\big]
\]

To find A*, we take the derivative of Vout with respect to Vin giving

\[
A^* = -\frac{g_{mL} + g_{mD}}{g_{dL} + g_{dD}}
\]

In many situations, the load and driver transistors are designed to have
identical characteristics so that the circuit response will be
symmetrical and V*in = Vdd/2. In such situations, gmL ≈ gmD and
gdL ≈ gdD. If this is the case, then A* simply becomes:

\[
A^* = -\frac{g_{mD}}{g_{dD}}
\]

The value of A* is only dependent on the values of gm and gd of the two
transistors. For this reason, CMOS inverters can be built with higher
gain than NMOS inverters. By making the channels very long, gd can be
made quite small. For driver transistors of the same size, the gain of
a CMOS inverter is typically 3 to 4 times greater. During the scaling
process, A* decreases slightly regardless of whether constant voltage or
constant field scaling is used.
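The difference between the two small signal gain expressions can be
made concrete with a short calculation. The sketch below evaluates the
NMOS expression (which includes the body effect term of the load) and
the CMOS expression (which does not) for one set of small signal
parameters. The parameter values are hypothetical and are chosen only
to illustrate the comparison, not to represent a particular process.

```python
def nmos_gain(gm_d, gd_l, gmb_l, gd_d):
    """Small signal gain of the NMOS inverter at its transition point."""
    return -gm_d / (gd_l + gmb_l + gd_d)


def cmos_gain(gm_l, gm_d, gd_l, gd_d):
    """Small signal gain of the CMOS inverter at its transition point."""
    return -(gm_l + gm_d) / (gd_l + gd_d)


# Hypothetical values in siemens, identical for load and driver.
GM, GD, GMB = 1e-4, 6e-6, 1.2e-5
A_NMOS = nmos_gain(GM, GD, GMB, GD)
A_CMOS = cmos_gain(GM, GM, GD, GD)
```

With matched devices the CMOS result reduces to -gm/gd, and for these
assumed values its magnitude is about four times that of the NMOS
inverter because the gmb term no longer appears in the denominator.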
If the response of a CMOS inverter is to be symmetric, then βr = 1.
This implies that the rising and falling propagation delays are roughly
equal. Also, except when the inverter is near its transition point,
only one of the two transistors is on. Because of this, Vm spans the
full range from 0 to Vdd. The expression for τd given in [6] applies to
this case giving us

\[
\tau_d \approx \frac{4C_L}{3g_m}
\]

where the value of gm corresponds to the on transistor. The equations
derived for the NMOS inverter delay and output voltage also apply to the
CMOS inverter. In this case, both Z and Z' are infinite.
3.2.3. Dynamic NMOS Inverter Model
In order to reduce power consumption and increase packing density,
dynamic circuits are becoming quite popular. Since dynamic logic is
typically ratioless, it usually requires much less area than equivalent
static logic. More importantly, dynamic circuits have very low power
consumption. The only power consumed is that required to charge and
discharge nodes. Dynamic circuits are fundamentally different from
static circuits. In a dynamic circuit, information is represented by
the presence or absence of charge on a node. A dynamic circuit
processes information by charging and discharging nodes, and
transferring charge from one node to another. The most important difference
between dynamic and static circuits is that static circuits are
restoring. If an external force disrupts the operation of a static
circuit, the static circuit opposes the disruption. A dynamic circuit is
not able to oppose a disruption. Dynamic circuits are very sensitive to
charge leakage, changes in device parameters, and clock skew. They are
also sensitive to ionizing radiation which can erase the charge stored
on a node. For these reasons, dynamic circuits might be a poor choice
where high reliability is a necessity. On the other hand, since the
power consumption is low (and thus circuit temperature is low) and the
currents tend to be pulses rather than constant (and thus
electromigration is less likely), dynamic circuits may offer advantages
for long term reliability.
A great variety of dynamic circuits exist [34]. Dynamic circuits
range from bootstrap drivers which can drive large capacitive loads to
dynamic CMOS circuits which can implement very complex logic functions.
The circuit we examine is perhaps the simplest dynamic circuit, the two
phase ratioless shift register. Figure 3.8 shows the circuit diagram
for a 1 bit section of the shift register. The circuit uses two
nonoverlapping clocks, φ1 and φ2. An inspection of the circuit shows
that φ1 and φ2 serve the function of both power and ground and that
there is no way for a static current to flow. The circuit samples Vin
while φ1 is high. When φ1 goes low, node 1 is the complement of Vin's
value when φ1 was high. Node 1 is sampled while φ2 is high. When φ2
goes low, Vout becomes the complement of the value of node 1 when φ2
was high. Therefore, when φ1 goes high, Vout has the same value as Vin
on the previous clock pulse. In other words, while φ1 is high, Vout is
a delayed version of Vin.

Figure 3.8. Dynamic Shift Register. (Transistors T1 through T6, clocked
by PHI 1 and PHI 2, between VIN and VOUT.)
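The clocking sequence just described can be traced with a purely
logical sketch. The Python fragment below ignores all voltage levels,
charge sharing, and timing, and treats each stage as an ideal inverter
that updates only while its clock is high. The function name and the
initial output value are assumptions made for illustration.

```python
def shift_register_bit(inputs, initial_out=False):
    """Trace one bit of the two phase shift register.

    inputs holds the logic value on Vin during each successive phi1
    pulse. The returned list holds the value seen on Vout during each of
    those same phi1 pulses.
    """
    v_out = initial_out
    seen_during_phi1 = []
    for v_in in inputs:
        # phi1 high: Vout still holds the result of the previous cycle.
        seen_during_phi1.append(v_out)
        # phi1 goes low: node 1 is left at the complement of Vin.
        node1 = not v_in
        # phi2 high, then low: Vout becomes the complement of node 1.
        v_out = not node1
    return seen_during_phi1


TRACE = shift_register_bit([True, False, True, True])
```

In this idealized trace each value seen on Vout during φ1 is the value
that was on Vin during the preceding φ1 pulse, which is the one cycle
delay described in the text.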
The transistors in the circuit can be broken into two groups, the
inverter transistors which make up the inverters and the sampling
transistors, T1 and T4, which couple together the stages. The inverter
transistors (T2, T3, T5, and T6) are grouped into pairs that form
inverters.

The function of the inverter transistors is to charge and discharge
the inverter's output node. Charging of the output node is primarily
performed by the load transistor while φ is high. If the gate of the
driver transistor is high while φ is high, the driver transistor also
assists in charging the node. The output node is discharged by the
driver transistor while φ is low but only if the gate of the driver
transistor is high. The load transistor is off whenever φ is low.

The sampling transistors serve the purpose of coupling the output
node of one inverter to the input node of the next inverter. The gate
of a sampling transistor is always connected to one of the two clock
signals. When the gate goes high, an inverter is able to sample the
output of the preceding inverter. When the output node of the preceding
inverter is low, then the input node of the current inverter is
discharged. The discharge path is through the sampling transistor and
driver transistor of the preceding inverter. If the output node of the
preceding inverter is high, then some of the charge already stored on
the output node is transferred to the input node. Due to the charge
being split between two nodes, the voltage after sampling at the output
node is less than it was before sampling. The input node voltage after
sampling is always less than the output node voltage it sampled. The
output node capacitance must be much greater than the input node
capacitance; otherwise the input node may never be charged to a
satisfactory level.
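The effect of splitting the charge between the two nodes follows from
conservation of charge. The sketch below computes the common voltage
reached once the sampling transistor has connected the two
capacitances. It is an idealization that ignores the threshold drop and
resistance of the sampling transistor, and the function name and
capacitance values are assumptions.

```python
def shared_voltage(v_out, c_out, c_in, v_in=0.0):
    """Common node voltage after charge sharing between Cout and Cin."""
    return (c_out * v_out + c_in * v_in) / (c_out + c_in)


# Output node at 4 V, input node initially discharged.
V_LARGE_RATIO = shared_voltage(4.0, c_out=10.0, c_in=1.0)  # Cout = 10 Cin
V_EQUAL = shared_voltage(4.0, c_out=1.0, c_in=1.0)         # Cout = Cin
```

When Cout is ten times Cin the sampled level stays within about ten
percent of the original voltage, while equal capacitances halve it,
which is why the output node capacitance must dominate.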
In order for the circuit to operate properly, the clock pulses φ1
and φ2 must be high long enough to charge both the input and output
nodes and discharge the input node. The output node is charged up by a
saturated transistor. Using the formula given in [29] for charging a
capacitance through a saturated transistor gives

    Vout(t) = Vdd - Vth - [illegible]

From this equation, we see that Vout will never be charged above
Vdd - Vth. Figure 3.9 shows the resistive model of the coupling
transistor. We can use this model to calculate the time required to
charge the input node through the sampling transistor. The loop
equation is

\[
-C_{out}\frac{dV_{out}(t)}{dt} = \frac{V_{out}(t) - V_{in}(t)}{R}
  = C_{in}\frac{dV_{in}(t)}{dt}
\]

Solving for Vin(t), we find

\[
V_{in}(t) = V_m \frac{C_{out}}{C_{in} + C_{out}}
  \left[1 - e^{-t (C_{in} + C_{out}) / R C_{in} C_{out}}\right]
\]

In this circuit, Vm = Vdd - Vth since the output node will never be
charged past this point.
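The behavior of the resistive coupling model can be explored by
integrating the two capacitor voltages directly. The sketch below
assumes the output node starts at Vm, the input node starts
discharged, and the sampling transistor is a fixed resistance R. Under
those assumptions the input node voltage rises exponentially toward
Vm Cout/(Cin + Cout) with a time constant set by R and the series
combination of the two capacitances. All names and component values are
illustrative assumptions.

```python
import math


def vin_closed_form(t, v_m, r, c_in, c_out):
    """Input node voltage for Vout(0) = Vm and Vin(0) = 0."""
    c_series = c_in * c_out / (c_in + c_out)
    final = v_m * c_out / (c_in + c_out)
    return final * (1.0 - math.exp(-t / (r * c_series)))


def vin_simulated(t_end, v_m, r, c_in, c_out, steps=50000):
    """Forward Euler integration of the two node voltages."""
    dt = t_end / steps
    v_out, v_in = v_m, 0.0
    for _ in range(steps):
        current = (v_out - v_in) / r      # flows from Cout to Cin
        v_out -= current * dt / c_out
        v_in += current * dt / c_in
    return v_in


# Illustrative values: Vm = 4 V, R = 10 kohm, Cin = 0.05 pF, Cout = 0.5 pF.
ARGS = dict(v_m=4.0, r=10e3, c_in=0.05e-12, c_out=0.5e-12)
V_FORMULA = vin_closed_form(1e-9, **ARGS)
V_NUMERIC = vin_simulated(1e-9, **ARGS)
```

The numerical and closed form results should agree closely, and the
final value shows the same dependence on the relative sizes of Cin and
Cout as the charge sharing argument.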
From these equations, we can calculate the time required to charge
the input and output nodes. Notice that both equations depend on Cout.
Figure 3.9. Resistive Model of Coupling Transistor. (Resistance R
between VOUT(T) on COUT and VIN(T) on CIN.)
The equation for Vin depends not only on Cout, but also on the relative
sizes of Cin and Cout.
3.3. Response of Failed Circuits
We have discussed the types of failures that may occur in MOS
circuits in Chapter 2. We have developed models of MOS circuits in the
previous section. In this section, we use these models to predict the
response of failed circuits.
3.3.1. Response of Circuits with Shorts
Figure 3.10 shows an NMOS and a CMOS inverter. If we ignore the 
power and ground nodes, we see that each type of inverter contains two 
nodes, an input node and an output node. Therefore the possible shorts 
that are internal to an inverter are
(1) Input node shorted to power or ground
(2) Output node shorted to power or ground
(3) Input node shorted to output node
(4) Power shorted to ground
If the input node is shorted to power or ground, we may model this as 
the output node of the previous inverter being shorted to power or 
ground. We therefore only need to consider three cases.
Figure 3.10. NMOS and CMOS Inverters. (Each inverter is drawn with its
two nodes labeled NODE 1 and NODE 2.)

If the output node is shorted to power or ground, then we have the
impedance of the short in parallel with the impedance of the transistor.
If the impedance of the short is much less than the impedance of the
transistor, then the output will be stuck-at 1 or stuck-at 0, depending
on whether the short is to power or ground. If the impedance of the
short is much greater than the impedance of the transistor, then the
short will have no effect on the operation of the inverter. A more
interesting situation occurs if the impedances of the short and
transistor are of the same order of magnitude. The impedance of the
shorted transistor can be replaced with the parallel combination of the
transistor impedance and the short impedance. If the short occurs in an
NMOS inverter between the output node and power, then the value of Z
decreases while the value of Z' increases. These new values for Z and
Z' may be used with the equations already derived for inverters. The
decreased value of Z causes an increase in Vlo and the falling
transition switching time. If the short occurs between the output node
and ground, the value of Z increases while the value of Z' decreases. In
this case, Vhi decreases while the rising transition switching time
increases. For a CMOS inverter, an output node to power or ground short
causes Z or Z', respectively, to be reduced to a specific finite value,
whereas under no failure they can be treated as effectively infinite.
This decrease in Z or Z' will either increase the falling transition
time and increase Vlo or increase the rising transition time and reduce
Vhi. In addition, the CMOS gate now dissipates static power.

To summarize, a short from the output node to ground decreases the
falling transition switching time, increases the rising transition
switching time, and reduces Vhi. A short from the output node to power
decreases the rising transition switching time, increases the falling
transition switching time, and raises Vlo.
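The effect of a resistive short on the voltage limits can be estimated
by placing the short in parallel with the appropriate transistor
resistance. The sketch below does this for an NMOS inverter, assuming
Z is the load to driver resistance ratio with the driver on, Z' is the
driver to load ratio with the driver off, and the off driver has
infinite resistance. The function names, argument names, and resistance
values are assumptions made for illustration.

```python
import math


def parallel(r1, r2):
    """Parallel combination of two resistances (either may be infinite)."""
    return 1.0 / (1.0 / r1 + 1.0 / r2)


def nmos_inverter_limits(r_load, r_driver_on, v_dd, r_short=None, short_to=None):
    """Return (Z, Z', Vlo, Vhi) with an optional output node short."""
    r_driver_off = math.inf
    if r_short is not None:
        if short_to == "power":
            r_load = parallel(r_load, r_short)
        elif short_to == "ground":
            r_driver_on = parallel(r_driver_on, r_short)
            r_driver_off = parallel(r_driver_off, r_short)
        else:
            raise ValueError("short_to must be 'power' or 'ground'")
    z = r_load / r_driver_on
    z_prime = r_driver_off / r_load
    v_lo = v_dd / (z + 1.0)
    v_hi = v_dd / (1.0 + 1.0 / z_prime)   # equals Vdd*Z'/(Z' + 1)
    return z, z_prime, v_lo, v_hi


# Illustrative inverter: RL = 40 kohm, RD = 10 kohm, Vdd = 5 V.
HEALTHY = nmos_inverter_limits(40e3, 10e3, 5.0)
TO_POWER = nmos_inverter_limits(40e3, 10e3, 5.0, r_short=40e3, short_to="power")
TO_GROUND = nmos_inverter_limits(40e3, 10e3, 5.0, r_short=40e3, short_to="ground")
```

For these assumed values the short to power lowers Z and raises Vlo,
while the short to ground makes Z' finite and pulls Vhi well below Vdd,
in agreement with the summary above.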
 
Figure 3.11 shows the situation that exists when the output node is 
shorted to the input node. By recognizing that the short and the driver 
transistor's gate together form a distributed RC network, we see that 
the circuit is of the same form as a phase-shift oscillator. The 
inverter forms the inverting amplifier while the short and driver 
transistor's gate together form the phase-shift network which serves to 
feed a delayed version of the inverter's output back into its input. 
The frequency of oscillation, a>0 is given in [3 5] as;
W0 = RC~
where R is the resistance of the phase-shift network and C is its capa­
citance. The conditions necessary for oscillation are studied in [36],
where it is shown that the gain of the amplifier must be less than -29 
for sustained oscillation. From our discussion of A*, it is fairly 
unlikely that an NMOS inverter would have the required gain for sus­
tained oscillations. This value of gain is not unreasonable for a CMOS 
inverter, especially one that was deliberately designed for high gain. 
In order for an inverter to have high gain, it must be operating near 
its transition point. If the input to the inverter is driven to either
a logic 0 or a logic 1, the inverter will not be able to oscillate.
There are three conditions where an inverter of sufficient gain has the 
potential to oscillate:
(1) The circuit driving the failed inverter is not capable 
of driving the failed inverter's input a significant dis­
tance from its transition point. It is much harder to drive 
such a failed inverter than a good inverter.
(2) The failed inverter's input is coupled by a pass
transistor to the previous stage. Any time the pass
transistor is off, the failed circuit may begin to oscillate.
(3) The circuit driving the failed inverter switches the
failed inverter's input. As the input moves through the
transition region, it may oscillate until the driving circuit
is capable of forcing the input a significant distance
from its transition point. As mentioned in (1), this takes
longer than it would for a good inverter.
Figure 3.11. Output Node to Input Node Short.
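The gain requirement of -29 can be checked numerically. The sketch below is only an approximation of the failed inverter: it assumes the textbook lumped network of three identical, unloaded series-C/shunt-R sections, whereas the short and the gate really form a distributed network. The function names and the component values (R_SHORT, C_GATE) are hypothetical and chosen for illustration only.

```python
import math

def ladder_feedback(omega, r, c, sections=3):
    """Unloaded voltage transfer of identical series-C / shunt-R sections."""
    z_series = 1 / (1j * omega * c)   # impedance of each series capacitor
    y_shunt = 1 / r                   # admittance of each shunt resistor
    # Chain (ABCD) matrix of one section: series element, then shunt element.
    sa, sb, sc, sd = 1 + z_series * y_shunt, z_series, y_shunt, 1
    a, b, c_, d = 1, 0, 0, 1          # start from the identity matrix
    for _ in range(sections):
        a, b, c_, d = (a * sa + b * sc, a * sb + b * sd,
                       c_ * sa + d * sc, c_ * sb + d * sd)
    return 1 / a                      # open-circuit Vout / Vin

def oscillation_frequency(r, c):
    """Angular frequency at which the three-section network inverts the signal."""
    return 1 / (r * c * math.sqrt(6))

def can_sustain_oscillation(amplifier_gain, feedback):
    """Loop gain (amplifier gain times feedback) must be at least 1."""
    return (amplifier_gain * feedback).real >= 1

# Hypothetical values: a 10 kilohm short and 50 fF of gate capacitance.
R_SHORT, C_GATE = 10e3, 50e-15
omega0 = oscillation_frequency(R_SHORT, C_GATE)
beta = ladder_feedback(omega0, R_SHORT, C_GATE)
```

At ω0 the network attenuates the signal by a factor of 29 and inverts it, which is why an inverting amplifier needs a gain of magnitude at least 29 to close the loop.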
If the gain of the inverter is insufficient to sustain oscillation,
the result of an input to output short is to shift the inverter's logic
levels and increase its switching time. The exact nature of these
shifts is dependent on the impedance of the short, Vin, the values of
the impedances of the load and driver transistors, and the impedances of the
load and driver transistors driving the failed inverter. If the
impedance of the short is very large (at least a factor of 10 larger
than the transistor's impedance), it will have little or no effect on
the circuit. As the short impedance becomes smaller, the difference
between Vin and Vout becomes smaller and smaller. For an impedance of
0, Vin equals Vout. Depending on the impedances of the driving
inverter, the failed inverter, and the short, Vout will range anywhere
from ground to the supply voltage. A situation of particular interest 
occurs if the driving inverter and failed inverter are identical and the 
short resistance is small. Figure 3.12 shows the resistive models for 
the failed inverter including the output stage of the driving inverter. 
Two cases are shown, namely a logic 0 and logic 1 input to the driving 
inverter.
Figure 3.12. Resistive Model of Failed Inverter: (A) Logic 0 Driver Input, (B) Logic 1 Driver Input.
If the input to the driving inverter is a logic 0, then we effectively
have the parallel combination of the load transistors of both
inverters trying to pull Vout high while the driver transistor of the
failed inverter tries to pull Vout low. By setting the load currents
equal to the driver currents, we find that
$$V_{out} = V_{thD} + V_{thL}\,[2\beta_R]^{1/2}$$
In deriving this equation, we have assumed that the load transistor is
saturated. As this equation shows, the output (which should be a logic
1) is significantly lower than Vdd.
If the input to the driving inverter is a logic 1, then the load
transistors of both inverters try to pull Vout high, while the driver
transistors of both inverters will be trying to pull the output low. If
we assume the current through the failed inverter's driver transistor is
very small, then the steady state value of Vout is the same as that of an
inverter which has a value of Z which is half of the original inverter's
value of Z. Therefore, the value of Vhi is lowered while the value of
Vlo is raised. When the input to the driving inverter is a logic 1, it
is possible for Vout to become greater than Vout when the input to the
driving inverter is a logic 0. Furthermore, the speed of operation of
the failed circuit is reduced considerably. This reduction is due to
both the degraded values of Vhi and Vlo and the fact that the failed
inverter output must drive the load capacitances of both inverters.
One interesting variation occurs if the input node is shorted to
the output node and, simultaneously, the connection from the previous
stage is open circuited. As long as the impedance of the open circuit
to the previous stage is very large, the input and output nodes of the
failed inverter charge to VT regardless of the impedance of the short.
 
There are two likely ways to get a simultaneous open to the previous 
stage and short from input node to output node. One way is for the gate 
of an inverter's driver transistor to be coupled to the previous stage 
with a pass transistor. Whenever the pass transistor is off, the open- 
short condition would exist. A second way to get a simultaneous open- 
short would be for metal migration to cause an open. The accumulated 
metal could then form a short.
A short from power to ground can have catastrophic consequences. 
If the impedance of the short is very small, then the voltage difference 
between the power and ground lines would become quite small. In this 
case, the output of all circuits supplied by these power and ground 
lines would be unpredictable. In order for the power and ground line 
voltages to change appreciably, there would have to be a large current 
flowing through the short. Electromigration and/or ohmic heating of the 
short and power and ground lines would lead to one or more of these 
lines almost instantly failing (most likely the short) which would allow 
the power and ground lines to return to their original values. If the 
impedance is large enough not to reduce power supply voltage significantly,
the short should have little effect, at least in the short run.
The short increases the power dissipated from the integrated circuit and 
thus raises the temperature locally. It may also encourage electromi­
gration to occur along power or ground lines which must now carry 
heavier currents than they were designed for. A power-ground short in a 
CMOS circuit due to latchup may be able to sustain heavy currents for a 
long period of time before the latched CMOS device or a power or ground 
line fails.
 
We have now studied all possible internal shorts in NMOS and CMOS 
inverters. An NMOS NOR gate behaves in a similar manner for internal 
shorts. NMOS NAND gates and CMOS gates have a structure of stacked 
transistors. In such a stack, the drain of one transistor is connected
to the source of the next transistor. The first transistor in the stack 
has its source connected to the power or ground node. The drain of the 
top transistor in the stack is connected to the output node. A drain to 
source short of any of the transistors in the stack may be analyzed by 
the same procedures as those used for the NMOS and CMOS inverters. The 
most difficult situation to analyze occurs when a gate to drain short 
occurs. The analysis is basically the same as for the inverter except 
that more transistors must be considered. The results will be the same; 
voltage levels and speed of operation will be degraded.
Dynamic circuits are much more susceptible to shorts than static 
logic circuits. Dynamic logic depends on the ability to store charge on 
the stray capacitance of nodes. Any short, whether to another node, a
clock signal, or the substrate (ground), will allow charge to leak on or
off the node. If enough charge enters or leaves a node, the information
stored there is destroyed. An RC time constant determines the time
required to charge or discharge a shorted node, where R is the resistance
of the short and C is the node capacitance. If the RC time constant
is much longer than the clock pulse, the circuit should be unaffected.
If RC is of the same order of magnitude as the clock pulse or
smaller, the short will be able to alter the voltage of a node significantly.
If the short is almost able to completely charge or discharge a
node during one clock pulse, the node will appear to be stuck-at 1 or
 
stuck-at 0 depending on whether the short is charging or discharging the 
node. If the short is unable to charge or discharge the node completely 
during a clock pulse, but is still able to alter the node voltage significantly,
then the circuit may or may not operate correctly. This
situation is somewhat analogous to the shifting of Vlo and Vhi in static
circuits. The most critical determinant of maximum clock speed for this
circuit is the time taken to charge and discharge the input node of the
inverter. A short occurring at either the input or output node significantly
increases the time required to perform these operations.
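The RC argument for dynamic nodes can be sketched as follows, assuming a single-pole RC response for the shorted node. The thresholds used for "much longer" (a factor of 10) and "almost completely" (90 percent of the swing) are our own illustrative assumptions, as are the function names.

```python
import math

def node_voltage_after(t, v_start, v_short, r_short, c_node):
    """Voltage of a storage node after time t when a short of resistance
    r_short ties it to a source held at v_short (single-pole RC response)."""
    tau = r_short * c_node
    return v_short + (v_start - v_short) * math.exp(-t / tau)

def classify_short(r_short, c_node, t_pulse, much_longer=10.0, nearly_all=0.9):
    """Coarse classification of a short on a dynamic node.

    much_longer and nearly_all are assumed thresholds, not values from the text.
    """
    tau = r_short * c_node
    if tau >= much_longer * t_pulse:
        return "unaffected"
    # Fraction of the full voltage swing completed during one clock pulse.
    fraction = 1 - math.exp(-t_pulse / tau)
    if fraction >= nearly_all:
        return "appears stuck-at"
    return "marginal"
```

For example, with a hypothetical 50 fF node and a 100 ns clock pulse, a 1 gigohm short leaves the node unaffected, a 100 kilohm short makes it appear stuck, and a 2 megohm short falls in the marginal region where the circuit may or may not operate correctly.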
In addition to internal shorts, it is also possible for external 
shorts to occur between inverters. We again treat external shorts as if 
they occur between output nodes. Let us first assume that the short 
does not introduce feedback. That is, neither of the shorted outputs is 
a function of the other. If both outputs are the same value, the
behavior of the two outputs is generally unaffected. If the outputs 
have complementary values, several possibilities may occur. If the 
impedances of the short and one of the inverters are much less than the 
impedances of the other inverter, then the inverter with the larger 
impedances will follow the output of the other inverter. If the 
impedances of both inverters are similar, then the exact behavior will 
depend primarily on the impedance of the short. The two load transis­
tors, coupled by the impedance of the short will be trying to pull both 
output nodes high while one of the driver transistors will be trying to 
pull the output nodes low. See Figure 3.13. The voltage at nodes 1 and 
2 will be:
 
$$V(1) = V_{dd}\,\frac{Z_1R_1 + Z_2R_2 + R_{short}}{Z_1Z_2R_2 + Z_1R_{short} + Z_1R_1 + Z_2R_2 + R_{short}}$$
$$V(2) = V_{dd}\,\frac{R_{short}}{Z_2R_2 + R_{short}} + V(1)\,\frac{Z_2R_2}{Z_2R_2 + R_{short}}$$
Figure 3.13. Resistive Model of Two Outputs Shorted Together.
If the two inverters are identical and the short resistance is negligible, then
$$V(1) \approx V(2) \approx \frac{V_{dd}}{Z/2 + 1}$$
In this case, the effective value of Z has been reduced by one-half. In 
addition to degrading the steady state output values, a short also 
reduces the speed of a falling transition since both inverters' load 
capacitors have to be discharged by one driver transistor. As mentioned 
in Chapter 1, many people use a wired AND operation assumption to model 
shorts between outputs. An examination of the equations for V(l) and 
V(2) shows the wired AND operation assumption is only justified if the 
value of the short resistance is small and R1 and R2 are both much
smaller than either Z1R1 or Z2R2.
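As a cross-check on the expressions for V(1) and V(2), the sketch below evaluates them and compares the result with a direct solution of Kirchhoff's current law at the two nodes. It assumes the resistive model implied by the equations: loads Z1R1 and Z2R2 to Vdd, the driver R1 of inverter 1 conducting to ground, and the driver of inverter 2 off. The function names and numerical values are hypothetical.

```python
def shorted_outputs(vdd, z1, r1, z2, r2, r_short):
    """Closed-form node voltages for two shorted outputs (driver 1 on, driver 2 off)."""
    num = z1 * r1 + z2 * r2 + r_short
    den = z1 * z2 * r2 + z1 * r_short + z1 * r1 + z2 * r2 + r_short
    v1 = vdd * num / den
    v2 = (vdd * r_short + v1 * z2 * r2) / (z2 * r2 + r_short)
    return v1, v2

def shorted_outputs_nodal(vdd, z1, r1, z2, r2, r_short):
    """Same circuit solved from the two node equations (requires r_short > 0)."""
    g_load1, g_drv1 = 1 / (z1 * r1), 1 / r1
    g_load2, g_short = 1 / (z2 * r2), 1 / r_short
    # Node 1: (g_load1 + g_drv1 + g_short) v1 - g_short v2 = g_load1 vdd
    # Node 2: -g_short v1 + (g_load2 + g_short) v2 = g_load2 vdd
    a11, a12, b1 = g_load1 + g_drv1 + g_short, -g_short, g_load1 * vdd
    a21, a22, b2 = -g_short, g_load2 + g_short, g_load2 * vdd
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

# Hypothetical identical inverters: Z = 4, R = 10 kilohm, Vdd = 5 V.
v1, v2 = shorted_outputs(5.0, 4.0, 10e3, 4.0, 10e3, 1e3)
n1, n2 = shorted_outputs_nodal(5.0, 4.0, 10e3, 4.0, 10e3, 1e3)
```

With a 1 kilohm short, V(2) sits noticeably above V(1), so the two outputs do not share a single level. This is one way to see why the wired AND assumption only holds when the short resistance is small.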
If a short occurs between two output nodes where one of the outputs 
is a function of the other, then we have feedback. If this feedback 
loop includes an odd number of inversions, oscillation is possible. Let 
us define the looped inverter to be the inverter whose output is a func­
tion of the other inverter's output (that is the inverter inside the 
feedback loop). We refer to the other inverter as the unlooped 
inverter. In order for oscillation to occur, the looped inverter must 
have a driver transistor with much lower impedance than that of the 
unlooped inverter's load. An algorithm is given in [37] to predict
whether or not feedback bridging faults will lead to oscillation.* The
algorithm is useful for determining whether or not a complicated circuit 
will oscillate with a given input vector. Unfortunately this work is 
based on the wired AND or wired OR assumptions which may not be applica­
ble. If the inverter which is out of the feedback loop has much lower 
impedances than the inverter in the feedback loop, then the inverter in 
the feedback loop's output will follow the other inverter's output.
If the feedback loop encloses an even number of inversions, the 
circuit will generally not oscillate. The inverters inside the loop 
will form a latch circuit. If the unlooped inverter has lower 
impedances than the looped inverter to which it is shorted, then the 
latch will change state each time the input to the unlooped inverter 
changes. If the looped inverter has the lower impedances, then the out­
put of both inverters will appear to be stuck-at 0 or 1, depending on 
what value is stored in the latch. Under very unusual circumstances, it 
is possible for the latch circuit to exhibit metastable behavior. This 
behavior is discussed later in Section 3.4.2.
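The cases discussed above for feedback shorts can be collected into a small decision table. This is a simplification under stated assumptions: each inverter is summarized by a single representative impedance, and "much lower" is taken to mean a factor of 10, which is our own choice. The function name and return strings are hypothetical.

```python
def classify_feedback_short(inversions, z_looped, z_unlooped, dominance=10.0):
    """Qualitative outcome of a short between a looped and an unlooped output.

    inversions: number of inversions enclosed by the feedback loop.
    z_looped, z_unlooped: representative output impedances (assumed model).
    dominance: assumed factor by which one impedance must be lower to dominate.
    """
    odd = inversions % 2 == 1
    looped_dominates = z_looped * dominance <= z_unlooped
    unlooped_dominates = z_unlooped * dominance <= z_looped
    if odd:
        if looped_dominates:
            return "oscillation possible"
        if unlooped_dominates:
            return "looped output follows unlooped output"
        return "not covered by the text"
    if unlooped_dominates:
        return "latch changes state with unlooped input"
    if looped_dominates:
        return "both outputs appear stuck-at"
    return "not covered by the text"
```

The intermediate cases, where neither inverter clearly dominates, are returned as not covered rather than guessed.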
* Questions have been raised about several of the theorems in this paper
(see [38]). The disputed theorems all concern the detection and location
of bridging faults, not the conditions necessary for oscillation.
3.3.2. Response of Circuits with Opens
Opens that occur in series with transistor channels are very easy
to analyze. For all three types of circuits we have studied, it is only
necessary to replace the open transistor with the series combination of
the channel resistance and the open resistance. This new resistance may
now be used in the equations we have already derived for switching speed
and voltage limits for each of the circuit types. If the open occurs in
the driver transistor, Z is decreased and Z' is increased. If the open
occurs in the load transistor, Z is increased and Z' is decreased. If
the resistance of the open is very large (i.e., much greater than the 
resistance of a transistor), the output of the inverter will either be 
stuck-at 0 or stuck-at 1, depending on whether the open is in series 
with the load or driver transistor, respectively. As discussed in 
Chapter 1, high resistance opens in CMOS NAND or NOR gates and NMOS
gates fed with pass transistor logic result in stuck-open type faults.
If a high resistance open occurs in NMOS NAND or NOR gates, either one
of the inputs appears to be stuck-at 0 (driver transistor open), or the
output appears to be stuck-at 0 (load transistor open). In the dynamic
circuit, high resistance opens cause the output node to appear to be
either stuck-at 1 (driver transistor open), or stuck-at 0 (load transistor
open). A high resistance open of the coupling transistor in general
leads to unpredictable behavior.
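The effect of a series open on Z can be illustrated with a purely resistive sketch. We assume here that Z is the ratio of load resistance to driver resistance (so the load is Z times R, as in the resistive models above) and that the low output level is set by the resulting divider. The function names and values are hypothetical.

```python
def effective_z(z, r_driver, r_open, location):
    """Impedance ratio after adding an open of resistance r_open in series
    with the driver or the load channel (assumes Z = R_load / R_driver)."""
    r_load = z * r_driver
    if location == "driver":
        return r_load / (r_driver + r_open)   # Z decreases
    if location == "load":
        return (r_load + r_open) / r_driver   # Z increases
    raise ValueError("location must be 'driver' or 'load'")

def static_low_output(vdd, z):
    """Output voltage with the driver conducting, from the resistive divider."""
    return vdd / (z + 1)
```

For a hypothetical inverter with Z = 4, a 10 kilohm driver, and Vdd = 5 V, the healthy low level is 1 V. A 90 kilohm open in the driver reduces Z to 0.4 and raises the low level to about 3.6 V, which is the trend toward a stuck-at 1 output described above.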
Low resistance opens in series with the gate terminal of a transistor
significantly reduce the speed of NMOS and CMOS inverters. For an
NMOS inverter, the capacitance of the driver transistor's gate must be
charged through the open. In a CMOS inverter, two cases are possible.
If the open affects both driver and load transistors, then the capacitance
of both transistors must be charged through the open. If the
open affects only one of the transistors, then the affected transistor
turns off and on more slowly than the other transistor. An open gate
terminal to the depletion load transistor of an NMOS inverter has very
 
little effect on circuit operation [4]. The primary reason the gate-to-source
connection has so little effect on circuit operation is
capacitive feed-through from the source terminal to the gate terminal. 
A parasitic capacitance exists between the gate and source of a transis­
tor. Any rapid change at the source terminal is coupled to the gate 
terminal. It is difficult to predict the circuit behavior if a deple­
tion transistor's gate should open completely. If no signal levels 
change on the chip for a long period of time, the charge will eventually 
leak off the gate [28]. Charge leakage causes an n channel device to be 
off and a p channel device to be on. This analysis, however, fails to 
account for capacitive feed-through. Any transistor whose drain or 
source is connected to a clock or other rapidly changing signal will 
experience capacitive feed-through to the gate. As a result, the gate 
voltage will be constantly changing. Whether or not the gate voltage 
ever gets above (below for a p channel device) the threshold voltage 
will depend on the particular details of the circuit. Since a large 
percentage of the transistors in dynamic circuits have a source or drain 
connected to a clock signal, capacitive feed-through is an important 
factor. If the gate is connected to a long interconnection, and the 
open occurs at the end of the interconnection away from the gate, then 
the interconnection will act as an antenna collecting all the noise and 
other signals in the vicinity. This essentially random signal drives 
the inverter which in turn amplifies it and distributes it to other cir­
cuits.
 
3.3.3. Response of Circuits to Noise
During normal operation, an integrated circuit is constantly
exposed to noise. This noise is of two types: random noise due to various
physical processes (we call this physical noise) and capacitive or
inductive coupling of signals as well as any external electrical distur­
bances (we call this coupling noise). The most common types of physical 
noise are thermal noise, shot noise, and quantum noise [39]. These 
types of noise are usually modeled as an independent random white Gaus­
sian process. Wallmark [40] has developed a statistical model for capa­
citive and inductive coupling. For large circuits, especially those 
consisting of a large percentage of random logic, he shows that the cou­
pling noise may also be considered as another random noise source. He
also treats device variations (random fluctuations in geometric and pro­
cess parameters) in a similar fashion. Under Wallmark's assumptions, 
the total rms voltage due to all sources of noise is three to four times 
the value of physical noise alone.
For proper operation, the circuit must be designed to work
correctly in the presence of noise. Although it is impossible to make a 
circuit totally immune to noise, it is possible to make a circuit relatively
insensitive to noise. Usually this is done by making the absolute
value of the gain of a circuit small both for values of Vin close
to logic 1 and for those close to logic 0, while the absolute value of
the gain at the transition point is made as large as possible. The
absolute value of the gain for Vin equal to a logic 0 or 1 must be less
than 1. Otherwise, noise is amplified rather than suppressed. Ideally,
 
the gain at these points should be close to zero. The absolute value of 
the gain at the transition point should be as large as possible to pro­
vide a sharp transition from a logic 0 to a logic 1. Other techniques 
for maximizing noise immunity are using a large supply voltage and making
the transition point close to the midpoint of the voltage swing.
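The requirement that the gain magnitude be less than 1 at the logic levels can be illustrated with a small-signal sketch. It assumes the noise is small enough that each stage acts as a linear gain around its operating point. The gain values are hypothetical.

```python
def propagate_noise(v_noise, stage_gains):
    """Noise amplitude after passing through a chain of stages, each treated
    as a linear small-signal gain around its operating point (an assumption)."""
    amplitude = abs(v_noise)
    for gain in stage_gains:
        amplitude *= abs(gain)
    return amplitude

# Hypothetical small-signal gains at a valid logic level.
suppressed = propagate_noise(0.1, [-0.2, -0.2, -0.2])   # |gain| < 1: noise dies out
amplified = propagate_noise(0.1, [-1.5, -1.5, -1.5])    # |gain| > 1: noise grows
```

With a gain magnitude of 0.2 per stage, a 0.1 V disturbance is reduced below a millivolt after three stages. With a gain magnitude of 1.5 it more than triples.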
Long-term exposure to radiation and hot electron injection 
increases a circuit's susceptibility to noise. As mentioned in Chapter 
2, noise levels in transistors increase after exposure to radiation. In 
addition, radiation exposure and hot electron injection cause shifts in
the threshold voltages and a decrease in transconductance. These parameter
shifts may result in values of Vlo and Vhi closer to the transition
point. As Vlo and Vhi move closer to the transition point, the circuits
fed by an affected gate tend to amplify the noise to a greater extent.
A reduction in transconductance also tends to reduce the gain of the 
inverter. Lower gain also reduces an inverter's noise immunity.
For well-designed circuits operating normally, the effect of noise 
should be soft errors very similar to radiation-induced soft errors. On 
very rare occasions, a noise spike may be large enough to change the 
value of an output. As is the case for radiation-induced soft errors, 
dynamic circuitry is more susceptible than static circuitry.
3.4. Response of Good Circuits to the Output of a Failed Circuit
As shown in the last section, there are a variety of ways a circuit 
may behave under failure. In some cases, the output of a circuit is a 
legal logic value although it may not be the correct one, e.g., outputs 
may exhibit stuck-at or stuck-open behavior. In these circumstances, we
 
know the response of good circuits which must process the failed
circuit's output. The output from the failed circuit is a legal logic
value and is processed just as any legal logic values from good circuits
would be processed.
However, many of the failures that we have examined may result in 
outputs which are not legal logic values. Under a variety of failures, 
it is possible to produce a steady state output which is between V0 and
V1 (undefined constant logic value). Another possibility is a timing
failure. Synchronous systems are designed so that all signals are 
steady when a clock pulse or edge occurs. A timing error may violate 
this constraint. A related type of failure is oscillation. When oscil­
lation occurs, the steady signal constraint is once again violated. All 
three of these types of failures have one important attribute in common;
circuits which process these signals are unable to interpret them reli­
ably as being either a logic 0 or logic 1.
3.4.1. Metastable Operation
During normal operation, a system undergoing a state transition 
shifts from one stable state to another. Unfortunately, under certain 
circumstances, it is possible for the system to be left in a metastable 
state. A system is at equilibrium when it is in either a stable or
metastable state. In a stable state, a small disruption will cause the
system to react in a manner which restores the system to its original 
state. The larger the disruption, the larger the restoring force until, 
for a disruption which is large enough, the system changes state. In a 
metastable state, if a disruption is applied, the system will react by
 
forcing itself further from its metastable equilibrium condition toward 
some stable state. Eventually, the system will come to rest in a stable 
state. Unfortunately, the system may remain in a metastable state for 
an unbounded time period.
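The unbounded resolution time can be illustrated with a toy model. We assume, purely for illustration, a normalized bistable system dx/dt = x - x^3, which has stable states at x = -1 and x = +1 and a metastable equilibrium at x = 0. The function name and the settling criterion are hypothetical.

```python
import math

def resolution_time(x0, x_done=0.9):
    """Time for the toy system dx/dt = x - x**3 to move from a displacement x0
    near the metastable point to |x| = x_done.

    Closed form from integrating du/dt = 2u(1 - u) with u = x**2;
    valid for 0 <= |x0| < x_done < 1.
    """
    x0 = abs(x0)
    if x0 == 0.0:
        return math.inf   # exactly balanced: never leaves the metastable point
    ratio = (x_done**2 / (1 - x_done**2)) * ((1 - x0**2) / x0**2)
    return 0.5 * math.log(ratio)
```

In this model the resolution time grows as the logarithm of 1/x0, so each thousandfold reduction of the initial displacement adds about 6.9 normalized time units, and a perfectly balanced start never resolves.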
As an example of such a system, consider a bistable element. Such 
an element can store one bit of information. A power or energy function 
is associated with any such element. For an inverted pendulum or other 
mechanical bistable element, this associated function is the system's 
potential energy. For a flip-flop, this associated function is called 
the dissipative function (see [29] for a discussion of the dissipative 
function). A stable state is represented by a local minimum in the 
element's associated function. A metastable state is represented by a 
local maximum. A bistable element must have two local minima 
corresponding to its two stable states. For any continuous function, 
however, between any two local minima, there must also exist at least 
one local maximum. Therefore, between any two stable states, there must 
always be a metastable state.*
* By flip-flop, we mean a static restoring memory element. We do not
use the term flip-flop for a dynamic memory element where information is
stored as charge on a transistor. A dynamic memory element has a constant
dissipative function. Such an element has an infinite number of
stable states. Any disruption to such an element, no matter how small,
will simply move the element to another of the infinitely many stable
states. Due to the flatness of the dissipative function, the element
has no restoring or nonrestoring response to a disruption. In a dynamic
memory element, there is no distinction between stable and metastable
states.
A certain amount of energy or power (depending on the memory element)
is required to switch the state of a flip-flop. If the input signal
does not have quite enough power or energy to complete the flip-flop's
transition, the flip-flop may be left in a metastable state.
Such a pulse is called a runt pulse. A runt pulse lacks the duration 
and/or amplitude required to change the flip-flop state reliably. In a 
properly designed system, there are only two ways that the system will 
be left in a metastable state: a synchronization failure, or a component 
failure. In both cases, a runt pulse is presented to a flip-flop.
Synchronization failures result when a synchronous system must 
accept a nonsynchronous input. Such an input may change at any time 
with respect to the system clock. For a synchronous system to work 
properly, all inputs must be stable before the clock pulse arrives. In 
order to accomplish this, the asynchronous signal is usually presented 
first to a clocked flip-flop. Unfortunately, as we have already shown, 
the flip-flop has a metastable state. If the asynchronous signal should 
change during a very small window with respect to the clock, the flip- 
flop may be left in a metastable state. Such a situation is referred to 
as a synchronization failure. Notice that such a synchronization 
failure can occur without any part of the circuit experiencing a physi­
cal failure.
Several suggestions have been proposed to prevent synchronization 
failures. One approach is to design an asynchronous network to perform 
the synchronization function. One such network is a time-bound arbiter. 
Unger [41] has developed a technique for designing asynchronous networks 
including time-bound arbiters. Unfortunately, Unger's technique depends 
on the use of a device called an inertial delay. There is some question
as to the realizability of an inertial delay. Marino [42] has investi­
gated three proposed inertial delay designs and has demonstrated that 
they are all unreliable. In addition, Strom [43] has shown that a 
time-bound arbiter and an inertial delay are equally realizable since an 
inertial delay may be built from a time-bound arbiter and vice versa.
In a more general study, Marino [44] has proposed an extremely gen­
eral model for any system that exhibits sequential behavior. The only 
restriction imposed by this model is that the system is nonanticipatory. 
Using this model, Marino has shown that unless certain relationships 
between the inputs can be guaranteed, metastable operation is unavoid­
able.
Several techniques have been proposed to eliminate synchronization 
failures. The only proposed technique which will prevent synchroniza­
tion failure was first suggested by Chaney et al. [45]. This method 
uses flip-flops to synchronize the asynchronous inputs. Instead of 
attempting to prevent the input flip-flops from entering a metastable 
state, circuitry is included to detect a metastable state. If this cir­
cuitry detects a metastable state, the clock signal is delayed until the 
metastable state is resolved. Such an approach is clearly not satisfac­
tory for all applications, since an adjustable frequency clock is 
required. In addition, the maximum clock period is unbounded since the 
time for a flip-flop to exit from its metastable state is also 
unbounded. A practical compromise is to reduce the probability of
synchronization failure below some "acceptable" level [46,47,48].
Another source of metastable operation is component failure. If a
failure occurs, the timing and/or voltage levels of signals produced by
the failed components may present a runt pulse to a flip-flop.
Oscillation at the input of a flip-flop can also cause runt pulses and
thus metastable operation. Regardless of whether the failure is a
synchronization failure or a component failure, metastable operation is
caused by a runt pulse being presented to a flip-flop.
Researchers have found two modes of metastable behavior in flip-flops
[49,50]. In one mode of behavior, the outputs of the flip-flop remain
for some time at a level between $V^0$ and $V^1$. In the second mode
of behavior, the outputs of the flip-flop oscillate. In both cases,
other gates receiving the outputs of the flip-flop will be unable to
interpret the flip-flop's state reliably. Some gates may interpret the
state as a logic 0, while others may interpret the state as a logic 1.
Still others may themselves produce an illegal logic output.
A variety of researchers have examined the probability of failure
due to metastable operation [29,46,50,51]. Unfortunately, in these
prior studies, only synchronization failure is considered as a cause of
metastable operation. Generally, a synchronizing flip-flop is
considered for processing an asynchronous input occurring at some
average frequency f. In [29], it is estimated that metastable operation
will occur if the asynchronous input changes within a window of width
$0.1\,\tau_{SW_{ave}}$. The value of $\tau_{SW_{ave}}$ is that of the
cross-coupled gates which form the flip-flop. Therefore, metastable
operation occurs with a probability of $0.1 f \tau_{SW_{ave}}$ per
synchronization event. Clearly, the faster
the gate used in the synchronizer flip-flop, the lower the probability 
of the flip-flop entering a metastable operation. On the other hand, 
this advantage is lost if a faster synchronizer is forced to synchronize 
more events (i.e., f is higher).
The probability of a synchronizer leaving a metastable state before
some time t is usually modeled as a Poisson process with rate $\rho$
[29,51]. Under a number of simplifying assumptions, it can be shown
that [51,46]
$$\rho = \frac{A}{2\tau_{SW_{ave}}}$$
Therefore, for everything else equal, the lower the value of
$\tau_{SW_{ave}}$ (i.e., the faster the flip-flop) and the higher the
value of $A$, the lower the probability of failure from synchronization
failure. The probability of a synchronizer being in a metastable state
at time t, p(t), is
$$p(t) = 0.1 f \tau_{SW_{ave}}\, e^{-At/(2\tau_{SW_{ave}})}$$
These equations should be used with caution. They were derived under a
number of simplifying assumptions. Lacroix et al. [52] found a
difference of three orders of magnitude in the average length of
metastable operation for 7475 D latches from different vendors. It is
quite doubtful that the gain-bandwidth product would vary enough to
account for this difference. Pechoucek [50] found that, in a random
sample of 74S74 flip-flops, the flip-flops with the smallest delay
exhibited longer
average length of metastable operation than those flip-flops with a 
larger delay.
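The expressions for the escape rate and for p(t) can be evaluated
numerically to see how strongly the allowed settling time matters. The
following is a minimal sketch, assuming the rate has the form
A / (2 tau_SWave) and the entry probability is 0.1 f tau_SWave as stated
above. The function names and the numerical values (A = 5, tau_SWave =
2 ns, f = 1 MHz) are hypothetical and are not taken from the studies
cited.

```python
import math

def escape_rate(gain, tau_sw_ave):
    """Poisson rate rho = A / (2 * tau_SWave) for leaving the metastable state."""
    return gain / (2.0 * tau_sw_ave)

def p_metastable(t, f, gain, tau_sw_ave):
    """p(t) = 0.1 * f * tau_SWave * exp(-rho * t)."""
    rho = escape_rate(gain, tau_sw_ave)
    return 0.1 * f * tau_sw_ave * math.exp(-rho * t)

# Hypothetical values, chosen only for illustration.
A, TAU, F = 5.0, 2e-9, 1e6
for t in (0.0, 10e-9, 20e-9, 40e-9):
    print(f"t = {t * 1e9:4.0f} ns   p(t) = {p_metastable(t, F, A, TAU):.3e}")
```

Under these assumptions, each additional interval of 2 tau_SWave / A
reduces p(t) by a factor of e, which is why a faster gate or a longer
settling time lowers the failure probability so quickly. As the text
notes, measured parts can deviate from this model by orders of magnitude.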
If we make the assumption that during a timing failure, the added 
delay modulo the clock period is uniformly distributed, then the same 
equations derived for synchronization failure will also apply to timing 
failures. This assumption is somewhat tenuous. One could reasonably 
expect the actual probability of metastable operation due to a timing 
failure to be several times higher than that predicted by the synchroni­
zation failure analysis. Since most device failures affect both timing 
and logic levels, it is very difficult to estimate the probability of 
metastable operation due to a component failure. If, however, the com­
ponent failure does not occur in the flip-flop itself, then the average 
length of metastable operation should be the same regardless of what 
caused the metastable operation in the first place. Based on the
equations, the gain-bandwidth product of the flip-flop gates should be
as large as possible in order to minimize the average time of
metastable operation. If NMOS gates with nonsaturated loads are used,
then the gain of the flip-flop gates will be limited. To increase a
gate's gain, we must increase the value of $\beta_r$. Since Z is
proportional to $\beta_r$, any increase of $\beta_r$ also increases Z.
For large Z, $\tau_{SW_{ave}}$ is proportional to Z. Therefore, if
$\beta_r$ is large, the gain-bandwidth product actually decreases by a
factor of approximately $\beta_r^{1/2}$ as $\beta_r$ is increased. Better
synchronizers can be built using NMOS with a saturated load or CMOS.
Both of these types of gates will have inherently larger gains than the
nonsaturated NMOS gates. Any attempt, however, to increase the
gain-bandwidth product of a saturated NMOS or CMOS gate by increasing
channel length is futile. As the channel length is increased, gain
increases, but switching speed decreases due to increased resistance
and capacitance. The implications of scaling for metastable operation
are examined in [51]. If the number of devices is increased by the
scaling process and clock speeds are increased as gate propagation
delays decrease, the average length of metastable operation is roughly
invariant.
3.4.2. Response of Combinational Logic
Combinational logic is only susceptible to timing errors if it is
part of a sequential machine or if it must produce its output in some
bounded period of time. Unfortunately, nearly all cases of practical
interest fall into one of these two categories. In addition,
combinational logic is susceptible to oscillation and illegal constant
logic values.
If the combinational logic is part of a synchronous sequential
machine, then a timing failure may result in incorrect behavior. In
order for a synchronous sequential machine to operate properly, it is
necessary for all outputs from the combinational logic to have reached
their steady state values before the arrival of the next clock pulse.
In the event of a timing failure, one or more of the outputs may be in
the process of changing at the same time that the clock pulse arrives.
In this case, it is not possible to predict whether the affected
combinational logic outputs will be interpreted as a logic 0 or a
logic 1. Any outputs which are delayed may be incorrectly interpreted.
Although asynchronous sequential machines do not depend on all internal
signals settling before a clock pulse arrives, they are still
susceptible to timing failures. A large class of asynchronous circuits
has essential
hazards which cannot be eliminated. In these circuits, excessive delays
in part of the circuit may result in an erroneous state transition.
Furthermore, asynchronous sequential circuits are usually designed under
the assumption that after an input changes, all signals in the circuit
settle before further input changes occur. Timing failures can lead to
the violation of this assumption.
Any node which oscillates will cause other nodes which are sensitized
to it to oscillate also (the conditions required for sensitization
are discussed in Chapter 4). If the outputs of a combinational logic
block are sensitized to an oscillating node, then they may be
interpreted as either a logic 1 or a logic 0 by following logic.
If the input of a gate is at a voltage close to its transition 
point (i.e., an illegal constant logic value), then its output voltage 
may also be close to its transition point. Similarly, other inverters 
which receive this inverter's output as their input may have their out­
put voltages close to their transition points. Consider a string of n 
identical inverters where the first inverter's input is at its transi­
tion point. The first several inverters will have output voltages which 
are close to the transition point. Any noise present in the system will 
tend to force the inverter's output voltage away from its transition 
point. Intuitively, inverters at the beginning of the string would be 
expected to have a relatively high probability of being close to the 
transition point. Inverters further down the string would be expected 
to have a much lower probability of being close to the transition point 
due to each inverter's amplification. Inverters at some distance from
the beginning of the string would be expected to oscillate between
$V_{hi}$ and $V_{lo}$.
We now develop a simplified model in order to determine the
approximate probability that a given inverter's output is greater than
$V^1$ or less than $V^0$. Figure 3.14 shows a string of inverters and
the simplified model which we use. Each inverter is modeled as an ideal
finite gain, finite bandwidth amplifier. Each amplifier has a transfer
function $H(\omega)$. At the input of each amplifier is a summing point
where noise from a noise source is added to the output from the previous
amplifier. Each noise source is assumed to be a Gaussian white noise
source and each source is assumed to be statistically independent of
every other source. For the sake of convenience, the transition point
is taken to be zero while $V^0$ is assumed to be negative and $V^1$ is
assumed to be positive.
Referring to Figure 3.14, the response at point y due to some noise
source i ($1 \le i \le n$) is
$$Y_i(\omega) = [H(\omega)]^{n+1-i}\, W_i(\omega)$$
The power spectrum density of $y_i$ [39] is
$$S_{y_i} = |H(\omega)|^{2(n+1-i)}\, S_{w_i}$$
while the variance is
$$VAR_{y_i} = \int_{-\infty}^{\infty} |H(\omega)|^{2(n+1-i)}\, S_{w_i}\, df$$
Figure 3.14. Model of Inverter String.
When several independent random variables are summed, the variance of
the sum is the sum of the variances [39]. Therefore, the variance due
to all the noise sources is
$$VAR_y = \sum_{i=1}^{n} \int_{-\infty}^{\infty} |H(\omega)|^{2(n+1-i)}\, S_{w_i}\, df$$
We assume that every source has identical statistics (i.e.,
$S_{w_1} = S_{w_2} = \cdots = S_w = N_0/2$). The variance is now
$$VAR_y = \frac{N_0}{2} \sum_{i=1}^{n} \int_{-\infty}^{\infty} |H(\omega)|^{2(n+1-i)}\, df$$
If the amplifiers are assumed to have a single pole, then
$$H(\omega) = \frac{A}{1 + j\omega\gamma}$$
and
$$|H(\omega)|^2 = \frac{A^2}{1 + \omega^2\gamma^2}$$
where A is the gain of the amplifier and $\gamma$ is the reciprocal of
the amplifier's cut-off frequency. Using the fact that
$\omega = 2\pi f$, the expression for variance can be rewritten as
$$VAR_y = \frac{N_0}{2\pi} \sum_{i=1}^{n} A^{2(n+1-i)} \int_0^{\infty} \left[\frac{1}{1 + \omega^2\gamma^2}\right]^{n+1-i} d\omega$$
Using a table of integrals and simplifying, the variance is
$$VAR_y = \frac{N_0}{2\gamma\pi^{1/2}} \sum_{i=1}^{n} \frac{A^{2(n-i+1)}}{2n - 2i + 1} \left[\frac{\Gamma(n - i + 1.5)}{\Gamma(n - i + 1)}\right]$$
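The table-of-integrals step can be made explicit. The following is a
sketch of one route to the result, writing m = n + 1 - i as shorthand
(m is introduced here only for brevity) and using the standard integral
of $(1+u^2)^{-m}$ together with the recurrence
$\Gamma(z+1) = z\,\Gamma(z)$:

```latex
\begin{align*}
\int_0^\infty \left[\frac{1}{1+\omega^2\gamma^2}\right]^{m} d\omega
  &= \frac{1}{\gamma}\int_0^\infty \frac{du}{(1+u^2)^{m}}
   = \frac{\pi^{1/2}}{2\gamma}\,\frac{\Gamma(m-\tfrac12)}{\Gamma(m)},
   \qquad u = \omega\gamma, \\
\Gamma(m-\tfrac12)
  &= \frac{\Gamma(m+\tfrac12)}{m-\tfrac12}
   = \frac{2\,\Gamma(m+\tfrac12)}{2m-1}, \\
\frac{N_0}{2\pi}\,A^{2m}\,\frac{\pi^{1/2}}{2\gamma}\,
  \frac{2\,\Gamma(m+\tfrac12)}{(2m-1)\,\Gamma(m)}
  &= \frac{N_0}{2\gamma\pi^{1/2}}\,\frac{A^{2m}}{2m-1}\,
     \frac{\Gamma(m+\tfrac12)}{\Gamma(m)} .
\end{align*}
```

With m = n - i + 1, the factor 2m - 1 becomes 2n - 2i + 1 and
$\Gamma(m+\tfrac12)$ becomes $\Gamma(n-i+1.5)$, which is the summand of
the variance expression.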
In order to avoid the gamma function, we developed the following
approximation
$$\frac{\Gamma(x + 0.5)}{\Gamma(x)} \approx \left[x - 0.25 + \frac{1}{(2500x)^{1/2}}\right]^{1/2}$$
For integer values of x between 1 and 70, the error is less than 1
percent for this approximation. If we substitute this approximation into
the equation for $VAR_y$ we find
$$VAR_y \approx \frac{N_0}{2\gamma\pi^{1/2}} \sum_{i=1}^{n} \frac{A^{2(n-i+1)}}{2n - 2i + 1} \left[n - i + 0.75 + \frac{1}{(2500(n - i + 1))^{1/2}}\right]^{1/2}$$
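The stated accuracy of the gamma-ratio approximation is easy to check
numerically. The following is a minimal sketch, assuming the
approximation has the form [x - 0.25 + 1/(2500x)^(1/2)]^(1/2); the
function names are hypothetical, and the exact ratio is computed with
the standard library's log-gamma function.

```python
import math

def gamma_ratio_exact(x):
    """Gamma(x + 0.5) / Gamma(x), computed via log-gamma for stability."""
    return math.exp(math.lgamma(x + 0.5) - math.lgamma(x))

def gamma_ratio_approx(x):
    """Approximation [x - 0.25 + 1/(2500 x)^(1/2)]^(1/2)."""
    return math.sqrt(x - 0.25 + 1.0 / math.sqrt(2500.0 * x))

def max_relative_error(lo=1, hi=70):
    """Largest relative error over the integers lo..hi (inclusive)."""
    return max(
        abs(gamma_ratio_approx(x) - gamma_ratio_exact(x)) / gamma_ratio_exact(x)
        for x in range(lo, hi + 1)
    )

print(f"max relative error for x = 1..70: {max_relative_error():.4%}")
```

The largest error occurs at x = 1 and shrinks quickly as x grows, so the
approximation is tightest for the terms that dominate a long chain.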
Let $P^*$ be the probability that $y \le -\alpha$ or $y \ge \alpha$. In
other words, $P^*$ is the probability that y is at least a distance of
$\alpha$ from the transition point. Since y has a mean of zero, it is
easy to show that
$$P^*(\alpha, N_0, \gamma, A, n) = 1 - \mathrm{erf}\left(\alpha\,[2\,VAR_y]^{-1/2}\right)$$
where erf() is the error function. $P^*$ is most strongly dependent on
the inverter's gain, A. This dependence is due to the fact that the
terms of the variance of y are proportional to $A^{2(n+1-i)}$. As n
becomes large, this factor will increase rapidly as A increases.
Figure 3.15 is a graph of $P^*$ vs. n as A is varied from 2 to 100. The
value of $\gamma$ is based on a SPICE simulation of an inverter designed
using Mead-Conway [29] design rules for a 5 micron process. $\alpha$ was
arbitrarily chosen to be 1 volt while $N_0$ was roughly equal to the
thermal noise present at room temperature. As pointed out in [40], the
total noise in the circuit will probably be several times this value.
The SPICE simulation of the Mead-Conway inverter had a value of A equal
to 2.27. The graph shows that $P^*$ is very close to zero for small
values of n. When n increases beyond a certain value, $P^*$ increases
rapidly until it becomes almost equal to one.
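The behavior described here can be reproduced with a few lines of code.
The following is a minimal sketch that evaluates the gamma-function form
of VAR_y and the resulting P*. Only alpha = 1 V and A = 2.27 come from
the text; the values of N0 and gamma below are hypothetical
placeholders, and the function names are illustrative.

```python
import math

def var_y(n, gain, n0, gamma):
    """VAR_y = N0 / (2 gamma sqrt(pi)) * sum over m = 1..n of
    A^(2m) / (2m - 1) * Gamma(m + 0.5) / Gamma(m), where m = n - i + 1."""
    total = 0.0
    for m in range(1, n + 1):
        ratio = math.exp(math.lgamma(m + 0.5) - math.lgamma(m))
        total += gain ** (2 * m) / (2 * m - 1) * ratio
    return n0 / (2.0 * gamma * math.sqrt(math.pi)) * total

def p_star(n, gain, n0, gamma, alpha):
    """P* = 1 - erf(alpha / sqrt(2 VAR_y)), written with erfc so that
    very small probabilities are not lost to rounding."""
    return math.erfc(alpha / math.sqrt(2.0 * var_y(n, gain, n0, gamma)))

# Hypothetical noise density and time constant (not from the text).
N0, GAMMA, ALPHA = 1e-15, 1e-9, 1.0
for n in (1, 5, 10, 15, 20):
    print(n, p_star(n, 2.27, N0, GAMMA, ALPHA))
```

With any such parameter choice, P* stays near zero for short chains and
then climbs toward one over a few inverters, and a larger gain moves the
rise to smaller n.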
 
Figure 3.15. Probability of $|y| \ge \alpha$ vs. Number of Inverters for
Small Noise. [Plot of $P^*$ against n, the number of inverters (0 to 35),
with curves for A = 100, 50, 10, 2.27, and 2.]
The value at which $P^*$ begins its rapid rise is highly dependent on A.
The larger the value of A, the sooner $P^*$ begins its rapid increase.
The slope during this increase is also larger for larger values of A.
In Figure 3.15, we assume a very low value for $N_0$. Figure 3.16 is
another graph of $P^*$ as A is varied from 2 to 100. The values of
$\alpha$ and $\gamma$ remain the same in Figure 3.16, but the value of
$N_0$ is increased by a factor of 10 from the value used to derive
Figure 3.15. The value of $N_0$ used for Figure 3.16 is probably much
larger than the actual total noise in a circuit.
The graph of Figure 3.16 is very similar in shape to the graph of
Figure 3.15. The only appreciable difference is that the graph of
Figure 3.16 is shifted roughly one inverter to the left with respect to
the graph of Figure 3.15. In other words, the effect of increasing the
value of $N_0$ by a factor of 10 is approximately the same as the effect
of adding one more inverter to the chain. By examining the equation for
$VAR_y$, it is apparent that an increase in A, $N_0$, or n leads to an
increase in the value of $VAR_y$ and thus $P^*$. Likewise, an increase
in $\gamma$ decreases $VAR_y$ and thus $P^*$, while an increase in
$\alpha$ decreases $P^*$ directly. In order to maximize the value of
$P^*$, circuits should be designed to maximize gain and bandwidth.
Note that gain is more important than bandwidth in maximizing $P^*$.
It is important to consider the behavior of a node when its voltage
leaves the range of $\pm\alpha$. As long as the node voltage is small,
the response of each inverter is approximately linear. The input to the
first inverter is white Gaussian noise. The output of the first
inverter will be colored Gaussian noise. The frequency components that
Figure 3.16. Probability of $|y| \ge \alpha$ vs. Number of Inverters for
Large Noise. [Plot of $P^*$ against n, the number of inverters (0 to 35),
with curves for A = 100, 50, 10, 2.27, and 2.]
are removed from the output noise are those frequencies that are too 
high for the inverter to respond to. The colored Gaussian noise from 
the output of the first inverter is added to white Gaussian noise and 
then input to the second inverter. The output of the second inverter is 
once again colored Gaussian noise. This process is repeated as we pro­
gress down the chain of inverters. Finally, the colored Gaussian noise 
at the input of one of the inverters is large enough so that the linear
response assumption is no longer valid. Since the noise from the previ­
ous inverter is so large, we may neglect the white noise which is being 
injected at this node. Therefore, this inverter is being driven by a 
fairly large colored Gaussian signal. By large, we mean that the signal 
is large enough to saturate the inverter. The frequencies present in 
the Gaussian signal are low enough for the inverter to respond to. 
Therefore, the output of this inverter will be a "clipped" version of 
its input. Since the colored Gaussian noise is a zero mean process, the 
inverter outputs will oscillate.
In this analysis, it is assumed that the input to the first 
inverter in the string is exactly at zero (i.e., its transition point). 
It is more likely that there will be some small offset, $\varepsilon$,
from 0. The effect of such a DC offset is to change the mean of the
input signal from 0 to $\varepsilon$. Likewise, the output signal's mean
is changed from 0 to $\varepsilon A$. In general, the output from the
ith inverter has a mean value of $\varepsilon A^i$. Since the signal at
y no longer has a mean of zero, the probability that $y \ge \alpha$ is
no longer the same as the probability that $y \le -\alpha$.
Due to the symmetry of the problem, we may, without loss of generality,
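assume that $\varepsilon$ is positive. In this case, the response at y
is identical to our previous analysis except that $\varepsilon A^n$ is
added to the signal.
Let y' be the original signal at y (i.e., the value at y if
$\varepsilon = 0$). Therefore, $y = y' + \varepsilon A^n$. The original
value of $P^*$ was defined to be the probability that $y \le -\alpha$
plus the probability that $y \ge \alpha$. Since the value at y is the
sum of the original signal at y and $\varepsilon A^n$, $P^*$ is the
probability that $y' \le -\alpha - \varepsilon A^n$ plus the probability
that $y' \ge \alpha - \varepsilon A^n$. It is easy to show that
$$P^* = \tfrac{1}{2}\left\{1 - \mathrm{erf}\left[(\alpha - \varepsilon A^n)(2\,VAR_{y'})^{-1/2}\right] + 1 + \mathrm{erf}\left[(-\alpha - \varepsilon A^n)(2\,VAR_{y'})^{-1/2}\right]\right\}$$
As $\varepsilon A^n$ becomes large, then
$$1 - \mathrm{erf}\left[(\alpha - \varepsilon A^n)(2\,VAR_{y'})^{-1/2}\right] \gg 1 + \mathrm{erf}\left[(-\alpha - \varepsilon A^n)(2\,VAR_{y'})^{-1/2}\right]$$
Therefore, for large values of $\varepsilon A^n$, an approximation for
$P^*$ is
$$P^*(\alpha, N_0, \gamma, A, n, \varepsilon) \approx \tfrac{1}{2}\left\{1 - \mathrm{erf}\left[(\alpha - \varepsilon A^n)(2\,VAR_{y'})^{-1/2}\right]\right\}$$
As the value of $\varepsilon$ becomes larger, the probability that y is
outside the range of $-\alpha$ to $\alpha$ increases.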
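The two tail probabilities that make up P* in the offset case can be
evaluated directly. The following is a minimal sketch, treating y' as a
zero-mean Gaussian with a given variance and writing each tail with the
complementary error function erfc(z) = 1 - erf(z). The function names
and the numerical values in the usage lines are hypothetical.

```python
import math

def p_star_offset(alpha, eps, gain, n, var_y_prime):
    """Probability that y = y' + eps * A^n lies outside (-alpha, alpha),
    where y' is zero-mean Gaussian with variance var_y_prime."""
    shift = eps * gain ** n
    scale = math.sqrt(2.0 * var_y_prime)
    upper = 0.5 * math.erfc((alpha - shift) / scale)  # P(y' >= alpha - shift)
    lower = 0.5 * math.erfc((alpha + shift) / scale)  # P(y' <= -alpha - shift)
    return upper + lower

def p_star_offset_approx(alpha, eps, gain, n, var_y_prime):
    """Large-offset approximation that keeps only the upper-tail term."""
    shift = eps * gain ** n
    return 0.5 * math.erfc((alpha - shift) / math.sqrt(2.0 * var_y_prime))

# Hypothetical values: alpha = 1 V, A = 2.27, n = 5, VAR_y' = 0.04 V^2.
for eps in (0.0, 0.01, 0.02):
    exact = p_star_offset(1.0, eps, 2.27, 5, 0.04)
    approx = p_star_offset_approx(1.0, eps, 2.27, 5, 0.04)
    print(eps, exact, approx)
```

With no offset the expression reduces to the zero-mean result, and once
the shifted mean approaches alpha the lower-tail term becomes negligible,
which is the regime in which the one-term approximation applies.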
This analysis demonstrates that if an illegal constant logic value 
occurs at a node, then other nodes that are sensitized to the illegal 
value node may either oscillate or also have an illegal logic value. 
The more levels of logic between the nodes, the greater the probability 
of oscillation.
In most cases, the output of a chain of gates drives the input of a 
flip-flop. If a failure has occurred so that the input of one of the 
gates is forced to its transition point, the flip-flop may enter a
metastable state. If the output of the last gate in the chain is still
close to its transition point, then the probability that the flip-flop 
enters a metastable state is relatively high. The flip-flop only enters 
a metastable state if its input is in the vicinity of the transition 
point. If the input is oscillating with a very small amplitude (i.e., 
the input is near the transition point), then the probability of enter­
ing a metastable state is much higher than if the input is oscillating 
between the voltage limits of the circuit. The most effective way to 
minimize the probability of a metastable state is to keep the input to 
the inverter as far away from the transition point as possible. There­
fore, the higher the probability that the output from the last gate in a 
chain is a legal logic value, the lower the probability of metastable 
operation. From Figure 3.15, the probability of a legal logic value 
approaches 1 as the chain length becomes longer. This would seem to 
imply that as the chain length becomes long, the last gate's output 
spends a smaller and smaller percentage of its time in a region near the 
transition point. If this is true, then as the chain becomes long, the 
probability that the flip-flop enters metastable operation approaches 
zero. Unfortunately, in our analysis, we have neglected the fact that a 
gate has a finite slew rate (i.e., the output of a gate can only change 
at some maximum rate). As the gate output oscillates back and forth
from one voltage to the other, it takes a finite time to switch from one 
logic level to the other. Therefore, the gate output must spend some 
nonzero time in the region of the transition point.
We have already calculated the switching time for an inverter. In
the section on metastable operation, we estimated that there is a window
of width approximately $0.1\,\tau_{SW_{ave}}$ during which the flip-flop
can
enter a metastable state. When the output is oscillating between the
voltage limits of the circuit, it should typically be switching with a
switching time fairly close to $\tau_{SW_{ave}}$. Therefore, when the
probability of a legal logic value is very close to 1, the probability
that the flip-flop enters a metastable state should be approximately
0.1. On the other hand, when the probability of a legal logic value is
very low, the probability that the flip-flop enters a metastable state
is quite high.
Therefore, it is important that the probability of having a legal logic 
value is as high as possible.
In summary, when a component failure occurs in combinational logic, 
three types of illegal logic values may result: timing failures, oscil­
lation, and illegal constant logic values. Synchronization failure may 
also result in an illegal logic value. For the rest of this manuscript, 
we restrict our scope to synchronous sequential systems. These systems 
consist of blocks of combinational logic followed by some type of
clocked bistable elements. If static flip-flops are used, then it has 
been shown that when sensitized to an illegal logic value in the combi­
national logic, these flip-flops may either assume a legal (although 
possibly incorrect) logic value or an illegal logic value (by entering
and remaining in a metastable state). If the output of one or more
flip-flops assumes an illegal logic value, then these illegal logic
values are presented to the combinational logic block following the
latches. Although it is possible for illegal logic values to propagate
through many blocks of combinational logic, it is unlikely since
properly designed flip-flops have a high probability of leaving a
metastable state well within one clock period. Obviously, the longer
the system
clock period is with respect to the combinational delay, the lower the 
probability that an illegal logic value propagates through more than one 
block of combinational logic. If dynamic latches are used, the proba­
bility that an illegal logic value propagates through several combina­
tional blocks is much higher, since dynamic latches do not attempt to 
resolve an illegal logic value to a legal logic value. For this reason, 
static flip-flops are to be preferred over dynamic latches.
Many of the physical failure modes in Chapter 2 result in a gradual 
degradation of switching speed and inverter gain. Such failures include 
hot electron injection, exposure to ionizing radiation, and electromi­
gration. From analyzing the probability of metastable operation and 
illegal constant logic level propagation along a chain of inverters, it 
is clear that the degradation of gate performance has a very negative 
influence on the circuit's ability to react to undefined logic values.
The average length of metastable operation is proportional to the
gain-bandwidth product. The probability of producing a legal logic value
from a chain of inverters is a very strong function of gain and is also
influenced by the bandwidth of the inverters. As circuits degrade, tim­
ing becomes more critical since all gates in the circuit become slower, 
but not necessarily by the same amount. In addition, the lowering of 
the gain coupled with the decrease in bandwidth makes flip-flops more 
likely to enter a metastable state and more likely to stay in the meta- 
stable state for a longer time.
CHAPTER 4
Concurrent Error Detection of Physical Failures
In the last chapter, we developed an understanding of how circuits 
behave when they fail. The behavior of good circuits which must process 
the outputs of failed circuits was also discussed. We are now in a 
position to develop concurrent error detection schemes for physical 
failures.
4.1. Indeterminate Faults
The analysis presented in Chapter 3 demonstrated the diverse ways
in which a digital circuit may behave when it fails. Most of the clas­
sical faults are only capable of accurately modeling a subset of all 
failures. Physical failures which result in timing failures, oscilla­
tions, or illegal logic levels are very poorly modeled by the classical 
fault models. Synchronization failures also result in circuit behavior 
which is not well modeled by the classical models. These failures all 
result in circuit outputs which cannot be reliably interpreted as either 
a logic 0 or a logic 1 and are hence called indeterminate values.
If all but one input to an AND (or NAND) gate is a logic 1 while 
the remaining input is an indeterminate value, then it is possible for 
an indeterminate value to appear at the output of the AND gate. If,
however, at least one input to an AND gate is a logic 0, then any 
indeterminate input present at any other inputs does not propagate to
the output of the gate. Instead, the output of the gate is a logic 0.
Similarly, for an OR (or NOR) gate, if all but one input is a logic 0, 
then an indeterminate value may propagate. If one or more inputs to the 
OR gate is a logic 1, then its output is a logic 1 and the indeterminate 
value does not propagate. An indeterminate value input to an inverter 
may always be propagated to its output.
When an indeterminate input to a gate may be propagated to the gate 
output, we say that the output of the gate is sensitized to the input 
with the indeterminate value. Therefore, AND and NAND gates are sensi­
tized to an indeterminate value when all inputs other than the input (or 
inputs) with an indeterminate value have logic 1 values. OR and NOR 
gates are sensitized to an indeterminate input when all inputs other 
than the input (or inputs) with an indeterminate value have logic 0 
values. Inverters are always sensitized to an indeterminate value.
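The sensitization rules just stated can be written as a small predicate. This is a minimal sketch rather than part of the original analysis: the function name `is_sensitized` and the table `CONTROLLING` are hypothetical, and the remaining inputs are assumed to carry legal logic values.

```python
# Controlling value per gate type: a single input at this value fixes the
# gate output regardless of the other inputs (hypothetical helper table).
CONTROLLING = {"AND": 0, "NAND": 0, "OR": 1, "NOR": 1}

def is_sensitized(gate, other_inputs):
    """Return True if `gate` is sensitized to its indeterminate input(s).

    gate         -- one of "AND", "NAND", "OR", "NOR", "INV"
    other_inputs -- legal logic values (0 or 1) on all remaining inputs
    """
    if gate == "INV":
        return True  # inverters are always sensitized
    # Sensitized only if no remaining input holds the controlling value.
    return all(v != CONTROLLING[gate] for v in other_inputs)

print(is_sensitized("AND", [1, 1]))  # True: all other inputs are logic 1
print(is_sensitized("AND", [1, 0]))  # False: a logic 0 fixes the output
print(is_sensitized("NOR", [0, 0]))  # True: all other inputs are logic 0
```

A True result only says that the indeterminate value may reach the gate output, not that it must.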
It is important to realize that simply because a gate is sensitized 
to an indeterminate value which is occurring at one of its inputs does 
not necessarily insure that the gate output is an indeterminate value. 
In this case, the output may be either a legal logic 0, an indeterminate 
value, or a legal logic 1. If the indeterminate input happens to be due 
to oscillation, then the output of a sensitized gate is usually an 
indeterminate value. If the indeterminate input is either an illegal 
constant logic value or a timing failure, then the value assumed by the 
sensitized output depends on such factors as the gate's delay, the gate 
input's transition point, and the noise which is present in the circuit 
at that instant. For this reason, the response of circuits with
indeterminate value inputs is, in general, nondeterministic. Effects of
indeterminate errors may not be repeatable and a signal which is 
fanned-out may be interpreted differently at distinct destinations.
Indeterminate faults are a very general type of fault. Most of 
this chapter is based on the following hypothesis:
Hypothesis: An indeterminate value at a node is the most general
type of single node failure.
This hypothesis is due to the fact that when an indeterminate value 
occurs, it may be subsequently interpreted as either a logic 0, an 
indeterminate value, or a logic 1. Therefore, indeterminate failures 
are able to represent not only the nondeterministic behavior but also 
the deterministic behavior of many classical faults. In this sense, 
stuck-at faults, stuck-open faults, and any other fault which forces a 
node to a legal logic value are only special cases of indeterminate 
failures.
4.1.1. Ternary Algebra
In order to analyze digital systems which operate on indeterminate 
values, it is helpful to have an appropriate algebra. Since such an 
algebra must deal with an alphabet of three distinct values, Boolean
algebra is clearly inadequate. A ternary algebra [53], however, has the
three required levels. The three values may be represented as {0, u, 1}
with the property that 0 < u < 1. When a signal undergoes a transition
from a voltage which is less than $V_0$ to a voltage greater than $V_1$, then
the ternary algebra models this transition as the sequence 0 -> u -> 1  
[54]. Likewise, a negative transition is modeled by the sequence 1 -> u
-> 0. In order for the algebra to be useful, there must be a mapping
between Boolean functions and ternary functions. The ternary functions 
MIN, MAX, and INV may be defined as
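$$Y = \mathrm{MIN}[x_1, \ldots, x_n] \le x_i \ \text{for}\ (1 \le i \le n) \ \text{and}\ Y \in \{x_1, \ldots, x_n\}$$
$$Y = \mathrm{MAX}[x_1, \ldots, x_n] \ge x_i \ \text{for}\ (1 \le i \le n) \ \text{and}\ Y \in \{x_1, \ldots, x_n\}$$
$$\mathrm{INV}[x_i] = 1 - x_i$$
where $x_1, \ldots, x_n$ represent the n ternary inputs to the functions and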
1 - u is defined to be u. Figure 4.1 gives the truth tables for these 
ternary functions for two inputs. An examination of these functions 
when the inputs are all either 0 or 1 shows that MIN is the ternary 
equivalent of AND, MAX is the ternary equivalent of OR, and INV is the 
ternary equivalent of NOT [53,54].
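The definitions of MIN, MAX, and INV translate directly into code. The sketch below is illustrative only: it assumes u is encoded as the number 0.5 (any value strictly between 0 and 1 would preserve 0 < u < 1), and the helper `show` is a hypothetical name used only for printing.

```python
from itertools import product

U = 0.5                      # assumed numeric stand-in for u, so that 0 < u < 1
VALUES = (0, U, 1)

def MIN(*xs):
    return min(xs)           # ternary equivalent of AND

def MAX(*xs):
    return max(xs)           # ternary equivalent of OR

def INV(x):
    return 1 - x             # ternary equivalent of NOT; 1 - u evaluates to u

def show(v):
    return "u" if v == U else str(v)

# Print the two-input truth tables row by row, then the INV table.
for name, op in (("MIN", MIN), ("MAX", MAX)):
    for a in VALUES:
        print(name, show(a), [show(op(a, b)) for b in VALUES])
print("INV", [(show(a), show(INV(a))) for a in VALUES])

# On Boolean inputs the ternary functions reduce to AND, OR, and NOT.
for a, b in product((0, 1), repeat=2):
    assert MIN(a, b) == (a & b) and MAX(a, b) == (a | b)
assert INV(0) == 1 and INV(1) == 0
```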
If MIN, MAX, and INV are substituted for AND, OR, and NOT, then 
many of the laws of Boolean algebra are also valid for ternary algebra. 
These laws include idempotency, commutativity, absorption, associativity,
distributivity, involution, and De Morgan's law [53]. The complementation
law, however, does not extend to ternary algebra. That is,

$$\mathrm{MIN}[x_1, \mathrm{INV}[x_1]] \neq 0$$
and
$$\mathrm{MAX}[x_1, \mathrm{INV}[x_1]] \neq 1$$
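Because the ternary alphabet has only three values, the laws listed above can be checked by exhaustive enumeration. The following sketch assumes u is encoded as 0.5; the helper `holds` and the dictionary `LAWS` are hypothetical names introduced for illustration.

```python
from itertools import product

U = 0.5                       # assumed numeric stand-in for u
T = (0, U, 1)

def INV(x):
    return 1 - x

def holds(law, arity):
    """Check `law` for every assignment of ternary values to its arguments."""
    return all(law(*args) for args in product(T, repeat=arity))

LAWS = {
    "idempotency": holds(lambda a: min(a, a) == a and max(a, a) == a, 1),
    "commutativity": holds(
        lambda a, b: min(a, b) == min(b, a) and max(a, b) == max(b, a), 2),
    "absorption": holds(
        lambda a, b: max(a, min(a, b)) == a and min(a, max(a, b)) == a, 2),
    "associativity": holds(
        lambda a, b, c: min(a, min(b, c)) == min(min(a, b), c)
        and max(a, max(b, c)) == max(max(a, b), c), 3),
    "distributivity": holds(
        lambda a, b, c: min(a, max(b, c)) == max(min(a, b), min(a, c))
        and max(a, min(b, c)) == min(max(a, b), max(a, c)), 3),
    "involution": holds(lambda a: INV(INV(a)) == a, 1),
    "De Morgan": holds(
        lambda a, b: INV(min(a, b)) == max(INV(a), INV(b))
        and INV(max(a, b)) == min(INV(a), INV(b)), 2),
    # Fails at a = u, where MIN[u, INV[u]] = MAX[u, INV[u]] = u.
    "complementation": holds(
        lambda a: min(a, INV(a)) == 0 and max(a, INV(a)) == 1, 1),
}

for name, ok in LAWS.items():
    print(f"{name}: {ok}")
```

Only the complementation entry is reported as False, in agreement with the two inequalities above.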
MIN | 0  u  1        MAX | 0  u  1        INV |
----+---------       ----+---------       ----+---
 0  | 0  0  0         0  | 0  u  1         0  | 1
 u  | 0  u  u         u  | u  u  1         u  | u
 1  | 0  u  1         1  | 1  1  1         1  | 0

Figure 4.1 Ternary Algebra Truth Tables

The ternary value u will be used to represent two cases. The value
u indicates either that a signal has an indeterminate value or that the
value is a usually unknown* (but legal) Boolean value. Therefore, a u 
may represent a logic 0, an indeterminate value, or a logic 1. This is
useful since if a gate is sensitized to an input which is an indeter­
minate value, its output may be either an indeterminate value or a legal 
Boolean value. It is therefore possible to use the ternary algebra to 
determine whether or not a gate is sensitized to a particular input with
a given input vector. The gate is replaced with its ternary equivalent
(i.e., a MIN gate replaces an AND gate, a MAX gate replaces an OR gate, 
and an INV gate replaces a NOT gate). The value of the particular input 
is set to u while all other inputs are set to the values given in the 
input vector. If the gate output is u, then the gate is sensitized to
the input. If the gate output is a 0 or 1, then the gate is not sensi­
tized.

* That is, the value is usually unknown a priori. We discuss in Section
4.1.2 why a static hazard may make the Boolean value predictable.

The concept of sensitization may be defined for any combinational
function. A function is sensitized to a particular node (or nodes) of
the circuit if under a given input vector and an indeterminate value at
the particular node (or nodes), an indeterminate value may occur at the
function output. In order to determine if the combinational function is
sensitized to a particular set of nodes under a given input vector, the
gates in the function must be first transformed into their ternary
equivalents. The input vector is then applied to the ternary function.
Finally, the particular set of nodes is set to the value u. If the
output of the function is u, the function is sensitized to the set of
nodes under the given input vector. It should be noted that sensitization
is always defined with respect to some input vector.
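The procedure described above (transform the gates, apply the input vector, set the chosen nodes to u, inspect the output) can be sketched as a small netlist evaluator. Everything named here is an assumption made for illustration: u is encoded as 0.5, the netlist format and the functions `evaluate` and `is_sensitized` are hypothetical, and the example circuit y = (a AND b) OR c does not come from the text.

```python
U = 0.5  # assumed numeric stand-in for the ternary value u

OPS = {
    "MIN": lambda *xs: min(xs),   # replaces AND
    "MAX": lambda *xs: max(xs),   # replaces OR
    "INV": lambda x: 1 - x,       # replaces NOT
}

def evaluate(netlist, inputs, forced=()):
    """Evaluate a ternary netlist given in topological order.

    netlist -- list of (node, op, fanin_names)
    inputs  -- dict of primary input name -> 0 or 1 (the input vector)
    forced  -- names of nodes (inputs or gate outputs) held at u
    """
    values = {n: (U if n in forced else v) for n, v in inputs.items()}
    for node, op, fanin in netlist:
        out = OPS[op](*(values[n] for n in fanin))
        values[node] = U if node in forced else out
    return values

def is_sensitized(netlist, inputs, nodes, output):
    """True if `output` evaluates to u when `nodes` are set to u."""
    return evaluate(netlist, inputs, forced=nodes)[output] == U

# Hypothetical example: y = (a AND b) OR c, with internal node n1 = a AND b.
NET = [("n1", "MIN", ("a", "b")), ("y", "MAX", ("n1", "c"))]

print(is_sensitized(NET, {"a": 1, "b": 1, "c": 0}, {"n1"}, "y"))  # True
print(is_sensitized(NET, {"a": 1, "b": 1, "c": 1}, {"n1"}, "y"))  # False
print(is_sensitized(NET, {"a": 0, "b": 1, "c": 0}, {"b"}, "y"))   # False
```

The same call covers the single-gate case: asking about output `n1` with input `b` forced to u checks whether the AND gate alone is sensitized to `b`.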
The concept of sensitization developed here for indeterminate logic 
values is analogous to the concept of path sensitization for stuck-at 
faults [55]. A node is said to be path sensitized to an output with 
respect to an input vector if a change in the Boolean logic value at the 
node results in a change in the Boolean logic value at the output. The 
method of Boolean differences [56] can be used to determine whether or 
not an output is path sensitized with respect to a particular node with 
a given input vector. Let y be the output of f, some Boolean function,
and $X = (x_1, \ldots, x_n)$ be the input vector. Let q be some node in the
circuit which implements f. Then q must be some function of X which we
shall call g. Therefore, q = g(X) and y = f(X,q), and y is path
sensitized to q if and only if

$$\left.\frac{dy}{dq}\right|_{X} = \left. f(X,0) \oplus f(X,1) \right|_{X} = 1$$
Otherwise, y is not path sensitized to q. The Boolean difference being 
1 implies that, for the given assignment of values to X, any change of 
the Boolean logic value at q results in a change of the Boolean logic 
value at the function's output.
The Boolean difference may be computed for ternary functions as 
well as for Boolean functions. If the inputs to a ternary function are 
Boolean values, the ternary function and a Boolean equivalent of the 
ternary function produce the same result. Therefore, the Boolean 
difference of a ternary function may be computed by finding the Boolean 
difference of the ternary function's Boolean equivalent.
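The Boolean difference test above amounts to evaluating f twice, once with the node forced to 0 and once forced to 1, and taking the exclusive OR. The sketch below uses a hypothetical circuit (node q = a AND b, output y = q OR c) chosen only for illustration; the names `g`, `f`, and `boolean_difference` are assumptions.

```python
from itertools import product

def g(X):
    """Hypothetical internal node: q = a AND b."""
    a, b, c = X
    return a & b

def f(X, q):
    """Hypothetical output as a function of X and node q: y = q OR c."""
    a, b, c = X
    return q | c

def boolean_difference(f, X):
    """Return f(X, 0) XOR f(X, 1); 1 means y is path sensitized to q under X."""
    return f(X, 0) ^ f(X, 1)

for X in product((0, 1), repeat=3):
    print(X, "q =", g(X), "difference =", boolean_difference(f, X))
```

In this example the difference is 1 exactly for the input vectors with c = 0, since c = 1 forces the OR gate output regardless of q.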
Theorem 1: If a node is path sensitized to a second node with a
given input vector, then it must also be sensitized to an indeterminate
failure at the second node with the same input vector.
Proof: Consider the ternary model of the circuit. For Boolean
inputs, the behavior of the ternary model must be identical to
the behavior of the Boolean equivalent. Let node a be path sensitized
to node b for a given input vector. Since node a is
path sensitized to node b, when the value of node b is set to 0,
then node a will assume the value d, and when the value of node b is
set to 1, then node a will assume the value $\bar{d}$, where d is 0 or
1. If the value of node b is set to u, then the circuit may interpret
the value of node b as either 0 or 1. Therefore, the
value of node a may be either 0 or 1 when node b is set to u and 
node a is path sensitized to node b. Consequently, when a node 
is path sensitized to a second node, it must also be sensitized 
to an indeterminate failure at the second node.
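Theorem 1 can be spot-checked by enumeration on a small example. The sketch below is only an illustration on one hypothetical circuit, not a proof: u is assumed to be encoded as 0.5, the nodes examined are restricted to the primary inputs for brevity, and the names `circuit`, `with_value`, and `theorem_1_holds` are introduced here.

```python
from itertools import product

U = 0.5  # assumed numeric stand-in for u

def circuit(a, b, c):
    """Hypothetical ternary model: y = MAX[MIN[a, INV[b]], c]."""
    return max(min(a, 1 - b), c)

def with_value(X, i, v):
    """Copy of input vector X with position i replaced by v."""
    return X[:i] + (v,) + X[i + 1:]

def theorem_1_holds(func, n_inputs):
    """Check: path sensitized to input i implies sensitized to u at input i."""
    for X in product((0, 1), repeat=n_inputs):
        for i in range(n_inputs):
            path_sensitized = (
                func(*with_value(X, i, 0)) != func(*with_value(X, i, 1)))
            indeterminate = func(*with_value(X, i, U)) == U
            if path_sensitized and not indeterminate:
                return False
    return True

print(theorem_1_holds(circuit, 3))  # True
```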
4.1.2. The Effects of Hazards on Sensitization
In the last section, it was shown that when the Boolean difference 
of a function is equal to 1, with respect to some node, the output is
sensitized to an indeterminate value at that node. The converse of this 
statement is not, however, true. That is, if the Boolean difference of 
a function with respect to some node and input vector is equal to zero, 
then the function may still be sensitized to an indeterminate value on 
the node. As an example, consider Figure 4.2.

[Figure 4.2. Example of a Static Hazard. Karnaugh map indexed by A and BC; circuit with inputs A, B, and C.]

If inputs B and C to the
function are both a logic 1, then the output should be a logic 1,
regardless of the value at the A input. Therefore,

$$\left.\frac{df}{dA}\right|_{B=1,\,C=1} = 0$$

This Boolean difference implies that the output of function f is not
path sensitized to input A when B = C = 1. If we consider the
equivalent ternary function, however, we see that if input A assumes a 
value of u, while inputs B and C both have values of 1, then the output 
is u. Therefore, the output is indeed sensitized to A. This 
discrepancy between path sensitization for a Boolean value and sensitization
for an indeterminate value is a direct consequence of the fact
that the complementation law is not valid for the ternary algebra.
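The discrepancy can be reproduced numerically. The sketch below assumes, purely for illustration, that the function of Figure 4.2 has the two-level form f = A·B + (NOT A)·C; this form is an assumption chosen because it exhibits a static hazard in A at B = C = 1, and u is encoded as 0.5.

```python
U = 0.5  # assumed numeric stand-in for u

def f(a, b, c):
    """Assumed two-level form f = A.B + (not A).C, as a ternary function."""
    return max(min(a, b), min(1 - a, c))

# Boolean difference with respect to A at B = C = 1.
difference = f(0, 1, 1) ^ f(1, 1, 1)
# Ternary evaluation with A indeterminate at B = C = 1.
ternary_output = f(U, 1, 1)

print(difference)             # 0: not path sensitized to A
print(ternary_output == U)    # True: sensitized to an indeterminate A
```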
By examining the Karnaugh map of the function, it is evident that a 
static hazard [56] exists for a transition of input A while B = C = 1. 
Any time a static hazard exists with respect to an input transition, 
then the Boolean difference with respect to the input must be zero.
A static hazard occurs when there is reconvergent fan-out along two 
or more paths, and at least one of the paths has a different inversion 
parity than the other paths. If x is the input whose transition causes 
a static hazard, then the reconvergent fan-out either results in $x + \bar{x}$
or $x \cdot \bar{x}$ depending on whether the reconvergence occurs at an OR gate or
an AND gate respectively. The complementation law of Boolean algebra
guarantees that $x + \bar{x} = 1$ and $x \cdot \bar{x} = 0$, which in turn implies that the
Boolean difference with respect to x must be zero. Therefore, the output
is not path sensitized to x.
If we use the ternary model, then $\mathrm{MAX}[x, \mathrm{INV}[x]] \neq 1$ and
$\mathrm{MIN}[x, \mathrm{INV}[x]] \neq 0$. In particular, if x = u, then $\mathrm{MAX}[x, \mathrm{INV}[x]] =
\mathrm{MIN}[x, \mathrm{INV}[x]] = u$. Consequently, the output of the reconvergence gate
is sensitized to an indeterminate value at x. If the output of the func­
tion is sensitized to one or more of the outputs of the reconvergence 
gate,* then the function is sensitized to an indeterminate value at 
input x. By modifying a function, it is always possible to remove a 
static hazard. In the example of Figure 4.2, the static hazard may be 
removed by adding the product term B • C to the sum-of-products imple­
mentation. This term is redundant but serves to remove the static 
hazard by desensitizing the reconvergent fan-out from the output of the 
function. When B = C = 1, the term B • C is 1. The reconvergence gate
is an OR gate. An input of 1 from this product term thus forces the OR
gate output to a 1 value. This new implementation of the function no 
longer has its output sensitized to an indeterminate value at input A
when B = C = 1.
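The effect of the redundant term can be checked with the ternary model. As an assumption for illustration, the function is taken to be f = A·B + (NOT A)·C, with B·C as the added product term; u is encoded as 0.5 and the function names are hypothetical.

```python
from itertools import product

U = 0.5  # assumed numeric stand-in for u

def f_hazard(a, b, c):
    """Assumed original form: A.B + (not A).C."""
    return max(min(a, b), min(1 - a, c))

def f_hazard_free(a, b, c):
    """Same function with the redundant product term B.C added."""
    return max(min(a, b), min(1 - a, c), min(b, c))

# The redundant term does not change the Boolean function.
same_function = all(
    f_hazard(*X) == f_hazard_free(*X) for X in product((0, 1), repeat=3))

print(same_function)                # True
print(f_hazard(U, 1, 1) == U)       # True: sensitized to A when B = C = 1
print(f_hazard_free(U, 1, 1))       # 1: the B.C term forces the OR gate
```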
From this analysis, it is clear that a static hazard implies sen­
sitization. This fact is not surprising since ternary algebra has long 
been used to detect the presence of hazards in digital circuits [54, 57].

* It is possible for the reconvergence gate to be the same gate as the
output gate.

Eichelberger [54] has extended the concept of static hazards to
hazards that occur during multiple input transitions. He calls a hazard
due to a transition at p of the inputs a p-variable logic hazard. Let

$$X_1 = (x_1, \ldots, x_p, x_{p+1}, \ldots, x_n)$$
and
$$X_2 = (\bar{x}_1, \ldots, \bar{x}_p, x_{p+1}, \ldots, x_n)$$

where $X_1$ and $X_2$ represent input vectors of some combinational Boolean
function f. Function f is said to have a p-variable logic hazard for a
transition from input vector $X_1$ to input vector $X_2$ (p variables change
in this transition) if and only if
(1) $f(X_1) = f(X_2)$,
(2) all of the $2^p$ values specified for f in the sub-cube
$(x_{p+1}, \ldots, x_n)$ are the same, and
(3) during the input change from $X_1$ to $X_2$, a spurious pulse may
be present at the output.
For the special case of p = 1, the p-variable logic hazard is identical
to a static hazard. Eichelberger shows that by modifying the implementation
of a function, it is possible to remove all p-variable logic
hazards. In many cases, this will require the addition of redundant
logic just as it did for static hazards.
From the above definition of a p-variable logic hazard, it is
apparent that whenever a p-variable logic hazard exists, the output of
the function is sensitized to indeterminate values at the p variables.
This sensitization occurs despite the fact that all $2^p$ values of f in
the sub-cube $(x_{p+1}, \ldots, x_n)$ are the same. Clearly, if these values are
not the same, the function must be sensitized.
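The relationship between the sub-cube condition and sensitization can be explored with a short enumeration. This is a hedged sketch: u is encoded as 0.5, the example implementations (f = A·B + (NOT A)·C, with and without the redundant term B·C) are assumptions, and the ternary result is used here only as the sensitization test described in the text, not as a full hazard classifier.

```python
from itertools import product

U = 0.5  # assumed numeric stand-in for u

def f(a, b, c):
    """Assumed implementation: A.B + (not A).C."""
    return max(min(a, b), min(1 - a, c))

def f_redundant(a, b, c):
    """Same function with the redundant term B.C added."""
    return max(f(a, b, c), min(b, c))

def analyze(func, X1, changing):
    """Return (subcube_constant, sensitized) for the inputs in `changing`.

    subcube_constant -- all 2^p Boolean values of func over the changing
                        inputs are the same (conditions (1) and (2))
    sensitized       -- func evaluates to u with the changing inputs at u
    """
    subcube = []
    for bits in product((0, 1), repeat=len(changing)):
        X = list(X1)
        for i, v in zip(changing, bits):
            X[i] = v
        subcube.append(func(*X))
    Xu = list(X1)
    for i in changing:
        Xu[i] = U
    return len(set(subcube)) == 1, func(*Xu) == U

print(analyze(f, (0, 1, 1), [0]))            # (True, True)
print(analyze(f, (0, 0, 1), [0, 1]))         # (False, True)
print(analyze(f_redundant, (0, 1, 1), [0]))  # (True, False)
```

The first case is the static hazard (p = 1), the second shows sensitization when the sub-cube values differ, and the third shows the redundant term removing the sensitization.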
4.2. Concurrent Error Detection
The goal of concurrent error detection is to detect errors during 
the normal operation of the system. Ideally, the concurrent error 
detection scheme should guarantee the detection of all possible errors. 
A class of circuits has been defined [5] that under a number of assump­
tions achieves the goal of total error detection. These circuits are 
designed so that under the proper assumptions, any error results in a 
non-codeword output from the circuit and the first incorrect output is a 
non-codeword output. Such circuits are called totally self-checking 
circuits.
4.2.1. Totally Self-Checking Circuits
Let the input code space be all input vectors that can be applied 
to a circuit under normal (i.e., fault-free) operation. All output vec­
tors not in the code space are non-codewords. The following definitions 
are paraphrased from [5]:
Self-Testing: A circuit is said to be self-testing if for every 
fault in the fault model, there is at least one sequence of 
codeword inputs which produces a non-codeword output.
Fault-Secure: A circuit is said to be fault-secure if for every 
fault in the fault model, the circuit either produces the 
correct output or a non-codeword output for the entire input 
code space.
Totally Self-Checking: A circuit is said to be totally self-checking
if it is self-testing and fault-secure.
Code Disjoint: A circuit is said to be code disjoint if all
non-codeword inputs produce non-codeword outputs.
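For a small combinational circuit and an explicit fault list, the definitions above can be checked by enumeration. The sketch below is an assumption-laden illustration: the circuit is a hypothetical two-rail inverter, the fault model is restricted to single stuck-at faults on its two output lines, and self-testing is checked with single codeword inputs rather than input sequences.

```python
from itertools import product

CODE = {(0, 1), (1, 0)}   # two-rail code, used for both inputs and outputs

def circuit(x, fault=None):
    """Hypothetical two-rail inverter with an optional single stuck-at fault.

    fault -- None, or (line, value): output `line` is stuck at `value`
    """
    y = [1 - x[0], 1 - x[1]]
    if fault is not None:
        line, value = fault
        y[line] = value
    return tuple(y)

FAULTS = [(line, value) for line in (0, 1) for value in (0, 1)]

def is_self_testing(circuit, faults, in_code, out_code):
    """Every fault yields a non-codeword output for some codeword input."""
    return all(any(circuit(x, f) not in out_code for x in in_code)
               for f in faults)

def is_fault_secure(circuit, faults, in_code, out_code):
    """Every faulty output is either correct or a non-codeword."""
    return all(circuit(x, f) == circuit(x) or circuit(x, f) not in out_code
               for f in faults for x in in_code)

def is_code_disjoint(circuit, all_inputs, in_code, out_code):
    """Every non-codeword input produces a non-codeword output."""
    return all(circuit(x) not in out_code
               for x in all_inputs if x not in in_code)

ALL_INPUTS = list(product((0, 1), repeat=2))
self_testing = is_self_testing(circuit, FAULTS, CODE, CODE)
fault_secure = is_fault_secure(circuit, FAULTS, CODE, CODE)
code_disjoint = is_code_disjoint(circuit, ALL_INPUTS, CODE, CODE)

print("self-testing:", self_testing)
print("fault-secure:", fault_secure)
print("totally self-checking:", self_testing and fault_secure)
print("code disjoint:", code_disjoint)
```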
The following assumptions are made about the operation of a totally 
self-checking circuit:
(1) Only failures which are modeled by the fault model occur.
(2) Failures occur one at a time with some minimum time interval,
$\tau$, between each failure.
(3) The inputs to the circuit are applied often enough to insure
that during any time period of length $\tau$, enough inputs are applied
to the circuit to test the circuit completely. This assumption
is referred to as the testability assumption.
In practice, the period of time between failures is a random variable
and is often modeled as a Poisson process. It is possible for two
failures to occur in a period of time much less than $\tau$, although this is
unlikely. If the circuit is completely tested in any time period $\tau$,
then $\tau$ can be made sufficiently small to insure that the probability
that a second failure occurs before the first failure is detected is
low. Note that in the testability assumption, completely testing the
circuit refers only to testing for those faults which are testable.
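If failures are modeled as a Poisson process with a constant rate, the probability that a second failure arrives within a test interval of length $\tau$ after the first is $1 - e^{-\lambda\tau}$. The sketch below evaluates this expression; the rate and interval values are hypothetical and the function name `prob_failure_within` is an assumption introduced for illustration.

```python
import math

def prob_failure_within(rate, tau):
    """P(at least one further failure within tau) for a Poisson process.

    rate -- assumed constant failure rate (failures per unit time)
    tau  -- length of the interval in which the circuit is fully tested
    """
    return -math.expm1(-rate * tau)   # equals 1 - exp(-rate * tau)

# Hypothetical numbers: one failure per million hours on average.
rate = 1e-6
for tau in (1.0, 100.0, 10000.0):
    print(tau, prob_failure_within(rate, tau))
```

For small values of the product of rate and interval the probability is approximately that product, which is why shortening $\tau$ lowers the chance of an undetected double failure.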
Under these assumptions, a totally self-checking circuit is able to 
detect any failure. This fact is guaranteed by the self-testing property
and the three assumptions. The self-testing property is necessary
to prevent the buildup of undetectable latent faults. The fault-secure
property assures that for any fault from the fault model, all incorrect
outputs are non-codewords. We are therefore assured of meeting the goal 
that the first incorrect output is a non-codeword. This property is 
referred to as the totally self-checking goal [58]. The totally self­
checking property is more restrictive than it needs to be since there
are circuits which are not totally self-checking but which still satisfy 
the totally self-checking goal. Consider a sequence of faults from the 
fault model. As each subsequent fault in the sequence occurs, the
behavior of the circuit is modified to reflect the effects of all the 
faults in the sequence. A sequence of faults is said to be detectable 
if at least one codeword input produces an incorrect result. The following
definition is paraphrased from [58]:
Strongly Fault-Secure: A circuit is said to be strongly fault-secure
if for all possible sequences of faults, as each fault in
the sequence occurs, the first fault in the sequence which
causes the sequence to be detectable only produces correct outputs
or non-codeword outputs.
Strongly fault-secure networks are the largest class of networks which 
achieve the totally self-checking goal [58]. Totally self-checking networks
are a subset of strongly fault-secure networks. The fault-secure
property is a necessary (but not sufficient) condition for a circuit to
be strongly fault-secure. If a circuit is strongly fault-secure, then
each failure that occurs must either be detectable or transform the cir­
cuit into a new circuit which is also fault-secure until a detectable 
failure occurs. Totally self-checking and strongly fault-secure cir­
cuits are generically referred to as totally self-checking. A distinc­
tion between the two will only be made when it is relevant to the dis­
cussion.
Typically, the totally self-checking properties are only considered
for combinational circuits. This restriction is made to insure that the
testability assumption is met. If the circuit is combinational, then by
applying the input code space to the circuit, all stuck-at faults which
are detectable will be detected. If the circuit is sequential, then a
specific sequence of input codewords must be applied in order to assure 
the detection of all detectable faults. Therefore, for a combinational 
circuit, it is only necessary during any time period, $\tau$, to apply at
most the entire input code space to the circuit to satisfy the testabil­
ity assumption. Since the inputs to a circuit are in general unknown, 
it is very difficult to insure that the testability assumption is satis­
fied during normal operation. The most obvious solution is to use 
periodic off-line testing to test the circuit completely. If off-line 
testing is used, then sequential circuits can be tested by performing 
the tests in the proper sequence to test all faults.* We have worded our
definitions of the self-testing property so that it may apply to sequential
as well as combinational logic. If the self-testing property is
only specified for combinational logic, then any fault which causes
sequential behavior automatically prevents the circuit from satisfying
the self-testing property. For this reason, the self-testing property
is defined for both sequential and combinational logic even though the 
circuits we consider are combinational. If the combinational circuit 
has some of its outputs fed back as inputs, it may be analyzed as a com­
binational circuit for the purpose of determining whether it satisfies 
the self-checking property.
Figure 4.3 shows a typical totally self-checking module. The 
module is made up of a totally self-checking circuit which performs the
*
Generating tests for sequential circuits is significantly more dif­
ficult than for combinational circuits.
 
Figure 4.3. Totally Self-Checking Module.
[Block labels in the original figure: input vector; totally self-checking data processing circuits; output vector; totally self-checking checker; error indication.]
desired data processing. The inputs and outputs must be encoded in 
appropriate codes. The outputs from the circuit are examined by a 
totally self-checking checker. A totally self-checking checker must 
itself be both totally self-checking and code disjoint. The code dis­
joint property is required since the whole purpose of the checker is to 
indicate when it receives a non-codeword input from the data processing 
circuit. The checker does this by producing a non-codeword on the error 
indication lines. The code disjoint property assures that if a 
non-codeword is produced by the processing circuit, then the checker 
indicates the fact by producing a non-codeword. The checker must be totally 
self-checking to insure that any failure in the checker is detected 
before it can cause the checker to miss detecting a non-codeword output 
from the processing circuit. The checker is therefore able to detect 
faults in the data processing circuit as well as in itself. The error 
indication lines are usually encoded using a 1-out-of-2 code. A minimum 
of two lines are required for the error indication. This requirement 
prevents the failure of one checker output line from causing the error 
indication to appear permanently good.
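The role of the two error indication lines can be illustrated with a small sketch. It is not taken from the original text: the function names and the stuck-line helper are hypothetical, and the only fact assumed from the discussion above is that the valid indications are the two 1-out-of-2 codewords.

```python
def error_indicated(line_a, line_b):
    """True when the pair of error indication lines is NOT a 1-out-of-2 codeword."""
    # The 1-out-of-2 codewords are (0, 1) and (1, 0); (0, 0) and (1, 1) flag an error.
    return line_a == line_b


def with_stuck_line(pair, stuck_index, stuck_value):
    """Hypothetical helper: force one checker output line to a fixed value."""
    observed = list(pair)
    observed[stuck_index] = stuck_value
    return tuple(observed)


# A fault-free checker alternates between the two codewords during normal operation.
healthy_indications = [(0, 1), (1, 0)]

# With line 0 stuck at 0, the indication (1, 0) is observed as (0, 0), which is a
# non-codeword, so the stuck line cannot make the indication look good forever.
observed = [with_stuck_line(p, 0, 0) for p in healthy_indications]
flags = [error_indicated(*p) for p in observed]
```

With a single indication line, a line stuck at its "good" value would hide every later error; with two lines, the stuck line is exposed as soon as the checker tries to produce the other codeword.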
In many cases, it is advantageous to build a totally self-checking 
system by connecting together several smaller totally self-checking cir­
cuits. If all circuits have their output checked by totally self­
checking checkers, there are no additional restrictions which are neces­
sary. If it is desired to connect two circuits together without check­
ing the output from the first circuit, then the second circuit must be 
code disjoint. The code disjoint property assures that if a non­
codeword output is produced by the first circuit, then the second
 
circuit also produces a non-codeword output. The non-codeword from the 
second totally self-checking circuit is detected by the checker and thus 
the fault in the first circuit is detected.
The definitions given above are the traditional definitions for 
totally self-checking systems. These definitions are quite adequate 
when traditional fault models are used. When failures cause indeterminate 
values to occur, the circuit behavior is no longer deterministic. 
If the output of a circuit with a given fault and input vector could be 
one of several different output vectors (some of which may be codewords 
and some of which may be non-codewords), then the self-testing property 
is not satisfied. Therefore, if the traditional definitions of totally 
self-checking were retained, it would not be possible to construct 
totally self-checking systems which include indeterminate value faults 
in their fault model.
In order to allow the construction of totally self-checking circuits 
for fault models allowing indeterminate value faults, new definitions 
are required. We, therefore, propose the following:
Potential Codeword: Let A be a ternary logic vector containing i 
elements assigned the value u. It is possible to construct $2^i$ 
distinct Boolean vectors by replacing all u values with a logic 
0 value or a logic 1 value in all possible combinations. Vector 
A is said to be a potential codeword if exactly one of the $2^i$ 
Boolean vectors is a codeword. The Boolean vector which is a 
codeword is called the corresponding codeword of the potential 
codeword. (Any vector which is neither a codeword nor a potential 
codeword is said to be a non-codeword.)
Self-Testing: A circuit is said to be self-testing if for every 
fault in the fault model, there is at least one sequence of 
codeword inputs which produces either a non-codeword output or a 
potential codeword output.
 
Fault-Secure: A circuit is said to be fault-secure if for every 
fault in the fault model, the circuit either produces the 
correct output, a non-codeword output, or a potential codeword 
output whose corresponding codeword is the correct output for 
the entire input code space.
Totally Self-Checking: A circuit is said to be totally 
self-checking if it is self-testing and fault-secure.
Strongly Fault-Secure: A circuit is said to be strongly 
fault-secure if for all possible sequences of faults from the fault 
model, as each fault in the sequence occurs, the first fault in 
the sequence which causes the sequence to be detectable only 
produces the correct output, a non-codeword output, or a potential 
codeword output whose corresponding codeword is the correct 
output for the entire input code space.
The new definitions explicitly allow for the presence of indeterminate 
values in vectors. In the remainder of this thesis we use these defini­
tions rather than the traditional definitions.
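The Potential Codeword definition can be made concrete with a short sketch. This is an illustration only: the choice of a 3-bit even-parity code, the string "u" as the ternary value, and all function names are assumptions made for the example and do not come from the text.

```python
from itertools import product

U = "u"  # assumed representation of the ternary indeterminate value


def even_parity_codewords(n):
    """Assumed example code: all n-bit vectors with an even number of 1s."""
    return {bits for bits in product((0, 1), repeat=n) if sum(bits) % 2 == 0}


def completions(vector):
    """Yield the 2**i Boolean vectors obtained by replacing every u with 0 or 1."""
    positions = [k for k, v in enumerate(vector) if v == U]
    for fill in product((0, 1), repeat=len(positions)):
        candidate = list(vector)
        for pos, bit in zip(positions, fill):
            candidate[pos] = bit
        yield tuple(candidate)


def classify(vector, codewords):
    """Return (class, corresponding codeword or None) for a ternary vector."""
    matches = [c for c in completions(vector) if c in codewords]
    if U not in vector:
        return ("codeword", matches[0]) if matches else ("non-codeword", None)
    if len(matches) == 1:
        return ("potential codeword", matches[0])
    return ("non-codeword", None)


code = even_parity_codewords(3)
one_u = classify((1, U, 0), code)   # exactly one completion, (1, 1, 0), is a codeword
two_u = classify((U, U, 0), code)   # two completions are codewords, so not potential
```

For this assumed parity code, a vector with a single u is always a potential codeword, while a vector with two u values never is, since two of its four completions have even parity.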
4.2.2. Checker Strategy
Checkers designed for traditional types of faults are only designed 
to work properly with legal logic values. In general, digital circuits 
are unable to react in a reliable manner to indeterminate values (i.e., 
the circuit response to indeterminate values is nondeterministic). A 
variety of checker strategies is possible when indeterminate values may 
occur in output vectors. One strategy is to include additional circuitry 
in the checker portion of the circuit. The purpose of this additional 
circuitry (which we refer to as indeterminate detection circuitry) 
is to detect the occurrence of indeterminate logic values in the 
output vector. If an error is present in the output vector, then either 
the original checker and/or the additional indeterminate detection cir­
cuitry detects it. The original checker circuitry detects any erroneous 
bits in the output vectors which are incorrect, but legal, logic values. 
The checker circuitry may or may not detect the presence of any indeter­
minate values. The indeterminate detection circuitry is designed to 
detect the occurrence of indeterminate values in the output vector. 
Therefore, the checker together with the indeterminate detection circuitry 
is able to detect all erroneous output vectors.
There are several problems with this strategy. First of all, 
indeterminate values may be of several different forms. Circuits which 
are capable of detecting all types of indeterminate values are 
inherently quite complex. To make the problem even more difficult, the 
indeterminate detection circuitry is inherently analog. The circuitry 
must be capable of measuring voltage amplitudes and determining accu­
rately when high frequency transitions on one line occur in relationship 
to another line (most likely the clock). Fabricating such circuits 
with an integrated circuit process that is optimized for digital circuits 
further complicates this problem.
The most serious problem with the indeterminate detection circuitry 
is the need to make it part of a totally self-checking system. The con­
cept of totally self-checking is defined for digital systems where it is 
meaningful to discuss the encoding of input and output vectors. There­
fore, it is doubtful that indeterminate detection circuitry can be built 
which is totally self-checking.
At the very least, it should be possible to test the indeterminate 
detection circuitry under normal operation. Testing of analog circuitry
 
is considerably more complicated and inherently different from testing 
digital circuitry [59]. It is altogether unclear how to go about test­
ing analog circuits as complicated as the indeterminate detection circu­
itry during normal system operation of a digital circuit. Unless a 
scheme can be developed to test the indeterminate detection circuitry, 
the overall system reliability is seriously compromised since it is now 
possible for a series of failures to lead to an undetected error. 
Therefore, cost and reliability concerns make the strategy of directly 
detecting indeterminate values unattractive.
An alternative that eliminates the cost objection is possible if 
all lines which are to be checked by the checker come directly from the 
output of clocked flip-flops. Recall that the output of a flip-flop is 
either a legal (but possibly incorrect) logic value from the flip-flop 
or the flip-flop is in a metastable state. Therefore, the problem of 
detecting indeterminate values has been reduced to the problem of 
detecting metastable operation. As discussed in Section 3.4.1, circuits 
capable of detecting metastable states do exist. A circuit given by 
Stucki and Cox [47] is shown in Figure 4.4. This circuit is an 
exclusive NOR gate and is intended for implementation using MOSFETs. 
The true and complemented outputs from the flip-flop are the inputs to 
this circuit. The MOSFET exclusive NOR gate is being used as an analog 
comparator to compare the voltage difference between the flip-flop's $Q$ 
and $\bar{Q}$ outputs. When a metastable condition occurs in the flip-flop, the 
true and complemented outputs are at approximately the same voltage. An 
examination of Figure 4.4 shows that any time $|V_Q - V_{\bar{Q}}| < V_{th}$ ($V_{th}$ of 
the two enhancement mode transistors), the output is high. An
 
Figure 4.4. Metastable Detection Circuit.
[Inputs labeled in the original figure: Q and its complement.]
indeterminate detection circuit can be built using the exclusive NOR 
circuit and a circuit which samples the exclusive NOR's output at the 
appropriate time in relation to the system's clock.
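As a purely behavioral sketch of the comparator condition described above, the exclusive NOR output can be modeled as high whenever the two flip-flop outputs differ by less than a threshold voltage. The numerical values (a 5 V supply and a 1 V threshold) and the function names are assumptions chosen for illustration; they are not values given in the text, and the model ignores all analog detail of the actual MOSFET circuit.

```python
def xnor_output_high(v_q, v_q_bar, v_th):
    """Behavioral model: output is high when the two outputs are within v_th volts."""
    return abs(v_q - v_q_bar) < v_th


def sampled_indication(waveform, sample_index, v_th):
    """Sample the modeled exclusive NOR output at one assumed clock-related instant.

    waveform is a list of (v_q, v_q_bar) voltage pairs over time.
    """
    v_q, v_q_bar = waveform[sample_index]
    return xnor_output_high(v_q, v_q_bar, v_th)


# Assumed voltages: a settled flip-flop versus one hovering near mid-supply.
settled = xnor_output_high(5.0, 0.0, v_th=1.0)
metastable = xnor_output_high(2.4, 2.6, v_th=1.0)
```

The sampling helper reflects the remark that the exclusive NOR output must be examined at an appropriate time relative to the system clock; choosing that instant is outside the scope of this sketch.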
This circuit still suffers from the same reliability issues which 
were raised earlier for the analog indeterminate detection circuitry. 
There is simply no way to test the operation of the exclusive NOR cir­
cuits during normal system operation. An additional problem with this 
circuit is that it only detects indeterminate values as long as the 
flip-flop is operating properly. A failure in the flip-flop could also 
render the exclusive NOR circuit unable to detect indeterminate values 
reliably.
An alternative checker strategy is to make no attempt to detect 
indeterminate values. Instead, the assumption is made that all indeterminate 
values will eventually become legal logic values as they propagate 
through the system. Errors are detected when they finally manifest 
themselves as incorrect, but legal, logic values. It is also possible 
for all indeterminate values to become correct and legal logic 
values. In this case, no error is indicated and the system has produced 
the correct output.
The success of this strategy is dependent on the assumption that 
indeterminate values eventually become legal logic values. For systems 
which are constructed with blocks of combinational circuitry sandwiched 
between clocked flip-flops, this is a very reasonable assumption. Our 
analysis in Section 3.4 showed that under normal circumstances, the pro­
bability that the output of a flip-flop is other than a legal logic
 
value is low. If an indeterminate value from one block of combinational 
logic is presented to a clocked flip-flop, the probability of the 
indeterminate value propagating into the next block is therefore low. 
The probability that it propagates through another clocked flip-flop 
into a following combinational logic block is even lower.
If the failure occurs in the flip-flop itself, an indeterminate 
value may be produced but subsequent flip-flops should eventually 
prevent continued propagation of the indeterminate value throughout the 
system. The major shortcoming of this assumption is the possibility of 
indeterminate values being generated in the proximity of the system out­
puts. If these system outputs are connected to another system which is 
designed to be tolerant of indeterminate value inputs, then the other 
system is able to respond in some appropriate manner to the indeter­
minate values. Otherwise, a serious failure can occur. In general, the 
only solution to this problem is to make the other systems fault- 
tolerant with respect to indeterminate values since a failure in the 
lines which connect the two systems may also result in indeterminate 
values.
The strategy we use is the second one (i.e., no attempt is made to 
detect indeterminate logic values). This strategy does not require 
costly analog detection circuitry. It also does not result in an 
untestable design. The checker circuitry required for this strategy is 
entirely digital.
It should be pointed out that both strategies suffer from a testing 
problem. The definition of self-testing requires that a faulty circuit
 
must produce either a non-codeword or a potential codeword output for 
some sequence of codeword inputs. If the faulty circuit produces a 
non-codeword output, then the fault is detected. If, however, the fault 
produces a potential codeword output, then the fault may or may not be 
detected. There are three cases that must be considered. One case 
occurs if the potential codeword's u values are all interpreted as legal 
Boolean values which also happen to be correct. In this case, neither 
strategy would detect the fault. Another case occurs if at least one of 
the potential codeword's u values is an indeterminate value. All u 
values which are not indeterminate values must be legal and correct 
logic values. In this case, the strategy which relies on indeterminate 
detection circuitry detects the fault but the strategy we use may not 
detect the fault. Finally, there is the case where at least one of the 
potential codeword's u values is an incorrect Boolean value. In this 
case, both strategies detect the fault.
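The three cases can be summarized as a small decision table. The sketch below is only a restatement of the paragraph above: the labels "correct", "incorrect", and "indeterminate" for how each u value of the potential codeword actually appears, the strategy names, and the use of None for "may or may not detect" are all assumptions made for illustration.

```python
def detection_outcome(u_resolutions):
    """Map the actual form of each u value to the outcome for both strategies.

    u_resolutions: list of "correct", "incorrect", or "indeterminate".
    Returns a dict whose values are True (detected), False (not detected),
    or None (may or may not be detected).
    """
    if "incorrect" in u_resolutions:
        # At least one u value is an incorrect Boolean value.
        return {"indeterminate_detection": True, "checker_only": True}
    if "indeterminate" in u_resolutions:
        # Remaining u values are legal and correct.
        return {"indeterminate_detection": True, "checker_only": None}
    # Every u value is interpreted as a legal and correct Boolean value.
    return {"indeterminate_detection": False, "checker_only": False}


case_all_correct = detection_outcome(["correct", "correct"])
case_indeterminate = detection_outcome(["indeterminate", "correct"])
case_incorrect = detection_outcome(["incorrect", "correct"])
```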
Therefore, testing is a serious concern. Regardless of which stra­
tegy is used, there is no assurance that a given sequence detects a 
fault that produces indeterminate values. One of the assumptions that 
was made earlier was that there is some minimum time interval, $\tau$, during 
which the circuit is completely tested. In order to prevent the buildup 
of latent faults in this case, the time interval, $\tau$, may have to be 
reduced considerably.
In many ways, the testing problem for indeterminate failures is 
quite similar to the testing problem for intermittent failures. In 
fact, most indeterminate faults can alternatively be considered to be
 
intermittent. Depending on how the circuit responds to an indeterminate 
fault, an error may or may not be produced when one or more outputs are 
sensitized to the fault. Therefore, the error produced by an indeter­
minate fault may certainly be viewed as being intermittent. Techniques 
for the detection of intermittent failures are discussed in [60, 61]. 
These techniques are intended for off-line testing. Nevertheless, for 
those situations where off-line testing is used to help satisfy the tes­
tability assumption, these techniques could be used to increase the 
off-line testing effectiveness and/or reduce the off-line test length.
4.3. CED and the Simplified Indeterminate Fault Model
The properties of indeterminate faults have been established. In 
addition, we have established the conditions which we require of our 
systems in order for them to implement concurrent error detection. We 
are now ready to propose a fault model that incorporates indeterminate- 
type faults.
4.3.1. Fault Model Assumptions
The simplified indeterminate fault model assumes that any physical 
failure causes a single node in the circuit to become a ternary u value. 
This fault model excludes some (but not all) bridging-type failures. 
Our analysis of Chapter 3 shows that, in general, each line which is 
shorted to another line may have an indeterminate value. Thus, if we 
wish to model the most general case, then each line which is shorted to 
another line has a ternary u value on it. In a few cases, it may be 
possible to model two lines shorted together with only a single ternary
 
u value. Figure 4.5 shows examples of two different bridging faults. 
Fault 1 is disallowed by the simplified indeterminate fault model since 
lines X and Y must both be considered to have u values on them. Fault 2 
is allowed by the simplified indeterminate fault model since indeter­
minate values on both inputs X and Y are indistinguishable from an 
indeterminate value at the output of the gate. Therefore, fault 2 may 
be modeled as a u value on line A. This fault model does not consider 
the effect of failures on certain global signals such as ground, power, 
and clocks. Such failures may affect the entire circuit or very large 
sections of it. If protection must be provided against failure of these 
global lines, then the circuit must be designed so that a global line 
failure results in a non-codeword. This type of design usually requires 
at least a redundant copy of each such global signal.
From Theorem 1, stuck-at fault (or any other type of failure which 
causes a single line to become a legal logic value) propagation automat­
ically is considered by this fault model. It should be pointed out that 
using the ternary u value for legal logic values may result in mislead­
ing results if there are hazards in the circuit. In the presence of 
hazards, the ternary model may predict that an output or outputs are 
sensitized to a value of u at a node. If the value of u is a legal 
logic value and the Boolean difference with respect to the node is zero, 
then the path is not sensitized. Therefore, there are cases where an 
indeterminate value propagates even though a legal Boolean value does not
propagate.
 
Figure 4.5. Two Types of Bridging Faults.
[Labels in the original figure: lines X, Y, Z, A, and B; FAULT 1; FAULT 2.]
When determining whether a circuit satisfies the fault-secure pro­
perty, we are interested in which faults are sensitized to the outputs 
for a given input. If a fault is propagated to an output, the code used 
in the circuit must be able to detect this fault. If a fault always 
results in a legal logic value, then the ternary model may predict that 
the output is sensitized, even though the output is not path-sensitized 
to the fault. Legal logic values may only propagate to an output when 
the output is path sensitized to the fault. Therefore, the ternary 
model is pessimistic for legal logic values when determining whether or 
not a circuit satisfies the fault-secure property.
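The pessimism of the ternary model for legal logic values can be seen in a small reconvergent circuit. The circuit below is a hypothetical example chosen for this sketch (it is not one of the circuits in this thesis), and the gate functions follow the usual ternary rules in which a controlling value dominates u.

```python
U = "u"  # assumed representation of the ternary indeterminate value


def t_not(a):
    return U if a == U else 1 - a


def t_and(a, b):
    if a == 0 or b == 0:      # a controlling 0 dominates u
        return 0
    return 1 if (a == 1 and b == 1) else U


def t_or(a, b):
    if a == 1 or b == 1:      # a controlling 1 dominates u
        return 1
    return 0 if (a == 0 and b == 0) else U


def circuit(x, y):
    """Hypothetical circuit z = (x AND y) OR (NOT x AND y), which equals y."""
    return t_or(t_and(x, y), t_and(t_not(x), y))


ternary_prediction = circuit(U, 1)           # ternary model: output is u
legal_values = [circuit(0, 1), circuit(1, 1)]  # any legal x gives the same output
```

With y = 1 the Boolean difference of z with respect to x is zero, so a fault that forces x to either legal value never changes z, yet the ternary model reports z as u. This is the hazard-related case described above.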
On the other hand, when determining whether or not a circuit satisfies 
the self-testing property, it is desirable to propagate as many 
faults as possible to the outputs as this is the only way in which a 
fault may be detected. Consequently, the ternary model is optimistic 
for legal logic values when determining whether or not a circuit satis­
fies the self-testing property. For this reason, the indeterminate 
fault model is a poor choice if failures cause only legal logic values. 
On the other hand, in situations where both indeterminate values and 
legal logic values are caused by failures, then the indeterminate fault 
model is a good choice, since it handles both indeterminate and legal 
logic values. In situations where failures may cause both indeterminate 
values and legal logic values, the simplified indeterminate fault model 
should be used to determine whether or not a circuit satisfies the 
fault-secure property. However, the single stuck-at fault model should 
be used to determine whether or not a circuit satisfies the self-testing
property.
 
In order to study the implications of using the simplified indeter­
minate fault model, we assume that all indeterminate values become legal 
logic values by the time they reach the circuit's output. This assump­
tion is required to insure that a failure in a previous circuit does not 
result in several indeterminate values appearing on the inputs of a cir­
cuit. By placing a checker at the output of every circuit, this assump­
tion implies that any error in the first circuit is detected before the 
error reaches the second circuit.
When using the simplified indeterminate fault model, it is not 
necessary to consider all possible faults. Many possible faults do not 
need to be considered, since consideration of certain faults takes into 
account all the effects of other faults.
Theorem 2: For any switching function, an implementation exists 
in which all failures allowed by the simplified indeterminate 
fault model may be modeled as a single ternary u value on a sin­
gle input line or output line.
Proof: Figure 4.6 demonstrates a manner in which any switching 
function may be implemented. Each output is generated by its 
own independent block of logic. Clearly, any modeled failure 
which affects an input line may be modeled by a ternary u value 
on the failed input line. Any modeled failure which occurs in 
one of the blocks of logic may be modeled conservatively as a 
ternary u value on the output line from the failed logic block. 
Therefore, for this implementation, all failures allowed by the 
simplified indeterminate fault model may be modeled by a single 
ternary u value on an input or output line.
 
Figure 4.6. Possible Circuit Implementation.
[The original figure shows the input vector feeding independent logic blocks 1 through m, each producing one output (f1, f2, ...).]
As long as circuits are implemented in the form of Figure 4.6, the 
number of faults which must be considered is significantly reduced. 
Unfortunately, the implementation of Figure 4.6 is seldom the most effi­
cient implementation of a switching function. By using shared logic to 
produce two or more outputs, the total amount of logic may be signifi­
cantly reduced. Figure 4.7 shows a 4-input, 2-output circuit. Clearly, 
this implementation is not of the same form as shown in Figure 4.6. In 
Figure 4.7, the product term labeled m is used in generating both out­
puts. If it were desired to implement this function in the form shown 
in Figure 4.6, then an additional 3-input AND gate would be required. 
If W = X = Y = Z = 0, then both outputs are 0. Furthermore, neither 
output is sensitized to any of the 4 inputs. Both outputs are, however, 
sensitized to the output of the gate labeled m. Therefore, a failure 
resulting in a ternary u value at the output of gate m cannot be modeled 
as any single input or output having a ternary u value. In general, a 
fault which is sensitized to two or more outputs and at the same time is 
not sensitized to any inputs, cannot be modeled as a single input or 
output having a ternary u value.
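The argument can be checked mechanically on a small shared-logic circuit. The circuit used here is a hypothetical stand-in and is not claimed to be the circuit of Figure 4.7: the shared term m = W AND X AND Z and the outputs f1 = m OR (W AND Y) and f2 = m OR (Y AND Z) are assumptions chosen so that, with all inputs 0, no input is sensitized to either output.

```python
U = "u"  # assumed representation of the ternary indeterminate value


def t_and(*values):
    if 0 in values:           # a controlling 0 dominates u
        return 0
    return U if U in values else 1


def t_or(*values):
    if 1 in values:           # a controlling 1 dominates u
        return 1
    return U if U in values else 0


def evaluate(w, x, y, z, m_is_u=False):
    """Hypothetical two-output circuit sharing the product term m."""
    m = U if m_is_u else t_and(w, x, z)
    f1 = t_or(m, t_and(w, y))
    f2 = t_or(m, t_and(y, z))
    return (f1, f2)


def single_line_u_outputs(inputs):
    """Output vectors obtainable from one u on a single input or a single output."""
    outputs = []
    for k in range(len(inputs)):
        trial = list(inputs)
        trial[k] = U
        outputs.append(evaluate(*trial))
    good = evaluate(*inputs)
    for j in range(len(good)):
        forced = list(good)
        forced[j] = U
        outputs.append(tuple(forced))
    return outputs


all_zero = (0, 0, 0, 0)
faulty = evaluate(*all_zero, m_is_u=True)    # u at the shared gate output
candidates = single_line_u_outputs(all_zero)
modelable = faulty in candidates
```

For this assumed circuit the fault at m drives both outputs to u, while every single-input u leaves both outputs at 0 and every single-output u affects only one output, so no single input or output u reproduces the fault.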
4.3.2. Separable Codes
Codes used in totally self-checking circuits can be divided into 
two broad classes: separable codes and non-separable codes.
A separable code consists of two parts: the data vector and the 
check vector. The data portion of the codeword merely consists of the 
unencoded data. The check vector consists of redundant information. 
Therefore, decoding a separable code simply requires stripping the check
 
Figure 4.7. Circuit Implementation with Shared Logic.
[Labels in the original figure: inputs W, X, Y, Z; shared product term m; outputs f1 and f2.]
vector from the codeword. Common examples of separable codes include
parity codes and two-rail codes. Any code which does not satisfy the 
definition of a separable code is considered to be non-separable. A 
common non-separable code is the k-out-of-n code.
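A minimal sketch of a separable code, assuming a single even-parity check bit as the check vector, shows why decoding reduces to stripping the check vector. The function names and the choice of parity are illustrative assumptions; any separable code has the same data/check structure.

```python
def encode(data):
    """Append an even-parity check bit; the data bits are left unchanged."""
    check = (sum(data) % 2,)
    return tuple(data) + check


def decode(codeword, data_len):
    """Decoding a separable code: strip the check vector, keep the data vector."""
    return codeword[:data_len]


def is_codeword(word, data_len):
    """Recompute the check vector from the data vector and compare."""
    return word[data_len:] == (sum(word[:data_len]) % 2,)


word = encode((1, 0, 1))        # data (1, 0, 1) with check bit 0
data = decode(word, 3)
corrupted_ok = is_codeword((1, 0, 1, 1), 3)
```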
We restrict our attention exclusively to separable codes. Separable 
codes are usually much easier to implement than non-separable 
codes. With a separable code, the data portion of the code is no harder 
to implement than a non-encoded version of the same function since the 
data portion is not altered by the encoding into a separable code. The 
simplicity of the encoding can often be a significant advantage. The 
designer is usually able to use a knowledge of the function and its pro­
perties to determine an efficient implementation. By encoding into a 
non-separable code, a function typically becomes more complicated in a 
manner that often obscures the original function. More than likely, 
this new function will be harder to implement than the original unen­
coded function. A separable code may be implemented with two indepen­
dent relatively small circuits while the implementation of a non- 
separable code requires one larger circuit. In terms of switching 
speed, two smaller circuits in parallel as in a separable code are usu­
ally preferable to one larger circuit as in a non-separable code. This 
is especially true in structured elements such as PLAs. Therefore, a 
separable implementation may be faster than a non-separable implementa­
tion. Finally, the analysis of separable codes is usually easier than 
for non-separable codes. Because of the advantages that separable codes 
offer over non-separable codes, non-separable codes are not considered
further.
 
There are a variety of possible separable codes which may be useful
in totally self-checking circuits. In addition, there are a variety of
implementations for each separable code. One important class of
separable code implementations is functional duplication. A circuit is
said to employ functional duplication if
(1) The circuit uses a separable code.
(2) The circuit layout is such that the data and check portions
of the circuit are physically disjoint.
(3) There is a bijective (one-to-one and onto) mapping between
the data vector and check vector of the entire output code
space.
Note that condition (3) does not require the checker circuit to be an
exact copy of the data circuit for functional duplication. Since the
circuit uses a separable code and the data and check portions of the
circuit are physically separated, any failure modeled by the simplified
indeterminate fault model affects either the check portion or the data
portion, but not both. This fact leads us to the following theorem:
Theorem 3: Any switching function has a functional duplication
implementation. Furthermore, any functional duplication implementation
of a switching function satisfies the totally self-checking goal.

Proof: First we prove the existence part of the theorem. Consider an
arbitrary switching function f. It is possible to construct a circuit C
that implements the function f. Consider a circuit C' which is formed
by two distinct and physically disjoint copies of C. One of the copies
of C represents the data portion while the other copy represents the
check portion of C'. Clearly C' employs a separable code where there is
a bijective mapping between the check and data vectors of the code.
Furthermore, we have specified that C' is constructed with disjoint
data and check circuitry. Therefore, circuit C' is a functional
duplication implementation of switching function f.
From the definition of functional duplication, any modeled failure
affects at most one portion of a functionally duplicated circuit's
output. If a failure occurs in the data portion, then the check vector
is always correct while the data vector may or may not be correct.
Likewise, if a failure occurs in the check portion, the data vector is
correct while the check vector may or may not be correct. The bijective
property of functional duplication assures that any failure that causes
either the data vector or the check vector (but not both) to be
incorrect is detectable.

If a failure occurs which is undetectable, then the circuit is
transformed into a new circuit which still employs functional
duplication and still implements the same function as the original
circuit. The next modeled failure which occurs is either detectable, in
which case it is detected when the first non-codeword output is
produced, or it is undetectable and the circuit is once again
transformed into a new functional duplication circuit which continues
to implement the original switching
function. This process continues until a detectable failure occurs and
a non-codeword output results. Therefore, the first incorrect output is
a non-codeword and the circuit thus satisfies the totally
self-checking goal.

Corollary 3: If a functional duplication implementation contains no
redundant logic (with respect to the input code space), then it is
totally self-checking with respect to the simplified indeterminate
fault model.

Proof: From Theorem 3, we know that the first incorrect output from a
functional duplication circuit must be a non-codeword. Therefore, the
circuit satisfies the fault-secure property. Since the circuit contains
no redundant logic, any modeled failure which occurs must be
detectable. Since the circuit satisfies both the fault-secure and
self-testing properties, it must also satisfy the totally
self-checking property. Therefore, the circuit is totally self-checking
with respect to the simplified indeterminate fault model.
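The argument of Theorem 3 can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not the report's fault model: a fault is represented simply as a function that rewrites the output of one copy, the check portion is an exact copy of the data portion (so the bijection is the identity), and the names `duplicated`, `is_codeword`, and `stuck_s_at_1` are invented for illustration.

```python
from itertools import product


def full_adder(x, y, z):
    """Return the data vector (C, S) of a one-bit full adder."""
    total = x + y + z
    return (total >> 1, total & 1)


def duplicated(f, data_fault=None, check_fault=None):
    """Build a circuit made of two disjoint copies of f.

    Each optional fault is applied to the output of one copy only,
    mirroring the requirement that a failure affects the data portion
    or the check portion, but not both.
    """
    def circuit(*inputs):
        data = f(*inputs)
        check = f(*inputs)
        if data_fault is not None:
            data = data_fault(data)
        if check_fault is not None:
            check = check_fault(check)
        return data, check
    return circuit


def is_codeword(data, check):
    """With an identical copy, a codeword has equal data and check vectors."""
    return data == check


def stuck_s_at_1(out):
    """Hypothetical failure: output bit S of one copy is forced to 1."""
    return (out[0], 1)


faulty = duplicated(full_adder, data_fault=stuck_s_at_1)
for inputs in product((0, 1), repeat=3):
    data, check = faulty(*inputs)
    # Fault-secure behavior: the output is correct or it is a non-codeword.
    assert data == full_adder(*inputs) or not is_codeword(data, check)
```

For every input, the faulty circuit either produces the correct data vector or a data/check pair that disagrees, which is the behavior the proof describes for a failure confined to one portion.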
From Theorem 3 and Corollary 3, we know that a totally self-checking
implementation exists for any switching function. Unfortunately,
functional duplication requires roughly a 100 percent increase in both
area and power dissipation. When the additional circuitry for the
checkers is included, this increase is significantly above 100 percent.
Therefore, the question we now examine is: under what circumstances do
totally self-checking implementations exist that are more economical
than functional duplication?
4.3.3. Finding Economical Totally Self-Checking Implementations
To determine which of several implementations of a given function
is most economical, it is usually necessary to lay out each
implementation. The area of the circuit may then be determined and a
circuit simulator such as SPICE may be used to estimate the circuit's
power consumption. The simulator may also be used to determine the
speed of the circuit. Although this technique assures that we always
use the most economical of the several implementations, it is not in
general practical. For any given switching function, there is a large
number of implementations which must be considered. The average circuit
designer's productivity in industry may be as low as 5 to 10
transistors per day [62]. Even when structured designs and extensive
design automation software are used, designer productivity may still be
less than 40 transistors per day [63]. For large integrated circuits
containing hundreds of thousands of transistors, implementing a large
number of alternative designs is quite clearly not practical. Instead,
we consider the cost of an implementation to be completely determined
by the number of bits the circuit must process. Under this assumption,
the cost of an implementation depends solely on the code it uses. By
making this assumption, we are shifting the problem from finding the
most economical implementation to finding the most economical code.
Since there are many possible implementations of a given code, once a
code has been selected, a good implementation of the code must still be
determined. It is usually much easier to determine which implementation
of one code is more economical than to determine which implementation
among different possible codes is more economical. It does not always
follow
that if code A is more economical than code B, then a "good"
implementation of code A is more economical than a "good"
implementation of code B. Nevertheless, it is reasonable to expect this
usually to be true. We therefore restrict our attention to finding
economical codes.

Since all codes that we consider are separable, all codes for a given
function have the same number of data bits. Therefore, when considering
the relative economy of several codes, only the number of check bits
for each code needs to be considered. We define the cost of a code to
be the number of check bits in the code. The code with the lowest cost
is considered to be the most economical.
Theorem 3 guarantees that a functional duplication implementation
exists for any desired switching function. If N is the number of
distinct output codewords of a switching function, then the most
economical code which may be used in a functional duplication
implementation has a cost C* given by

$$C^{*} = \lceil \log_2 N \rceil$$

Any code with a cost less than C* cannot satisfy the bijection
requirement for functional duplication. Therefore, we are interested in
finding codes for a given function which have a cost less than C* but
still have an implementation that satisfies the totally self-checking
goal.
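The cost bound C* is easy to evaluate mechanically. The following sketch (the function name is an assumption introduced here) computes the ceiling of the base-2 logarithm with integer arithmetic, which avoids floating-point rounding for large N.

```python
def min_duplication_cost(num_codewords):
    """Return C* = ceil(log2(N)) for N distinct output codewords."""
    if num_codewords < 1:
        raise ValueError("a function has at least one output codeword")
    # For N >= 1, (N - 1).bit_length() equals ceil(log2(N)).
    return (num_codewords - 1).bit_length()


# A full adder produces four distinct data vectors: 00, 01, 10, 11.
print(min_duplication_cost(4))  # 2
print(min_duplication_cost(5))  # 3
```

A code with fewer than this many check bits has fewer check vectors than data vectors, so no one-to-one mapping between them can exist.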
In searching for codes more economical than functional duplication,
we concentrate on the fault-secure property. There are several reasons
for this. The fault-secure property is a necessary condition for both
the totally self-checking property and the strongly fault-secure
property. Therefore, a circuit must be fault-secure if it is to satisfy
the totally self-checking goal. For circuits without hazards, it is
possible to determine whether a code for a given function is
fault-secure without knowing the details of the implementation except
that it is of the form of Figure 4.6. On the other hand, whether a
function satisfies the self-checking property is strongly dependent on
the implementation. The procedure we use to search for codes with a
cost less than C*, but which still satisfy the fault-secure property,
depends only on the switching function that we desire to encode. Since
a hazard-free implementation of the form shown in Figure 4.6 always
exists for any code, this procedure may be used to find whether a
fault-secure implementation exists with a cost less than C*.

The search procedure we propose is now demonstrated by an example.
Figure 4.8 shows the truth table for a full adder circuit with inputs
X, Y, and Z, and outputs C and S. The question we wish to answer is
whether or not a fault-secure implementation exists with a cost less
than C*. For this circuit, there must be 4 output codewords. Therefore,
C* = 2. Each codeword consists of a data vector and a check vector.
Codewords are formed by combining each check vector with certain data
vectors. Not all combinations are allowed. The allowed combinations are
the codewords while the disallowed combinations are non-codewords.

In order for the check vector of the code to require fewer bits than
C*, there must be some check vectors which may be combined with more
than one data vector to form codewords. We make the assumption that for
all codewords, a given data vector only has a single possible

X Y Z | C S
0 0 0 | 0 0
0 0 1 | 0 1
0 1 0 | 0 1
0 1 1 | 1 0
1 0 0 | 0 1
1 0 1 | 1 0
1 1 0 | 1 0
1 1 1 | 1 1

Figure 4.8. Full Adder Example.

check vector. In other words, the data vector of a codeword implies the
check vector of a codeword. Violating this assumption never reduces the
cost of a code since, by violating the assumption, we are increasing
the number of distinct check vectors which the code must include. The
main reason for this assumption is that it means we only need to
consider failures in the data portion of the function.

In order for the fault-secure property to be satisfied, a failure must
either be undetectable (in which case the correct output is always
produced) or detectable for some input (in which case the circuit must
produce a non-codeword output). When a failure occurs in the data
portion of the circuit, the code must be designed so that it is never
possible for the data vector of one codeword to be transformed into the
data vector of another codeword that has the same check vector. If this
transformation is allowed to happen, then the failure has caused an
incorrect codeword output to occur and the circuit is thus not
fault-secure. If a failure occurs in the check portion of the circuit,
the check vector is either correct (in which case the output codeword
is also correct), or the check vector is incorrect. We have assumed
that the data vector implies the check vector. Therefore, if the check
vector is incorrect, then the output from the circuit is a
non-codeword. Consequently, failures in the check portion of the
circuit automatically satisfy the fault-secure property.

Returning to our example of Figure 4.8, we must find a code which
requires fewer than 2 check bits. The code must be selected such that
if a fault in the data portion of a circuit can cause the data vector
of one codeword to change to the data vector of another codeword, then
the
two codewords must have different check vectors. The first step is to 
determine the effect of all faults on the data portion of the circuit's 
outputs. By Theorem 2, we only need to consider failures on the data 
portion of the circuit's inputs and outputs. Let us first consider the 
effect of a fault on the input.
Figure 4.9 shows the result for the full adder example of Figure 
4.8. The first 4 columns of the table of Figure 4.9 repeat the truth
table from Figure 4.8. The next 3 columns of the table show the effect 
of failures on lines X, Y, and Z, respectively. Each row of the table 
represents one of the possible input conditions. If a data bit, C or S, 
retains its original value when a given input is changed, that data bit 
retains its original value in the column under the given input. If a 
data bit changes its value when a given input is changed, the data bit 
takes on the value of u in the column under the given input.
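The construction of the three fault columns can be sketched in a few lines of Python. This is an illustrative reading of the rule just described, assuming a hazard-free implementation so that sensitization can be read off the truth table; the helper name `input_fault_row` is an assumption of this sketch.

```python
from itertools import product


def full_adder(x, y, z):
    """Return the data vector (C, S) of a one-bit full adder."""
    total = x + y + z
    return (total >> 1, total & 1)


def input_fault_row(f, inputs):
    """Return one table entry per input line.

    An output bit keeps its value if changing that input leaves it
    unchanged, and becomes 'u' if changing that input changes it.
    """
    good = f(*inputs)
    row = []
    for i in range(len(inputs)):
        flipped = list(inputs)
        flipped[i] ^= 1
        bad = f(*flipped)
        row.append("".join(str(g) if g == b else "u"
                           for g, b in zip(good, bad)))
    return row


for inputs in product((0, 1), repeat=3):
    cs = "".join(str(bit) for bit in full_adder(*inputs))
    print(*inputs, cs, *input_fault_row(full_adder, inputs))
# First line printed: 0 0 0 00 0u 0u 0u
```

Each printed line corresponds to one row of the table: the input vector, the correct CS value, and the entries for faults on X, Y, and Z.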
In other words, we are interested in whether or not the data bit is
sensitized to a fault on the input. If a change in the input causes a
change in the data bit output, then the data bit is sensitized to the
input. If the circuit contains a static hazard, then a data bit output
may be sensitized to an input even though a change in an input does not
cause the data bit to change. An implementation which is free of static
hazards exists for any function, although it may require redundant
logic. Therefore, we only need to consider whether an input change
leads to an output change to determine sensitization. We may determine
sensitization by inspection from the truth table, as we have done here,
or more formally, by the Boolean difference method. If the Boolean
difference

                  OUTPUT FOR
             INDETERMINATE FAULT IN
FUNCTION       X     Y     Z                OUTPUT MAP
X Y Z   CS     CS    CS    CS     CORRECT  INPUT-FAULT  OUTPUT-FAULT
0 0 0   00     0u    0u    0u        0       1            1,2
0 0 1   01     uu    uu    0u        1       0,2,3        0,3
0 1 0   01     uu    0u    uu        1       0,2,3        0,3
0 1 1   10     1u    uu    uu        2       0,1,3        0,3
1 0 0   01     0u    uu    uu        1       0,2,3        0,3
1 0 1   10     uu    1u    uu        2       0,1,3        0,3
1 1 0   10     uu    uu    1u        2       0,1,3        0,3
1 1 1   11     1u    1u    1u        3       2            1,2

Correct    Possible Faulty
Output     Outputs
0          1,2
1          0,2,3
2          0,1,3
3          1,2

Figure 4.9. Fault Behavior of Full Adder.

is 1, then the output data bit is sensitized to the input and the output
data bit is set to u. If the Boolean difference is 0, then the data bit
retains its original value. It must be emphasized that using the
Boolean difference method (or equivalently, determining from the truth
table whether or not an input change causes an output change) is only
valid because of the assumption that the implementation has no static
hazards.
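For readers unfamiliar with the Boolean difference, a brief sketch follows. It evaluates the difference of a single-output function with respect to one input as the exclusive-OR of the function with that input held at 0 and at 1. The function and variable names are assumptions made for this illustration, and the carry and sum expressions are the standard full adder equations.

```python
def boolean_difference(f, i, inputs):
    """Evaluate df/dx_i at `inputs`: f with x_i = 0 XOR f with x_i = 1."""
    low = list(inputs)
    high = list(inputs)
    low[i] = 0
    high[i] = 1
    return f(*low) ^ f(*high)


def carry(x, y, z):
    """Carry output C of a full adder (majority of the inputs)."""
    return (x & y) | (x & z) | (y & z)


def sum_bit(x, y, z):
    """Sum output S of a full adder (parity of the inputs)."""
    return x ^ y ^ z


# With X = Y = Z = 0, S is sensitized to X but C is not,
# matching the entry 0u in the first row of the fault table.
print(boolean_difference(carry, 0, (0, 0, 0)))    # 0
print(boolean_difference(sum_bit, 0, (0, 0, 0)))  # 1
```

A result of 1 marks the output bit as u for a fault on that input, and a result of 0 leaves the bit at its original value.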
In Figure 4.9, on the first row, all columns have a value of 0u.
This implies that if the input of the circuit is X = Y = Z = 0, then a
single indeterminate fault on any of the three inputs results in output
bit C remaining 0. Output bit S has a value of u, meaning it may take on
a value of either 0 or 1. Likewise, the second row of the table
corresponds to X = Y = 0, Z = 1. In this case, a single indeterminate
fault on either input X or input Y results in output bits C and S both
having a value of u. A single indeterminate fault on input Z results in
output bit C having a value of 0 and output S having a value of u. A
single indeterminate fault on either input X or input Y can therefore
result in output bits C and S each being either 0 or 1. Other rows are
similarly constructed.

The three columns under the heading "OUTPUT MAP" represent the correct
output, all outputs that may be produced if a single indeterminate
fault occurs on any one of the inputs, and all outputs that may occur
due to a single indeterminate fault on any one of the outputs. Note
that the output vector CS is treated as an unsigned two-bit binary
number. To determine the possible faulty outputs, all u's in the output
vector are replaced by all possible combinations of 0's and 1's.

We have now considered the effect of a ternary u value on any of the
circuit inputs. It is still necessary to consider the effect of a
ternary u value on any of the circuit outputs. Output faults may be
considered by taking each of the possible output vectors one at a time
and complementing each of the output bits in turn. For example, if the
correct output is 0, an output fault may result in an output vector of
1 or 2. If the correct output vector is 1, then an output fault may
result in an output vector of 0 or 3. If the correct output vector is
2, then an output fault may also result in an output vector of 0 or 3.
Finally, if the correct output vector is 3, then an output fault may
result in an output vector of 1 or 2.

The list at the bottom of Figure 4.9 gives a summary of all errors due
to any single indeterminate fault on an input or output line. Each of
the four possible correct outputs is listed along with the faulty
outputs that it may be changed into. From the summary, we see that any
time the correct output vector is 0, then with a fault on one of the
inputs or outputs, we may get an output vector of either 0, 1, or 2.
When the correct output vector is 1 or 2, we may get any output vector.
When the correct output vector is 3, we may get an output vector of 1,
2, or 3.
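The summary list can be reproduced programmatically. The sketch below is one possible reading of the steps just described, under the same no-static-hazard assumption: u values from input faults are expanded into all combinations, each output bit is complemented in turn for output faults, and the results are merged per correct output vector. All helper names are assumptions of this sketch.

```python
from itertools import product


def full_adder(x, y, z):
    """Return the data vector (C, S) of a one-bit full adder."""
    total = x + y + z
    return (total >> 1, total & 1)


def expand_u(pattern):
    """Replace every u (represented by None) with both 0 and 1."""
    choices = [(0, 1) if bit is None else (bit,) for bit in pattern]
    return set(product(*choices))


def fault_map(f, n_inputs):
    """Map each correct output vector to its possible faulty outputs."""
    faulty = {}
    for inputs in product((0, 1), repeat=n_inputs):
        good = f(*inputs)
        bad = faulty.setdefault(good, set())
        # Single indeterminate fault on each input line.
        for i in range(n_inputs):
            flipped = list(inputs)
            flipped[i] ^= 1
            other = f(*flipped)
            pattern = [g if g == o else None for g, o in zip(good, other)]
            bad |= expand_u(pattern)
        # Single indeterminate fault on each output line.
        for j in range(len(good)):
            out = list(good)
            out[j] ^= 1
            bad.add(tuple(out))
        bad.discard(good)
    return faulty


def as_int(bits):
    """Read a bit vector as an unsigned binary number."""
    return int("".join(str(bit) for bit in bits), 2)


summary = {as_int(good): sorted(as_int(b) for b in bad)
           for good, bad in fault_map(full_adder, 3).items()}
print(summary)
```

For the full adder this yields 0 -> 1,2; 1 -> 0,2,3; 2 -> 0,1,3; and 3 -> 1,2, in agreement with the list at the bottom of Figure 4.9.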
It is now necessary to assign a check vector to each of the possible
data vectors. In order to keep the cost of the code as low as possible,
it is desirable to assign as many of the data vectors to a single
check vector as possible. On the other hand, two data vectors cannot be
assigned to the same check vector if a fault may transform one data
vector into the other data vector.

A set of data vectors is said to be compatible if no member of the set
may be transformed by a fault into another member of the set. The
problem is to determine the fewest sets of compatible data vectors such
that each data vector occurs in exactly one set. Each set of compatible
data vectors is assigned a unique check vector.

In Figure 4.10, a merger diagram is drawn as a graphical aid to
determine the fewest sets of compatible data vectors. The merger
diagram has a node for each data output vector. An arc is drawn between
each pair of compatible data output vectors. A set of nodes is
compatible if and only if every node in the set is connected by an arc
to every other node in the set. From the merger diagram, we see that at
least three sets of data vectors are required. Data vectors 0 and 3
form a compatible set since a correct data vector of 0 can never be
changed by a fault to 3 and a correct data vector of 3 can never be
changed by a fault to 0. On the other hand, data vectors 1 and 2 must
each be in a set by themselves since a fault may change these data
vectors to any of the other possible data vectors. Since at least three
sets of data vectors are required, there must be three distinct check
vectors. Therefore, no code exists with a cost less than 2. Since C*
for this function is 2, no code exists which is more economical than
the most economical functional duplication code.
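For small examples, the search for the fewest compatible sets can be done by exhaustive enumeration instead of by inspecting a merger diagram. The sketch below is a brute-force illustration, not a method taken from this report: it tries every partition of the data vectors and keeps the smallest one whose blocks are pairwise compatible. The `FAULTY` table is copied from the summary list of Figure 4.9, and the function names are assumptions.

```python
from itertools import combinations

# Correct output vector -> outputs a single fault may turn it into.
FAULTY = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2}}


def compatible(a, b, faulty):
    """Two vectors are compatible if no fault turns either into the other."""
    return b not in faulty[a] and a not in faulty[b]


def partitions(items):
    """Yield every partition of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Place `first` into each existing block in turn...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ...or into a new block of its own.
        yield [[first]] + part


def min_compatible_partition(faulty):
    """Return a partition into compatible sets with the fewest blocks."""
    best = None
    for part in partitions(sorted(faulty)):
        ok = all(compatible(a, b, faulty)
                 for block in part for a, b in combinations(block, 2))
        if ok and (best is None or len(part) < len(best)):
            best = part
    return best


best = min_compatible_partition(FAULTY)
check_bits = (len(best) - 1).bit_length()  # ceil(log2(number of sets))
print(sorted(sorted(block) for block in best), check_bits)
```

For the full adder the result is the three sets {0, 3}, {1}, and {2}, requiring 2 check bits. The enumeration grows very quickly with the number of data vectors, so this approach is only a checking aid for examples of this size.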
Figure 4.10. Merger Diagram for Full Adder Example.
(Nodes 0, 1, 2, and 3; the only arc joins nodes 0 and 3.)
The procedure for finding codes that are both more economical than
functional duplication and have implementations that are fault-secure
with respect to the simplified indeterminate fault model may now be
summarized as follows:
(1) Construct a truth table for the desired switching function.
This function is implemented by the data portion of the circuit.
(2) For each possible data input vector, determine the possible
incorrect data output vectors that may result from a fault on a
single input.
(3) Summarize the results from step 2 to obtain a list of each
correct data output vector and the incorrect data output vectors
that may result from a fault on an input.
(4) Update the list from step 3 to include the effects of faults
on output lines.
(5) Determine the minimum number of sets of compatible data output
vectors required so that each output vector is included in
exactly one set.
(6) The minimum number of check bits is the smallest integer
which is greater than or equal to the base-2 logarithm of the
minimum number of compatible sets.
When the minimum number of sets of compatible data vectors has been
found, each set of data vectors must be assigned a unique check vector.
This assignment is completely arbitrary, as it has no effect on either
the fault-secure property or the cost of the code. Therefore, the
assignment may be done so as to minimize the cost of the implementation
of the disjoint check bit generation logic.
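Once the compatible sets are known, a candidate assignment can be checked directly. In the hedged sketch below, the check vector of each set is simply its index, which is one arbitrary choice among many. A fault in the data portion leaves the check vector of the correct codeword in place, so the faulty output is a codeword exactly when the faulty data vector shares that check vector. The names and the `FAULTY` table (copied from the summary list of Figure 4.9) are assumptions of this illustration.

```python
# Correct output vector -> outputs a single fault may turn it into.
FAULTY = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2}}

# Compatible sets found for the full adder example.
SETS = [{0, 3}, {1}, {2}]


def assign_check_vectors(sets):
    """Give every data vector the index of its set as its check vector."""
    return {data: index for index, block in enumerate(sets) for data in block}


def is_fault_secure(check_of, faulty):
    """True if no data-portion fault turns a codeword into another codeword."""
    return all(check_of[bad] != check_of[good]
               for good, bads in faulty.items() for bad in bads)


check_of = assign_check_vectors(SETS)
print(is_fault_secure(check_of, FAULTY))  # True

# Merging vector 1 into the set {0, 3} breaks the property.
print(is_fault_secure(assign_check_vectors([{0, 1, 3}, {2}]), FAULTY))  # False
```

Permuting the indices among the sets leaves the result of `is_fault_secure` unchanged, which reflects the statement that the assignment has no effect on the fault-secure property.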
In the example of Figure 4.8, it is possible for all data input and
output vectors to occur. In other cases, it is possible that one or
more of the input or output vectors cannot occur in normal operation.
In such a case, this procedure may still be used. Any data input
vectors which are not used should be left out of the truth table. Any
data output vectors which do not occur during normal operation may also
be ignored since the checker may be designed to recognize any unused
data output vector.

This procedure may be used to detect whether or not a code exists which
has a cost lower than the most economical functional duplication code.
Furthermore, if such a code is found by the procedure to exist, then
there is always a fault-secure implementation of the code. Since the
fault-secure property is necessary for all circuits that meet the
totally self-checking goal, if a function is found to have no code more
economical than functional duplication for meeting the fault-secure
property, the function also does not have a more economical
implementation which satisfies the totally self-checking goal.

We have assumed that the implementation has no static hazards. In many
cases, this requires the addition of redundant logic. This logic has
important implications if the desire is for the implementation to
satisfy the totally self-checking goal. If redundant circuitry is
added, then the circuit cannot satisfy the totally self-checking
property. Furthermore, the circuit might not satisfy the strongly
fault-secure property. For such fault-secure circuits which are not
strongly fault-secure, fault detection cannot be guaranteed for some
fault sequences.

The procedure we have proposed does not take into consideration any
static hazards which may exist in the implementation when examining
whether a fault may propagate from an input to an output. In some
cases, a static hazard does not destroy the fault-secure property. Each
static hazard in the circuit allows a fault to propagate from an input
to an output. Such a static hazard causes a correct output vector to be
transformed into an incorrect output vector. In some cases, this
transformation occurs for some other input vector, regardless of
whether the static hazard exists. In this case, the static hazard does
not affect the fault-secure property of the implementation. In other
cases, the static hazard causes a transformation from a correct output
vector into an incorrect output vector that does not otherwise occur.
If this transformation causes one of the sets of compatible output
vectors to become incompatible, then the static hazard must be removed
from the implementation.

Redundant logic is often required to remove a static hazard.
Occasionally, the situation arises in which the only way a code may be
implemented so that it satisfies the fault-secure property is to add
redundant logic to the implementation. In this situation, the
implementation is fault-secure, but it is neither totally
self-checking nor strongly fault-secure. The redundant logic which is
added to remove the static hazard is not testable. Therefore, if the
redundant logic fails, then the static hazard exists once again, but it
is impossible to test for all failures in redundant logic. Since the
redundant logic was added to make the circuit fault-secure, the failure
of this redundant logic causes the circuit not to be fault-secure. We
now have a situation where a failure has occurred that cannot be
detected by testing. In addition, the circuit is no longer fault-secure
so that the
next fault that occurs may cause an incorrect codeword output. This is
an example where the first incorrect output is a codeword output.
Therefore, such a circuit does not satisfy the totally self-checking
goal. From this argument, we see that if the desire is to build circuits
which have the strongly fault-secure (or totally self-checking)
property, then redundant logic should not be used to remove static
hazards.
In many instances, it may be desirable to use implementations which
are not of the form of Figure 4.6. Often, by sharing logic among several
outputs, the amount of logic required for an implementation is significantly
reduced. In many cases, sharing logic between several outputs
results only in faults which can be modeled as a single fault on an
input or output. In other cases, the shared logic causes faults which
cannot be modeled as a single fault on an input or an output, but
nevertheless, no sets of compatible output vectors become incompatible
due to the sharing of logic. In both of these cases, the sharing of
logic does not affect the fault-secureness of the implementation. In
other cases, sharing of logic results in compatible sets of output vectors
becoming incompatible. In cases where one or more sets of output
vectors become incompatible, the resulting implementation is not
fault-secure.
Simulators are usually the most practical method of evaluating the
effect of static hazards and sharing logic on an implementation. Ternary
simulators are quite straightforward to implement [57]. If the
circuit implementation is of the same form as Figure 4.6 (no shared
logic), then the simulator can be used to determine when a u value on an
input causes a u value on an output. The simulator can determine when a
u value propagates from input to output of an implementation, regardless
of whether static hazards exist. A ternary simulator may also be used
to study the effect on an implementation of sharing logic among its outputs.
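As an illustration of the kind of ternary simulation described above, the following is a minimal sketch in Python. The three-valued operators follow the usual rule that a controlling value (0 for AND, 1 for OR) dominates an indeterminate input. The two-input multiplexer and the names `mux` and `mux_redundant` are hypothetical and chosen only for this example; they do not appear in this report.

```python
U = "u"  # indeterminate (ternary "u") logic value


def t_not(a):
    """Ternary NOT: an indeterminate input stays indeterminate."""
    return U if a == U else 1 - a


def t_and(*xs):
    """Ternary AND: any 0 forces 0; otherwise any u gives u."""
    if 0 in xs:
        return 0
    return U if U in xs else 1


def t_or(*xs):
    """Ternary OR: any 1 forces 1; otherwise any u gives u."""
    if 1 in xs:
        return 1
    return U if U in xs else 0


def mux(a, b, s):
    """Two-level multiplexer a*s + b*s' (has a static hazard when a = b = 1)."""
    return t_or(t_and(a, s), t_and(b, t_not(s)))


def mux_redundant(a, b, s):
    """Same function with the redundant term a*b added to cover the hazard."""
    return t_or(t_and(a, s), t_and(b, t_not(s)), t_and(a, b))


hazard = mux(1, 1, U)             # u on s reaches the output
covered = mux_redundant(1, 1, U)  # redundant term holds the output at 1
print(hazard, covered)
```

With a = b = 1 the output does not functionally depend on s, yet the simulation of the first form reports a u on the output. This is the sense in which a static hazard lets an indeterminate input value reach an output, and it also shows why the logic that masks the hazard is redundant.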
The procedure we have outlined in this section can be used to
search for codes that are more economical than the most economical functional
duplication code. If a functional duplication code is found to
be the most economical code, then it may be implemented without any concern
about static hazards or the sharing of logic between outputs in the
implementation. If another code is found to be more economical, then it
may be implemented in the form of Figure 4.6 and a ternary simulator may
be used to check for the presence and effect of static hazards. If an
implementation that shares logic among the outputs is desired (i.e., the
implementation is not of the form of Figure 4.6), then the simulator may
also be used to determine the effect of shared logic. For non-functional
duplication codes, any static hazard or sharing of logic
which causes sets of compatible output vectors to become incompatible
must be removed or else the code must be modified so that the sets of
outputs are split into smaller sets. Obviously, removing a static
hazard or using separate logic to calculate each output requires extra
logic. Modifying a code by splitting sets of compatible outputs may
also require additional logic. It should be noted that for functional
duplication codes, hazards and shared logic are not a concern. The
bijective property requires that each set of compatible output vectors
 
have only one member. Therefore, for functional duplication, the sets 
of data vectors always remain compatible.
The full adder example is a case where no code more economical than 
functional duplication exists. There are other functions, however, 
where codes more economical than functional duplication do exist. Fig­
ure 4.11 shows the truth table and fault behavior of a two-bit, vector 
AND function. From the list of input and output fault behavior, it is 
clear that only two sets of compatible output vectors are required, 
{0,3} and {1,2}. For any such vector bitwise function, regardless of 
the length of the input vectors, a fault under the simplified indeter­
minate fault model may only affect at most one output bit. Therefore, 
any possible erroneous vector will be distance 1 away from the correct 
output vector. To detect such errors, it is only necessary for every 
output codeword to be at least distance 2 away from every other output 
codeword. A one-bit parity code is an excellent choice for such a code. 
Consequently, for any bitwise vector operation, fault-secure operation 
with respect to the simplified indeterminate fault model may be provided 
at a cost of only one check bit.
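The distance argument above can be checked by brute force. The sketch below, in Python, groups all n-bit output vectors by a single parity bit and confirms that every single-bit error moves a vector into the other group. The function names are illustrative only and are not taken from this report.

```python
def parity(v):
    """Parity (0 or 1) of the binary representation of v."""
    return bin(v).count("1") & 1


def compatible_sets(n_bits):
    """Group all n-bit output vectors by the value of their parity check bit."""
    sets = {0: [], 1: []}
    for v in range(2 ** n_bits):
        sets[parity(v)].append(v)
    return sets


def detects_all_single_bit_errors(n_bits):
    """True if flipping any one bit of any vector always changes its parity."""
    for v in range(2 ** n_bits):
        for i in range(n_bits):
            if parity(v ^ (1 << i)) == parity(v):
                return False
    return True


print(compatible_sets(2))
print(detects_all_single_bit_errors(4))
```

For two-bit vectors the two groups are {0,3} and {1,2}, which are the compatible sets quoted above for the vector AND function.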
4.3.4. Check Vector Generation
If the circuit we wish to design accepts unencoded inputs, then the 
generation of the check vector presents no particular difficulties. The 
unencoded input vector is fanned-out to both the data and check portions 
of the circuit. Since the entire input vector is available to the check 
portion of the circuit, the generation of the output check vector is 
straightforward. Unfortunately, if there is a fault on an input line
 
                        OUTPUT FOR INDETERMINATE
                                FAULT IN                    OUTPUT MAP
 X1 X0 | Y1 Y0 | S1S0 |  X1    X0    Y1    Y0  | S | INPUT FAULTS | OUTPUT FAULTS
 ------+-------+------+------------------------+---+--------------+--------------
  0  0 |  0  0 |  00  |  00    00    00    00  | 0 |              |     1,2
  0  0 |  0  1 |  00  |  00    0u    00    00  | 0 |      1       |     1,2
  0  0 |  1  0 |  00  |  u0    00    00    00  | 0 |      2       |     1,2
  0  0 |  1  1 |  00  |  u0    0u    00    00  | 0 |     1,2      |     1,2
  0  1 |  0  0 |  00  |  00    00    00    0u  | 0 |      1       |     1,2
  0  1 |  0  1 |  01  |  01    0u    01    0u  | 1 |      0       |     0,3
  0  1 |  1  0 |  00  |  u0    00    00    0u  | 0 |     1,2      |     1,2
  0  1 |  1  1 |  01  |  u1    0u    01    0u  | 1 |     0,3      |     0,3
  1  0 |  0  0 |  00  |  00    00    u0    00  | 0 |      2       |     1,2
  1  0 |  0  1 |  00  |  00    0u    u0    00  | 0 |     1,2      |     1,2
  1  0 |  1  0 |  10  |  u0    10    u0    10  | 2 |      0       |     0,3
  1  0 |  1  1 |  10  |  u0    1u    u0    10  | 2 |     0,3      |     0,3
  1  1 |  0  0 |  00  |  00    00    u0    0u  | 0 |     1,2      |     1,2
  1  1 |  0  1 |  01  |  01    0u    u1    0u  | 1 |     0,3      |     0,3
  1  1 |  1  0 |  10  |  u0    10    u0    1u  | 2 |     0,3      |     0,3
  1  1 |  1  1 |  11  |  u1    1u    u1    1u  | 3 |     1,2      |     1,2
Figure 4.11 Vector AND Example
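The entries of Figure 4.11 can be regenerated mechanically. The following Python sketch places a u on each input line in turn, evaluates the two-bit AND with ternary logic, and collects the erroneous output values that could result. It assumes that each u may resolve to either 0 or 1; all helper names are hypothetical.

```python
U = "u"  # indeterminate value


def t_and(a, b):
    """Ternary AND of two signals."""
    if a == 0 or b == 0:
        return 0
    if a == U or b == U:
        return U
    return 1


def bits(v):
    """Two-bit vector (bit 1, bit 0) of the integer v."""
    return ((v >> 1) & 1, v & 1)


def vector_and(x, y):
    """Bitwise ternary AND of two equal-length bit vectors."""
    return tuple(t_and(a, b) for a, b in zip(x, y))


def resolutions(vec):
    """All integer values a ternary vector may take when each u is 0 or 1."""
    vals = [0]
    for bit in vec:  # most significant bit first
        choices = (0, 1) if bit == U else (bit,)
        vals = [(v << 1) | c for v in vals for c in choices]
    return set(vals)


def input_fault_map(x, y):
    """Erroneous outputs reachable by a single indeterminate input line."""
    wrong = set()
    for operand in (0, 1):      # 0 selects X, 1 selects Y
        for pos in (0, 1):      # 0 selects bit 1, 1 selects bit 0
            ops = [list(bits(x)), list(bits(y))]
            ops[operand][pos] = U
            wrong |= resolutions(vector_and(*ops))
    return wrong - {x & y}


def output_fault_map(x, y):
    """Erroneous outputs reachable by a single indeterminate output line."""
    s = x & y
    return {s ^ 1, s ^ 2}


print(input_fault_map(3, 3), output_fault_map(3, 3))
```

For X = Y = 3 both maps are {1,2}, in agreement with the last row of the figure.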
before the input vector is fanned-out to the data and check portions of
the circuit, then the fault may cause an undetected error since the
incorrect value is passed to both the data and check portions of the
circuit.*
The generation of the check vector becomes more complicated if the 
circuit is part of a larger totally self-checking system. In this case, 
the check vector must be generated in such a manner that no modeled 
failure on an input violates the fault-secure property.
Figure 4.12 demonstrates three possible ways of generating the 
check vector that are compatible with the philosophy of separable codes. 
Method A of Figure 4.12 has the advantage of being very simple. In this 
method, the data input vector is used by both the data and check por­
tions of the circuit. Unfortunately, as we have just shown, this method 
cannot protect against input line faults. Method B is also very simple. 
In this method, the data input vector is used to calculate the data out­
put vector and the check input vector is used to calculate the check 
output vector. Note that the check output vector of the previous func­
tion forms the check input vector of this function. The drawback to 
this method is that there may not be enough information in the check 
input vector to calculate the check output vector. It should be noted 
that the bijective property of functional duplication guarantees that
method B may always be used with functional duplication. Method C is 
more complicated than either method A or method B. In method C, the 
data input vector is used to calculate the data output vector and the
* We are assuming that none of the data bits is redundant.
Figure 4.12. Three Methods of Check Vector Generation (Method A, Method B, Method C).
data input and check input vectors are used to calculate the check output
vector. Since the check circuitry must process both the data input
and check input vectors, method C generally requires more logic than
either method A or method B. For this reason, method B is preferred
whenever it is feasible.
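Whether method B is feasible for a given function and code can be tested exhaustively. Method B requires that the check output vector be computable from the check input vectors alone. The Python sketch below tests this under the assumption that every n-bit data input vector may occur; `method_b_feasible` and the choice of a parity check bit are illustrative and not part of this report.

```python
from itertools import product


def parity(v):
    """Single parity check bit of the integer v."""
    return bin(v).count("1") & 1


def method_b_feasible(f, check, n_bits, arity=2):
    """True if check(f(operands)) is determined by the operands' check values."""
    table = {}
    for operands in product(range(2 ** n_bits), repeat=arity):
        key = tuple(check(x) for x in operands)   # check input vectors
        value = check(f(*operands))               # required check output
        if table.setdefault(key, value) != value:
            return False  # same check inputs, two different check outputs
    return True


xor_ok = method_b_feasible(lambda x, y: x ^ y, parity, 2)
and_ok = method_b_feasible(lambda x, y: x & y, parity, 2)
print(xor_ok, and_ok)
```

In this sketch the bitwise XOR passes, since the parity of the result is the XOR of the operand parities, while the bitwise AND with a parity check bit fails. The AND case is one in which the check input vector does not carry enough information and method B cannot be used.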
One of the advantages of using separable codes is that, in general,
single faults only affect either the data vector or the check vector, 
but not both. If method C is used for generating the check vector then 
a single fault may affect both the data and check portions of the cir­
cuit. Ideally, if method C is used to generate the check vector, then 
it would be desirable to design the check portion of the circuit so that 
no single failure on one of the data input bits causes both the data and 
check portions of the circuit to produce erroneous output vectors. If 
both the data and check output vectors may be in error, it is very dif­
ficult to determine whether the circuit violates the fault-secure pro­
perty. In some cases, it may be possible to design both the data and 
check circuits so that even when both data and check vectors are 
incorrect, the output vector is a non-codeword. In general, this goal 
is very difficult to achieve since we must now consider the effect of 
faults in the data input vector on the check output vector. The primary
reason for choosing separable codes is to simplify the analysis and
design of the circuit. If faults are allowed to cause errors in both
the data output vector and the check output vector, then in order to
ensure that the circuit is fault-secure, we must consider the data circuit
and the check circuit together.
Up to this point, only the function performed by the data circuit 
needed to be considered. We were able to ignore the details of the 
check circuit because errors were not allowed to occur in both the data 
output vector and the check output vector. If simultaneous errors were 
allowed in both the data output vector and the check output vector, then 
the functions performed by the data circuit and the check circuit must 
be considered when finding codes that satisfy the fault-secure property. 
However, the function that the check circuit performs depends on the 
code selected. In this case, a code has to be assumed, and then, the 
data circuit and check circuit together as a unit may be tested to 
determine if the entire circuit satisfies the fault-secure property. 
The additional analysis required by this process negates any advantage 
that separable codes have over non-separable codes in terms of ease of 
analysis. In addition, the code used and the function performed by the 
preceding circuit, which produces the inputs for this circuit, must
also be considered when evaluating the fault-secureness of this circuit. 
This requirement serves to complicate the design process further. For 
the sake of simplicity, we assume that simultaneous incorrect data and 
check vectors imply a de jure violation of the fault-secure property. 
This assumption will be referred to as the disjoint error assumption.
In many cases method B is not applicable. It is important to know 
whether or not method C is universally applicable so that it may be used 
when method B cannot be used.
Theorem 4: Method C may be used to provide a fault-secure imple­
mentation with respect to the simplified indeterminate fault 
model of the check portion of the circuit provided that the
 
Hamming distance between any two data input vectors in a compa­
tible set is at least 3.*
Proof: We prove this theorem by describing a method C implementation
that satisfies the theorem. Assume that the check portion
of the circuit has no static hazards. As we have already
discussed, any switching function has a static-hazard-free implementation.
Recall that added redundancy, if any, does not
jeopardize the fault-secure property. If method B is sufficient
to provide a fault-secure implementation, then clearly this
theorem is true (i.e., simply use the method B implementation
and have the data input vector ignored by the check generation
circuitry).
If method B is not sufficient, then there must be at least
one instance where the same check input vector is used by the
check circuit to calculate two different check output vectors.
In this case, the information contained in the data input vector
must be used to help calculate the check output vector. Let us
call any two such data input vectors D1 and D2. Let their
corresponding check input vector be Ci1 and their check output
vectors be Co1 and Co2, respectively. Since data input vectors
D1 and D2 have the same check vector Ci1, they must belong to
the same set of compatible data vectors. By the distance 3 restriction
of the theorem, D1 and D2 must differ in at least three
bits. Let S be the set of bit positions in the data input vector
which are different in D1 and D2. Since the minimum Hamming
distance is 3, set S must have at least 3 members.
* Under any circumstances, the Hamming distance between any two compatible
data vectors must be at least 2. Otherwise, a fault on an output
line could transform one member of a compatible set into another member
of the same set. This property violates the definition of a compatible
set.
When a fault exists, the output may be either the correct
codeword or a non-codeword. Precisely which non-codeword
is produced is not important. Therefore, as long as an incorrect
codeword is not produced, we may design the circuit to
behave in any manner we wish when an unused input occurs. Since
a single fault on the data input bits in S corresponds to unused
input vectors, we are free to assign these in any way we find
convenient as long as an incorrect codeword is not produced.
In order to ensure that the check generation circuit satisfies
the fault-secure property, the check circuit must be
designed so that if the correct check output vector is Co1, then
for any single bit change in D1 of a bit in set S, the check output
vector is still Co1. This requirement follows from the disjoint
error assumption. Likewise, if the correct check output
vector is Co2, then for any single change in D2 of a bit in set
S, the check output vector must remain Co2. The check function
may always be defined in this manner, since all pairs of data
input vectors in the same compatibility set have a Hamming distance
of at least 3.
Since the check circuit has no static hazards, when Ci1 and
either D1 or D2 is applied to the check circuit, no check output
bit is sensitized to any one of the bits in set S. Similarly,
given Ci1 and either D1 or D2 as inputs, no check output can be
sensitized to a data input bit which is not in set S, due to the
disjoint error assumption. The distance 3 restriction in the
theorem statement assures that such a circuit is feasible.
Therefore, any fault on a single data input bit results in
the correct check output vector and either the correct or the
incorrect data output vector. If the data output vector is
correct, then the correct codeword is produced. If an incorrect
data output vector is produced, then it is not compatible with
the correct check output vector. Therefore, the circuit is
fault-secure.
Theorem 4 shows that method C may always be used provided that the
minimum Hamming distance in any compatible set is at least 3. In the
proof, it is required that the check circuit be designed so that in
those cases where there is insufficient information in the check input
vector to calculate the check output vector, a single fault on one of the
data input bits would not change the check output vector. By making
this requirement, we are ensuring that a fault on one of the data input
bits does not cause the check output vector to be incorrect. If we do
not design the check circuit in this manner, then a fault on one of the
data input bits may cause both the data and check vectors to be
incorrect. This in turn may lead to an incorrect codeword output and
thus a violation of the fault-secure property.
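The distance condition of Theorem 4 is easy to check for a proposed code. The Python sketch below computes the minimum Hamming distance inside each compatible set of data input vectors and tests it against the bound of 3. The representation of a code as a list of lists of integers, and the function names, are assumptions made for this example.

```python
from itertools import combinations


def hamming(a, b):
    """Hamming distance between two vectors given as integers."""
    return bin(a ^ b).count("1")


def min_distance(vectors):
    """Smallest pairwise distance in a set, or None for fewer than two members."""
    return min((hamming(a, b) for a, b in combinations(vectors, 2)), default=None)


def satisfies_theorem4(compatible_input_sets):
    """True if every compatible set of data input vectors has distance >= 3."""
    distances = (min_distance(s) for s in compatible_input_sets)
    return all(d is None or d >= 3 for d in distances)


print(satisfies_theorem4([[0b000, 0b111], [0b001, 0b110]]))  # distance 3 in each set
print(satisfies_theorem4([[0b00, 0b11], [0b01, 0b10]]))      # parity sets, distance 2
```

Sets with a single member, as in functional duplication, satisfy the condition trivially. The parity sets {0,3} and {1,2} have distance 2 and therefore do not meet the bound.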
In order to satisfy the fault-secure property under the disjoint 
error assumption, the check circuitry must be designed so that no single 
fault on one of the data input bits changes the check output vector. 
Unfortunately, this type of check circuit creates a testability problem 
if the desire is to implement a circuit which satisfies the totally 
self-checking goal.
Theorem 5: If a circuit cannot be implemented using method B, 
then no implementation using method C satisfies the totally 
self-checking goal with respect to the simplified indeterminate 
fault model.
Proof: If the checker circuit is implemented so that a single 
fault on one of the data input bits may change the check output 
vector, then a single failure on one of the data input lines can 
result in an incorrect data output and check output vector. By 
the disjoint error assumption, the circuit is not fault-secure. 
Therefore, the circuit cannot satisfy the totally self-checking 
goal.
Consider an implementation using method C where no single
bit input fault alters the check output vector. We now prove
the theorem by constructing a sequence of faults on the data
input bits for which the implementation violates the self-checking
goal. Since the data portion of the circuit may contain
redundancy, we consider faults only on irredundant data input
bits, i.e., each such fault will affect the data output vector
for some data input vector. Let the first fault in the sequence
occur on one of the data input bits after they have been
 
fanned out so that the fault only affects the check circuit.
This fault is undetectable. Let the next fault occur on
another data input bit. If this fault causes the check output
vector to change, then let the fault occur before the data input
bits are fanned out so that it affects both the data and check
portions of the circuit. Otherwise, let the fault occur after
the data input bit is fanned out so that it affects only the
check portion of the circuit and is therefore undetectable.
Continue this process until a fault finally causes an incorrect
check output vector. Note that such a fault must eventually be
encountered since otherwise all data input bits would be redundant
in the checker portion of the circuit and we would have a
method B implementation contrary to the theorem hypothesis. We
now have a sequence of undetectable faults followed by a data
input bit fault that alters the check output vector. This last
fault must also alter the data output vector for some choice of
input vector. It therefore causes an incorrect data output vector
and an incorrect check output vector. From the disjoint error
assumption, the circuit is not fault-secure for this sequence
of faults. Therefore, the circuit cannot satisfy the totally
self-checking goal.
From this discussion, several conclusions can be drawn. Method A
must be used if the circuit receives unencoded inputs. Method A, however,
does not protect against failures that occur on data inputs. When
the circuit receives encoded inputs, method B is the method of choice.
Method B is relatively simple to implement and, when feasible, always
provides a fault-secure implementation. Unfortunately, in some cases,
there is not enough information in the check input vector to compute
the check output vector. In such cases, method B cannot be used.
Method C usually requires more logic than either method A or method B.
Method C provides a fault-secure implementation provided that the
minimum distance of all compatible data input sets is at least 3.
Unfortunately, we have shown in Theorem 5 that if method B is not
feasible for a given function and input encoding, then no method C
implementation can satisfy the totally self-checking goal. Therefore,
if the desire is to construct circuits which satisfy the totally self-checking
goal, then only method B merits further consideration.
In Theorem 4, we required that the Hamming distance between any two 
data input vectors be at least 3. This requirement is actually more 
restrictive than necessary. In particular, if a compatible set of data 
input vectors all produce data outputs which are all in the same set of 
compatible output vectors, then it is unnecessary to use the data input 
vector to calculate the check output vectors. In this case, the check 
input vector implies the check output vector. Consequently, for this 
check input vector, none of the check output bits is a function of any 
of the data input bits. If the check circuit has no static hazards, 
then none of the check output bits is sensitized to any of the data 
input bits. Therefore, it is only necessary that those compatible data
input vectors which may produce data output vectors in different compatible
output sets have a minimum Hamming distance greater than 2.
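The relaxed condition can be expressed as a pairwise test. The Python sketch below is one possible reading of it: two data input vectors in the same compatible input set need distance greater than 2 only when their data outputs fall in different compatible output sets. The function `relaxed_condition`, the example function, and the use of parity to label output sets are assumptions made for illustration.

```python
from itertools import combinations


def hamming(a, b):
    """Hamming distance between two vectors given as integers."""
    return bin(a ^ b).count("1")


def parity(v):
    """Label of the compatible output set (here: one parity bit)."""
    return bin(v).count("1") & 1


def relaxed_condition(input_sets, f, output_class):
    """Distance > 2 is required only for pairs mapped to different output sets."""
    for s in input_sets:
        for a, b in combinations(s, 2):
            different_sets = output_class(f(a)) != output_class(f(b))
            if different_sets and hamming(a, b) <= 2:
                return False
    return True


input_sets = [[0b00, 0b11], [0b01, 0b10]]   # distance 2 inside each set
same = relaxed_condition(input_sets, lambda x: x, parity)
split = relaxed_condition(input_sets, lambda x: x & 1, parity)
print(same, split)
```

In the first call every input set maps into a single output set, so the distance of 2 is acceptable. In the second call the function separates members of the same input set into different output sets, and the condition fails.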
Figure 4.12 demonstrates three different methods of generating the 
check output vector. A fourth method exists where both the check output 
and data output vectors are calculated using both the check input and 
the data input vectors. This method is not considered since it violates 
the spirit of a separable implementation. One of the advantages of a 
separable implementation is that the data portion of the circuit is 
unchanged by the coding function. If the data output vector were computed
from both the data input and the check input vectors, the data
portion of the circuit would be changed.
4.4. CED Under a General Single-Failure Indeterminate Fault Model
The simplified indeterminate fault model is adequate for describing
failures that only affect a single line. Unfortunately, the simplified
indeterminate fault model fails to take into account the behavior of
bridging failures. For this reason, we propose a new fault model which
includes bridging failures.
4.4.1. Fault Model Assumptions and Properties
The general single-failure indeterminate fault model assumes that
any physical failure that causes a short between two nodes causes the 
value on the two nodes to become ternary u values. Any physical failure 
which affects a single node causes the value on the node to become a 
ternary u value.
* Only bridging failures between two nodes are considered. The probability
that a single failure causes more than 2 lines to become shorted
is quite low.
 
Clearly, the general single-failure indeterminate fault model and 
the simplified indeterminate fault model are identical for physical 
failures that affect only a single node. The difference is that the 
general single-failure indeterminate fault model is also able to model 
failures which cause two nodes to become shorted together. We assume 
that a bridging failure always causes both nodes to assume a ternary u 
value. It can be argued that if both lines have the same Boolean value, 
the short has no effect. In most cases, this is true. For some types 
of circuits which are very sensitive to changes in circuit parameters 
(i.e., certain classes of dynamic circuits), a short between two nodes 
may definitely affect circuit operation, even when they would have the 
same Boolean value under no fault. For other classes of circuits it is 
also possible to make the assumption that both nodes assume a u value 
only when the nodes have different Boolean logic values under no 
failure. In this case, any time both lines have the same value, we 
still must consider the effect of single faults at each node.
Most of the theorems and procedures which were developed in Section 
4.3 for the simplified indeterminate fault model have an analog for the 
general single-failure indeterminate fault model. When considering a 
theorem for the general single-failure indeterminate fault model, which 
is analogous to a theorem we have considered for the simplified 
indeterminate fault model, we use a * after the theorem's number to indicate 
that the theorem applies to the general single-failure indeterminate 
fault model.
When more than one variable of a logic function may be an 
indeterminate value, the Boolean difference is no longer satisfactory for 
determining whether an output is sensitized to u values on several 
inputs. Consider:
$X = (x_1, \ldots, x_p, x_{p+1}, \ldots, x_n)$
where X represents an input vector to some combinational function f. If 
function f has no p-variable logic hazards, then the output of f is 
sensitized to ternary u values on $(x_1, \ldots, x_p)$ if and only if there 
exist both 1's and 0's specified for f within the $2^p$ cells of the 
sub-cube $(x_{p+1}, \ldots, x_n)$. When a p-variable logic hazard exists, 
then the output of f is sensitized, even if the $2^p$ cells of the sub-cube 
$(x_{p+1}, \ldots, x_n)$ are specified as all 1's or all 0's.
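The sub-cube test can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes a hazard-free implementation, so that sensitization can be read directly from the truth table, and the names ternary_output and majority are hypothetical and do not appear in this report.

```python
from itertools import product


def ternary_output(f, vector, u_positions):
    """Ternary output of f when the inputs at u_positions are indeterminate.

    f maps a tuple of 0/1 inputs to a tuple of 0/1 outputs. An output bit is
    'u' if it takes both values over the 2**p cells of the sub-cube spanned
    by the p indeterminate inputs; otherwise it keeps its constant value.
    Assumes no p-variable logic hazards.
    """
    outputs = []
    for cell in product((0, 1), repeat=len(u_positions)):
        v = list(vector)
        for pos, bit in zip(u_positions, cell):
            v[pos] = bit
        outputs.append(f(tuple(v)))
    # zip(*outputs) groups the values seen on each output bit.
    return "".join(
        "u" if len(set(column)) == 2 else str(column[0])
        for column in zip(*outputs)
    )


def majority(v):
    """Hypothetical 3-input example function with a single output bit."""
    a, b, c = v
    return (1 if a + b + c >= 2 else 0,)


# Inputs 0 and 2 indeterminate, input 1 held at 1: the sub-cube holds
# both 0's and 1's, so the output is sensitized.
print(ternary_output(majority, (0, 1, 0), (0, 2)))  # u
# Only input 0 indeterminate with the others at 0: output stays 0.
print(ternary_output(majority, (0, 0, 0), (0,)))    # 0
```

When a p-variable logic hazard is present, this truth-table test is not sufficient, since the output must then be treated as sensitized even for a constant sub-cube.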
In the general single-failure indeterminate fault model, we assume 
that any two nodes in the circuit can be shorted together. In practice, 
only lines which are in close proximity to one another can become 
shorted. Unfortunately, unless the circuit layout is available, there 
is no way of knowing which lines are near each other. For this reason, 
we assume that with one exception, any line in the circuit may become 
shorted to any other line in the circuit. The one exception concerns 
shorts between the data and check portions of the circuit. We assume 
that the circuit is designed so that no shorts can occur between the 
check circuit and data circuit. Presumably, a design rule can be 
specified so that if two lines are separated by some distance, no short can 
occur between the two nodes. This restriction insures that no single 
short will cause an error to occur in both the data and check output
vectors. In many cases, a circuit layout is such that inputs and 
outputs are on opposite sides of the circuit. If this is the case, the 
probability of a short between an input node and an output node is very 
low. We assume that input-output shorts may occur. If enough 
information is known about the layout, it may be desirable to assume that 
input-output shorts do not occur. All of the results of this section 
may be easily modified if desired for the assumption that input-output 
shorts do not occur.
We are now ready to begin reconsideration of the theorems which we 
have already developed for the simplified indeterminate fault model.
Hypothesis*: Indeterminate values at a pair of nodes are the most 
general model for a bridging fault.
Theorem 2*: For any switching function, an implementation exists 
in which all failures allowed by the general single-failure 
indeterminate fault model may be modeled as ternary u values on 
one or two input lines, ternary u values on one or two output 
lines, or ternary u values on a single input line and a single 
output line.
Proof: If one of the two u values behaves as the correct logic 
value, then this situation is equivalent to a single u value on 
an input or output. In the proof of Theorem 2, we showed that a 
single u value on an input or output could model all failures in 
the simplified indeterminate fault model. Therefore, we only 
need to consider bridging failures. Assume the function is 
implemented in the form of Figure 4.6, i.e., no output bits share 
logic. Clearly, any failure which causes a short between two 
input lines may be modeled as a pair of indeterminate values on
the two shorted input lines. A short between two nodes within 
the logic for one output bit only affects that output. This 
condition may be represented as a single ternary u value on the 
affected output. Any short that occurs between two nodes 
associated with two distinct output bits can at most affect the two 
output bits. Thus, such faults may be modeled as a pair of 
ternary u values on these outputs. A short that occurs between an 
input node and an output node may be modeled as a ternary 
u value on the affected input and a ternary u value on the 
affected output. Therefore, for this implementation, all failures 
allowed by the general single-failure indeterminate fault model 
may be modeled as ternary u values on at most two input lines, 
two output lines, or one of each.
Just as was the case for the simplified indeterminate fault model, 
Theorem 2* significantly reduces the number of faults which must be 
considered for implementations of the form of Figure 4.6. If shorts 
between an input node and an output node are not being considered, then 
only pairs of ternary u values on input nodes and pairs of ternary u 
values on output nodes need to be considered.
Theorem 3*: Functional duplication provides an implementation 
which satisfies the totally self-checking goal with respect to 
the general single-failure indeterminate fault model for any 
switching function.
Proof: Based on the assumption that the circuit can be designed 
such that no node in the data portion of the circuit can be
shorted to a node in the check portion of the circuit, the proof 
is identical to the proof for Theorem 3.
Corollary 1*: If a functional duplication implementation 
contains no redundant logic (when only the input code space may be 
applied), then it is totally self-checking.
Proof: The proof is identical to the proof for Corollary 1.
4.4.2. Economical Implementations for the General Indeterminate Fault 
Model
Once again, we are now left with the question of when, if ever, an 
implementation exists which is cheaper than functional duplication. The 
procedure that we developed for the simplified indeterminate fault model 
is directly applicable to the general single-failure indeterminate fault 
model. The only difference is that we must consider faults on a pair of 
input and output lines rather than single faults.
As an example of searching for a more economical code, consider a 
four-input, three-output function. The inputs consist of two 2-bit 
numbers, X = x1x0 and Y = y1y0. The output S = s2s1s0 is the sum of X 
and Y. Figure 4.13 shows the truth table for the function and the 
result of failures on all pairs of inputs. It is assumed the function 
is implemented without any 2-variable logic hazards so that the 
sensitized bits may be determined from the truth table. By only considering 
faults on the inputs, we have the situation where any of the correct 
output vectors, except 0, can be transformed to any other output vector. 
When 0 is the correct output vector, then any output vector may result
  FUNCTION         |  S OUTPUT FOR INDETERMINATE BRIDGING FAULT IN
x1 x0 y1 y0 |  S   | x1x0 | x1y1 | x1y0 | x0y1 | x0y0 | y1y0
------------+------+------+------+------+------+------+------
 0  0  0  0 | 000  | 0uu  | uu0  | 0uu  | 0uu  | 0uu  | 0uu
 0  0  0  1 | 001  | uuu  | uu1  | 0uu  | uuu  | 0uu  | 0uu
 0  0  1  0 | 010  | uuu  | uu0  | uuu  | 0uu  | uuu  | 0uu
 0  0  1  1 | 011  | uuu  | uu1  | uuu  | uuu  | uuu  | 0uu
 0  1  0  0 | 001  | 0uu  | uu1  | uuu  | 0uu  | 0uu  | uuu
 0  1  0  1 | 010  | uuu  | uu0  | uuu  | uuu  | 0uu  | uuu
 0  1  1  0 | 011  | uuu  | uu1  | uuu  | 0uu  | uuu  | uuu
 0  1  1  1 | 100  | uuu  | uu0  | uuu  | uuu  | uuu  | uuu
 1  0  0  0 | 010  | 0uu  | uu0  | 0uu  | uuu  | uuu  | uuu
 1  0  0  1 | 011  | uuu  | uu1  | 0uu  | uuu  | uuu  | uuu
 1  0  1  0 | 100  | uuu  | uu0  | uuu  | uuu  | 1uu  | uuu
 1  0  1  1 | 101  | uuu  | uu1  | uuu  | uuu  | 1uu  | uuu
 1  1  0  0 | 011  | 0uu  | uu1  | uuu  | uuu  | uuu  | uuu
 1  1  0  1 | 100  | uuu  | uu0  | uuu  | uuu  | uuu  | uuu
 1  1  1  0 | 101  | uuu  | uu1  | uuu  | uuu  | 1uu  | uuu
 1  1  1  1 | 110  | uuu  | uu0  | uuu  | uuu  | 1uu  | uuu
Figure 4.13 Two-Bit Adder Example
except 5 and 7. Since output vector 7 is never a legal output vector, 
it may be ignored. When output faults are considered, it is possible 
for the correct output vector 0 to be transformed into output vector 5. 
Therefore, any of the correct output vectors can be transformed by a 
modeled failure into any of the other legal output vectors. Clearly, 
functional duplication is the cheapest code for this example if the 
general single-failure indeterminate fault model is used.
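The statements about input-pair faults in the two-bit adder can be checked with a short Python sketch. It is only an illustration under the stated assumption of no 2-variable logic hazards; the helper names adder, ternary, expand and reachable_by_input_pairs are hypothetical and are not taken from this report.

```python
from itertools import combinations, product


def adder(v):
    """Two-bit adder: inputs (x1, x0, y1, y0), outputs (s2, s1, s0)."""
    x1, x0, y1, y0 = v
    s = (2 * x1 + x0) + (2 * y1 + y0)
    return ((s >> 2) & 1, (s >> 1) & 1, s & 1)


def ternary(f, vector, u_positions):
    """Ternary output string when the listed inputs carry u values."""
    outputs = []
    for cell in product((0, 1), repeat=len(u_positions)):
        v = list(vector)
        for pos, bit in zip(u_positions, cell):
            v[pos] = bit
        outputs.append(f(tuple(v)))
    return "".join(
        "u" if len(set(col)) == 2 else str(col[0]) for col in zip(*outputs)
    )


def expand(tern):
    """All Boolean vectors (as integers) covered by a ternary vector."""
    choices = [(0, 1) if c == "u" else (int(c),) for c in tern]
    return {int("".join(map(str, bits)), 2) for bits in product(*choices)}


def reachable_by_input_pairs(vector):
    """Output vectors that may appear for a u pair on any two inputs."""
    reach = set()
    for pair in combinations(range(4), 2):
        reach |= expand(ternary(adder, vector, pair))
    return reach


# Correct output 0 (inputs 0000): every vector except 5 and 7 may appear.
print(sorted(reachable_by_input_pairs((0, 0, 0, 0))))  # [0, 1, 2, 3, 4, 6]
```

For every other input vector the same function returns all eight output vectors, which agrees with the observation that any non-zero correct output can be transformed into any other output vector.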
Figure 4.14 shows the behavior of the outputs under input-output 
shorts. In order to consider the effects of input-output shorts, it is 
necessary to consider a ternary u value on one input node and one output 
node simultaneously. The procedure of Section 4.3.3 considers the 
effect of ternary u values on all single input nodes. The first four 
columns of Figure 4.14 show the effect of faults on the input nodes. If 
an indeterminate fault simultaneously occurs on an output node, then the 
resulting output vector may be altered in at most one additional bit 
position. Therefore, the output vector resulting from a ternary u value 
at both an input node and an output node, as shown in the output map of 
Figure 4.14, is either:
(1) the correct output vector
(2) one of the incorrect output vectors that can result from a 
fault on an input node
(3) other output vectors which are a Hamming distance of 1 from 
one of the output vectors in (1) or (2).
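The three classes above can be computed mechanically for the two-bit adder. The following Python sketch is illustrative only: it assumes a hazard-free implementation, and the names adder, ternary, expand and output_map are hypothetical helpers rather than anything defined in this report.

```python
from itertools import product


def adder(v):
    """Two-bit adder: inputs (x1, x0, y1, y0), outputs (s2, s1, s0)."""
    x1, x0, y1, y0 = v
    s = (2 * x1 + x0) + (2 * y1 + y0)
    return ((s >> 2) & 1, (s >> 1) & 1, s & 1)


def ternary(f, vector, u_positions):
    """Ternary output string when the listed inputs carry u values."""
    outputs = []
    for cell in product((0, 1), repeat=len(u_positions)):
        v = list(vector)
        for pos, bit in zip(u_positions, cell):
            v[pos] = bit
        outputs.append(f(tuple(v)))
    return "".join(
        "u" if len(set(col)) == 2 else str(col[0]) for col in zip(*outputs)
    )


def expand(tern):
    """All Boolean vectors (as integers) covered by a ternary vector."""
    choices = [(0, 1) if c == "u" else (int(c),) for c in tern]
    return {int("".join(map(str, bits)), 2) for bits in product(*choices)}


def output_map(vector):
    """Return classes (2) and (3) for one input vector of the adder."""
    correct = expand(ternary(adder, vector, ()))          # class (1)
    from_input = set()
    for pos in range(4):                                  # single input u
        from_input |= expand(ternary(adder, vector, (pos,)))
    from_input -= correct                                 # class (2)
    base = correct | from_input
    # One additional output bit may be altered: Hamming distance 1.
    neighbours = {v ^ (1 << bit) for v in base for bit in range(3)}
    return sorted(from_input), sorted(neighbours - base)  # (2), (3)


print(output_map((0, 0, 0, 0)))  # ([1, 2], [3, 4, 5, 6])
print(output_map((1, 0, 1, 1)))  # ([1, 3, 4, 6, 7], [0, 2])
```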
The procedure for finding codes that are more economical than 
functional duplication and that have implementations that are fault-secure
                   |      OUTPUT FOR          |
  FUNCTION         |  INDETERMINATE FAULT IN  |        OUTPUT MAP
x1 x0 y1 y0 |  S   |  x1  |  x0  |  y1  |  y0  | (1) | (2)           | (3)
------------+------+------+------+------+------+-----+---------------+---------
 0  0  0  0 | 000  | 0u0  | 00u  | 0u0  | 00u  |  0  | 1,2           | 3,4,5,6
 0  0  0  1 | 001  | 0u1  | 0uu  | 0u1  | 00u  |  1  | 0,2,3         | 4,5,6,7
 0  0  1  0 | 010  | uu0  | 01u  | 0u0  | 01u  |  2  | 0,3,4,6       | 1,5,7
 0  0  1  1 | 011  | uu1  | uuu  | 0u1  | 01u  |  3  | 0,1,2,4,5,6,7 |
 0  1  0  0 | 001  | 0u1  | 00u  | 0u1  | 0uu  |  1  | 0,2,3         | 4,5,6,7
 0  1  0  1 | 010  | uu0  | 0uu  | uu0  | 0uu  |  2  | 0,1,3,4,6     | 5,7
 0  1  1  0 | 011  | uu1  | 01u  | 0u1  | uuu  |  3  | 0,1,2,4,5,6,7 |
 0  1  1  1 | 100  | 1u0  | uuu  | uu0  | uuu  |  4  | 0,1,2,3,5,6,7 |
 1  0  0  0 | 010  | 0u0  | 01u  | uu0  | 01u  |  2  | 0,3,4,6       | 1,5,7
 1  0  0  1 | 011  | 0u1  | uuu  | uu1  | 01u  |  3  | 0,1,2,4,5,6,7 |
 1  0  1  0 | 100  | uu0  | 10u  | uu0  | 10u  |  4  | 0,2,5,6       | 1,3,7
 1  0  1  1 | 101  | uu1  | 1uu  | uu1  | 10u  |  5  | 1,3,4,6,7     | 0,2
 1  1  0  0 | 011  | 0u1  | 01u  | uu1  | uuu  |  3  | 0,1,2,4,5,6,7 |
 1  1  0  1 | 100  | uu0  | uuu  | 1u0  | uuu  |  4  | 0,1,2,3,5,6,7 |
 1  1  1  0 | 101  | uu1  | 10u  | uu1  | 1uu  |  5  | 1,3,4,6,7     | 0,2
 1  1  1  1 | 110  | 1u0  | 1uu  | 1u0  | 1uu  |  6  | 4,5,7         | 0,1,2,3
Figure 4.14 Behavior of Input-Output Faults in Two-Bit Adder
with respect to the general single-failure indeterminate fault model may 
now be summarized as follows:
(1) Construct a truth table for the desired switching function.
This function is implemented by the data portion of the circuit.
(2) For each possible data input vector, determine the possible 
incorrect data output vectors that may result from a fault on a 
pair of inputs.
(3) Summarize the results from step 2 to obtain a list of each 
correct data output vector and the incorrect data output vectors 
that may result from a pair of input faults.
(4) Update the list from step 3 to include the effects of faults 
on a pair of output lines.
(5) Update the list from step 4 to include the effects of faults 
on an input line and an output line simultaneously.
(6) Determine the minimum number of sets of compatible data 
output vectors so that each output vector is included in exactly 
one set.
(7) The minimum number of check bits is the smallest integer 
which is greater than or equal to the log2 of the minimum number 
of compatible sets.
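Steps (6) and (7) can be sketched in Python for small functions. The sketch rests on an assumption that is not restated in this section: two data output vectors are taken to be compatible when neither can be transformed into the other by a modeled failure, so that they may share a check symbol. Under that assumption step (6) is a minimum colouring of the conflict graph, found here by exhaustive search. The names min_compatible_sets and check_bits are hypothetical.

```python
def min_compatible_sets(transforms):
    """Partition output vectors into the fewest compatible sets.

    transforms maps each legal correct output vector to the set of
    incorrect vectors it may become (the list produced by steps 2 to 5).
    Targets that are not legal output vectors are ignored.
    Exhaustive search: only practical for small functions.
    """
    vectors = sorted(transforms)
    conflict = {v: set() for v in vectors}
    for a, targets in transforms.items():
        for b in targets:
            if b in conflict and b != a:
                conflict[a].add(b)
                conflict[b].add(a)

    def colour(i, k, assignment):
        # Try to place vectors[i:] into k sets without conflicts.
        if i == len(vectors):
            return True
        v = vectors[i]
        for c in range(k):
            if all(assignment.get(n) != c for n in conflict[v]):
                assignment[v] = c
                if colour(i + 1, k, assignment):
                    return True
                del assignment[v]
        return False

    for k in range(1, len(vectors) + 1):
        assignment = {}
        if colour(0, k, assignment):
            sets = [[v for v in vectors if assignment[v] == c] for c in range(k)]
            return [s for s in sets if s]
    return []


def check_bits(num_sets):
    """Smallest integer greater than or equal to log2(num_sets)."""
    return (max(num_sets, 1) - 1).bit_length()


# Two-bit adder: each legal output 0..6 may become any other vector.
adder_transforms = {v: set(range(8)) - {v} for v in range(7)}
sets = min_compatible_sets(adder_transforms)
print(len(sets), check_bits(len(sets)))  # 7 3
```

For the adder this gives seven singleton sets and three check bits, the same number of bits as the data output, which is consistent with functional duplication being the cheapest code for that example.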
So far, we assumed that any two nodes in the data portion of the 
circuit may become shorted together. If it is known a priori that two 
particular inputs cannot become shorted, then this fault need not be 
considered when determining the effects of faults on output vectors. 
Likewise, if it is known that two outputs (and the logic which computes 
these outputs) cannot be shorted together or that an input node cannot 
be shorted to an output node, then these faults do not have to be
considered either. It is only necessary to consider the effects of 
faults which may actually occur.
This procedure is based on the assumption that any short between 
two nodes results in a ternary u value on both nodes regardless of what 
the original logic values of the shorted nodes would be under no fault. 
For some types of circuits, particularly static circuits, this 
assumption is overly pessimistic. For such circuits, a more reasonable 
assumption is that a bridging failure between two nodes causes a 
ternary u value at the nodes only if the original values at the nodes are 
different. The procedure for finding more economical codes can easily 
be modified to work with this assumption. The only difference is that 
if the two nodes have the same value, then the short has no effect on 
circuit operation. In those cases where the failed nodes have different 
values, the above procedure is unchanged. In the remaining cases where 
the nodes have the same value, the effect of a single ternary u value on 
each of the two nodes individually must be considered (i.e., the effect 
of a single ternary u value needs to be considered for each input vector 
only for nodes whose bridging faults have no effect).
4.4.3. Check Vector Generation
The general single-failure indeterminate fault model presents the 
same problems for check vector generation as in the simplified 
indeterminate fault model. The three methods presented in Figure 4.12 are 
still possible candidates for generating the check vector. Method A is 
the method to use when the circuit receives unencoded input data. 
Method B is the method to use when there is enough information in the
check input vector to calculate the check output vector. When the check 
input vector contains insufficient information, method C must be used.
Theorem 4*: Method C may always be used to provide a 
fault-secure implementation of the check portion of the circuit with 
respect to the general single-failure indeterminate fault model 
provided that the Hamming distance between any two data input 
vectors in a compatible set is at least 5.
Proof: The proof is identical to the proof for Theorem 4 except 
that the check circuit must be specified so that no pair of 
faults on the data input lines causes the check output vector to 
change. This requirement can always be met when the minimum 
Hamming distance between any two data input vectors in a 
compatible set is at least 5.
Theorem 5*: If a circuit cannot be implemented using method B, 
then no implementation using method C satisfies the totally 
self-checking goal with respect to the general single-failure 
indeterminate fault model.
Proof: The proof is identical to the proof for Theorem 5 except 
that we must consider a pair of faults on data input lines.
4.5. Checker Requirements
It was stated in Section 4.2.1 that a totally self-checking checker 
must be both totally self-checking and code disjoint. As pointed out by 
Smith [5], it is not actually necessary for the checker to satisfy the 
fault-secure property. A checker which is self-testing and code 
disjoint also operates satisfactorily. The fault-secure property is not 
necessary since what is important is whether the output from the circuit
being checked is a codeword or a non-codeword. If a circuit is totally 
self-checking and code disjoint, then as long as the checker is 
operating properly, it will always produce a non-codeword output if the output 
from the circuit being checked is a non-codeword. When the checker 
fails, then the totally self-checking property insures that there is 
some test to detect the failure. As long as all modeled failures are 
testable, it is not necessary for the checker output vector to be the 
correct codeword output under all possible faults and checker input 
vectors. Therefore, the checker does not need to be fault-secure.
Checkers for indeterminate faults are much easier to design if they 
do not have to satisfy the fault-secure property. We have assumed that 
checkers are unable to detect indeterminate values. Therefore, a 
checker cannot be code disjoint with respect to indeterminate failures. 
Checkers should, however, be code disjoint with respect to vectors which 
contain only Boolean values. If the input to the checker is a 
potential codeword, then the precise response of the checker becomes 
non-deterministic.
In our design methodology, checker input vectors come from the 
outputs of the flip-flops which separate blocks of combinational logic. If 
a failure occurs inside the block of logic, then the flip-flops should 
with very high probability have a legal logic value output. The 
probability that more than one flip-flop passes an indeterminate input 
through to its output should thus be negligible with respect to the 
probability that some multiple or other unmodeled failure occurs. However, 
we do need to be concerned about the checker receiving an indeterminate
value, for example if one of the flip-flops should fail. Thus a checker 
may experience three different conditions when a failure occurs: the
checker may receive a non-codeword with only Boolean values, the checker 
may receive a potential codeword that has exactly one indeterminate 
value (two indeterminate values due to a short if the single-failure 
indeterminate fault model is used), or the checker may receive the 
correct codeword. The first case should be detected by the checker, the 
next case is compatible with the requirements for concurrent error 
detection in the next block of combinational logic, and the last case 
involves no error.
If the checker is code disjoint with respect to Boolean values and 
self-testing with respect to indeterminate faults occurring within 
itself, it is able to respond appropriately to all three of the 
situations that may occur. Note that if one of the checker's inputs is
indeterminate, then the failure may or may not be detected. If it is 
not detected, the next block of combinational logic which accepts the 
output of this circuit as its input, may receive one (two if the general 
single-failure indeterminate fault model is used) incorrect input bits.
Any checker which is acceptable for the single stuck-at fault model 
is also acceptable for the simplified indeterminate fault model. Since 
the checker is self-testing with respect to single stuck-at faults it 
must also be self-testing with respect to simplified indeterminate fault 
model faults. The checker is also code disjoint for all vectors which 
only contain Boolean values. This fact implies that any non-codeword 
input vector results in a non-codeword output vector. If the input 
vector is a potential codeword, then at least one output must be sensitized 
to all of the input vector bits. If we consider all possible Boolean 
vectors that can be constructed by replacing indeterminate values in the 
potential codeword by Boolean values, exactly one of these is a codeword 
(the potential codeword's corresponding codeword). If the potential 
codeword applied to the checker's input is the corresponding codeword, 
then the checker output vector must be a codeword. When the other 
Boolean vectors (which are non-codewords) are applied, then the checker 
output vector must be a non-codeword. Therefore, at least one output 
bit of the checker must be sensitized to the checker input vector bits. 
Therefore, if the checker is adequate for single stuck-at faults, it is 
also adequate for the simplified indeterminate fault model. This result 
is quite important since a variety of checkers for different codes have 
been designed under the single stuck-at fault assumption. Techniques 
for designing checkers are discussed in [5].
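The argument about potential codewords can be illustrated with a small Python sketch. The code and the checker used here are assumptions chosen only for illustration: a 4-bit even-parity code and a two-output parity checker whose outputs differ for codewords and agree for non-codewords. Neither is taken from this report, and the function names are hypothetical.

```python
from itertools import product


def is_codeword(word):
    """Hypothetical code: 4-bit words with even parity."""
    return sum(word) % 2 == 0


def checker(word):
    """Hypothetical parity checker with two outputs (f, g).

    Codewords give f != g; Boolean non-codewords give f == g.
    """
    a, b, c, d = word
    return (a ^ b, 1 - (c ^ d))


def completions(potential):
    """All Boolean vectors obtained by replacing each u with 0 or 1."""
    choices = [(0, 1) if s == "u" else (int(s),) for s in potential]
    return list(product(*choices))


def ternary_checker_output(potential):
    """Checker output bits, with 'u' where the completions disagree."""
    outs = [checker(w) for w in completions(potential)]
    return "".join(
        "u" if len(set(col)) == 2 else str(col[0]) for col in zip(*outs)
    )


potential = "u101"  # one indeterminate bit on the checker input
words = completions(potential)
print(sum(is_codeword(w) for w in words))  # 1
print(ternary_checker_output(potential))   # u0
```

Exactly one completion is a codeword, and the first checker output is sensitized to the indeterminate input, so the checker response depends on how the u value resolves.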
Checker design is more complicated for the general single-failure 
indeterminate fault model. A line of reasoning similar to that used for 
the simplified indeterminate fault model may be used to show that a 
checker which is adequate for a double stuck-at fault model is also 
adequate for the general single-failure indeterminate fault model. 
Unfortunately, checker designs for a double stuck-at fault model are not 
well-known. One possible solution is to use duplicate checkers that are 
designed for single stuck-at faults. The duplicate checkers would be 
placed so as to prevent a short between nodes in two distinct checkers. 
In addition, the checker inputs should be buffered before going to each 
checker so that a bridging fault in one of the checkers will not affect
the other checker through the checker input lines. With such an 
approach, the outputs of at most one checker will be erroneous.
Consider a totally self-checking system constructed from a number 
of smaller totally self-checking modules. If a checker is placed at the 
output of each module, then instead of having one set of encoded lines 
which indicate the presence of an error, there are several sets of error 
indication lines (one set from each of the checkers). It is possible to 
use one global checker which checks the outputs of all the other 
checkers. The output of the global checker produces one set of error 
indication lines which indicate if an error has occurred anywhere in the 
system.
If a failure occurs in one of the flip-flops which separates two 
blocks of combinational logic, it is possible for a simple global 
checker scheme to fail. In particular, the checker which receives the 
output of the failed flip-flop may have an indeterminate value input 
(two indeterminate value inputs if the general single-failure 
indeterminate fault model is used). The next combinational block which 
receives the output of the failed flip-flop may also receive an 
indeterminate value input (two indeterminate value inputs if the general 
single-failure indeterminate fault model is used) and in response, 
produce a non-codeword output. If this situation occurs, the global 
checker may receive both an indeterminate value (from the checker of the 
first block of logic) and a non-codeword input (from the checker of the 
second block of logic). We assume as above, that there is a negligible 
probability that an indeterminate input can propagate through an entire
logic block and its output flip-flops. If the second logic block 
produces a Boolean-valued non-codeword, then the global checker must 
indicate an error if the system is to operate appropriately.
However, because of the presence of an indeterminate value output 
from the first checker, it is possible for a codeword to be produced by 
the global checker due to the fact that the indeterminate value can 
propagate through one or more stages of the global checker. In such a 
case, it is possible that no error would be indicated by the global 
checker even though the second block checker output indicates an error. 
To make this possibility extremely unlikely, flip-flops should separate 
the outputs of the block checkers from the inputs of the global 
checkers. In this manner, any indeterminate values produced by one of the 
block checkers should become legal logic values before they are 
presented to the global checker.
CHAPTER 5 
Conclusion
5.1. Evaluation of Fault Model
Two models for concurrent error detection are defined in Chapter 4: 
the simplified indeterminate fault model and the general single failure 
indeterminate fault model. These fault models are based on 
indeterminate-type faults. We are now in a position to evaluate these 
fault models by comparing them to the traditional fault models using the 
criteria proposed in Chapter 1.
5.1.1. Fault Model Accuracy
The indeterminate-type fault is based on the analyses of Chapters 2
and 3. These analyses showed that when MOS logic circuits fail, they may
produce outputs that are not legal logic values. Traditional fault 
models rely on faults that may be represented using Boolean algebra 
(i.e., stuck-at faults, wired logic faults, etc.). Unfortunately, these 
traditional models are not able to represent many of the types of 
anomalous behavior that we have discussed in Chapter 3.
Historically, faults and tests for faults have been divided into 
two broad classes: logical (or static) and parametric (or dynamic). 
Logical faults are defined by Breuer and Friedman [64] as those faults 
that change the logical behavior of some element or signal. Parametric
faults are considered to be those faults that cannot be modeled as
logical faults. There is quite a bit of ambiguity in such a classifica­
tion of faults. Beh et al. [65] define static quality as
the occurrence of defects that if present would most certainly 
cause a circuit failure in all systems applications if exer­
cised.
Dynamic quality is defined as
the occurrence of defects that if present may possibly cause a 
circuit failure in some or all system applications if exercised.
Perhaps the most reasonable definition is that logical (static) faults 
alter DC behavior while dynamic (parametric) faults alter behavior of 
the circuits at higher clock rates. From such definitions, it is hard 
to state definitively that a specific failure results in one type of 
fault or the other. In fact, it is difficult to state that a given
behavior should be classified as a logical (static) or parametric
(dynamic) fault. Clearly, most of the traditional fault models are
intended to address logical fault types.
When concurrent error detection is incorporated in a system, the 
goal is to detect errors when they occur. Whether a fault is logical or 
parametric is of little concern to the end user who has paid a substan­
tial premium for the concurrent error detection capability. To the end 
user, it is only important that the system detect errors in a timely 
fashion.
Many of the traditional fault models are special cases of the 
indeterminate fault models of Chapter 4. The single stuck-at fault 
model and stuck-open fault model are special cases of both the simpli­
fied indeterminate fault model and the general single failure
indeterminate fault model. In addition, the bridging fault model is a
special case of the general single failure indeterminate fault model.
The only traditional fault models that are not covered by one of the
indeterminate fault models of Chapter 4 are the unidirectional fault
model and the unidirectional error fault model. The unidirectional
fault models are intended to cover two distinct types of failures:
failures of certain global signal lines, and device and line failures.
The first type of failure is usually catastrophic, such as the com­
plete failure of an integrated circuit's power line. If an entire 
integrated circuit loses its power, all outputs drift rather quickly to 
a logic 0 and remain at a logic 0 until power is restored. Such a 
failure clearly results in both a unidirectional fault and a unidirec­
tional error. If a ground line fails instead of a power line, it is 
more difficult to predict precisely how the integrated circuit outputs 
respond. If the integrated circuit is static NMOS, then the outputs 
certainly would all become logic 1's. If the integrated circuit is
CMOS, then probably one or more of the circuit's outputs would be 
indeterminate. Therefore, for a global failure of the ground lines in 
CMOS logic, the unidirectional fault model is of questionable validity. 
The indeterminate fault models are unable to model the effects of a glo­
bal signal failure. If it is desirable to protect against such global 
power and ground failures (or any other failure causing a unidirectional 
error), then a two-rail implementation may always be used. A two-rail 
implementation consists of the original circuit that becomes the data 
portion of the circuit and the Boolean dual [56] of the original circuit 
that forms the check portion of the circuit. Such a two-rail
implementation always exists and, furthermore, it satisfies the
functional duplication property. Since a two-rail code is unordered, it
may be used to detect the occurrence of any unidirectional error. In
some cases, codes more economical than functional duplication may also
be unordered. In this situation, the more economical code also detects
all unidirectional errors. If it is only necessary to detect the
situation where a power or ground failure causes all outputs to become
all logic 0's or all logic 1's, then it is only necessary that the all
0's output vector and all 1's output vector not be legal codewords. This
requirement is significantly less restrictive than requiring an
unordered code. By carefully assigning the check vectors to the data
vectors, it is always possible to make the all 0's output vector and the
all 1's output vector be non-codewords as long as the check output
vector contains more than one bit.
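The unordered-code argument can be checked mechanically for small word
sizes. The following is a minimal sketch and is not taken from the
report: it assumes that a two-rail codeword is a data word followed by
its bitwise complement, and all helper names (covers, is_unordered,
two_rail_code, unidirectional_errors, and so on) are hypothetical.

```python
from itertools import product

def covers(a, b):
    """Return True if word a has a 1 in every position where b has a 1."""
    return all(x >= y for x, y in zip(a, b))

def is_unordered(code):
    """A code is unordered if no codeword covers a different codeword."""
    return not any(a != b and covers(a, b) for a in code for b in code)

def two_rail_code(n):
    """Assumed encoding: n data bits followed by their bitwise complement."""
    return {d + tuple(1 - x for x in d) for d in product((0, 1), repeat=n)}

def unidirectional_errors(word):
    """Yield every word obtained by forcing a non-empty set of bits of
    `word` all to 0 or all to 1 (only bits that actually change count)."""
    for forced in (0, 1):
        changeable = [i for i, bit in enumerate(word) if bit != forced]
        for mask in range(1, 2 ** len(changeable)):
            corrupted = list(word)
            for k, i in enumerate(changeable):
                if (mask >> k) & 1:
                    corrupted[i] = forced
            yield tuple(corrupted)

def detects_all_unidirectional(code):
    """True if no unidirectional error turns one codeword into another."""
    return all(e not in code for w in code for e in unidirectional_errors(w))

def excludes_constant_vectors(code):
    """Weaker requirement: the all-0's and all-1's vectors are not codewords."""
    n = len(next(iter(code)))
    return (0,) * n not in code and (1,) * n not in code

two_rail = two_rail_code(3)
# Even-parity code on two data bits: an ordered code used for contrast.
even_parity = {(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)}
```

Under these assumptions the two-rail code passes all three checks, while
the even-parity code fails them because, for example, the codeword 011
can decay unidirectionally into the codeword 000.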
The second type of failure that unidirectional fault models are
intended to cover is the single failure of a device or line. Usually,
this is done for structured elements. For instance, Banerjee [4] shows
that under certain restrictions, failures in a PLA or decoder result in
a unidirectional error at the device's output. From the hypotheses of
Chapter 4, any such failures are modeled by indeterminate faults.
Therefore, all traditional fault models, except the unidirectional
fault model and the unidirectional error fault model, are special cases
of the indeterminate fault models. The indeterminate fault models are
also applicable to unidirectional errors caused by the failure of a
single line or device. In addition, many of the codes that are derived
using the indeterminate fault models also protect against all unidirec­
tional errors including those caused by global power and ground 
failures. As mentioned in Chapter 1, Smith [5] has shown that the uni­
directional fault model requires that an implementation be built with 
noninverting gates. This makes the unidirectional fault model useless 
for most MOS circuits.
With the possible exception of some types of unidirectional faults,
all traditional fault models are merely special cases of our indeter­
minate fault models. Therefore, the indeterminate fault models should 
be more accurate. In addition to the logical type of faults modeled by 
the traditional models, the indeterminate fault models are also able to 
account for parametric-type faults that are beyond the ability of tradi­
tional models to describe. These parametric faults include timing 
failures and oscillations. The biggest limitation of the indeterminate 
fault models is their inability to model the behavior of multiple device 
failures. In many cases, the behavior of such multiple failures will 
map into one of the modeled faults. If a functional duplication code is 
used, then the circuit is protected against an arbitrary number of 
failures of any type as long as these failures only affect either the 
data portion or check portion of the circuit, but not both
simultaneously.
5.1.2. Ease of Analysis
The second criterion discussed in Chapter 1 is ease of analysis.
The indeterminate fault model is a very easy fault model to work with. 
This is primarily due to the fact that the fault model is comprehensive
for many types of physical failures. The general single failure
indeterminate fault model accurately represents the behavior of all
failures modeled by the simplified indeterminate fault model as well as
shorts between nodes.
Therefore, these fault models do not need to be combined with other
fault models to account for the behavior of all single failures
accurately. The traditional fault models often must be combined with
other fault models in order to cover certain types of physical failures.
If an implementation in the form of Figure 4.6 is acceptable, then
only faults on circuit inputs or outputs need to be considered for the
indeterminate fault models. With the traditional fault models, it is
generally necessary to consider faults on all nodes of the circuit, not
just inputs and outputs. Typically, a circuit has many more nodes than
inputs and outputs. Therefore, the number of faults that must be
considered is greatly reduced. It is true that for many of the
traditional fault models, especially the stuck-at fault model, many
faults are indistinguishable from other faults. However, even after
collapsing the fault model, there are usually significantly more
stuck-at faults that must be considered than simplified indeterminate
faults.
The difficulty in using the indeterminate fault models lies in the
fact that a ternary algebra must be used rather than Boolean algebra.
Fortunately, the rules of ternary algebra are very similar to the rules
of Boolean algebra. Furthermore, when the inputs to a ternary function
are restricted to 0 and 1 values, the function's behavior may be
described using Boolean algebra. Thus, perhaps with the exception of
unfamiliarity, ternary algebra is no more difficult to use than Boolean 
algebra.
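The similarity between the two algebras can be illustrated with a short
sketch. It is only an illustration under stated assumptions: it uses the
common three-valued extension of AND, OR, and NOT (minimum, maximum, and
complement), which may differ in notation from the definitions of
Chapter 4, and it encodes the indeterminate value as 0.5. The names X,
t_not, t_and, t_or, and majority are hypothetical.

```python
from itertools import product

X = 0.5  # assumed encoding of the indeterminate value, between 0 and 1

def t_not(a):
    """Ternary NOT: 0 -> 1, 1 -> 0, X -> X."""
    return 1 - a

def t_and(a, b):
    """Ternary AND: a 0 input forces 0; otherwise any X input gives X."""
    return min(a, b)

def t_or(a, b):
    """Ternary OR: a 1 input forces 1; otherwise any X input gives X."""
    return max(a, b)

def majority(a, b, c):
    """Three-input majority built only from the ternary gate operators."""
    return t_or(t_or(t_and(a, b), t_and(b, c)), t_and(a, c))

def boolean_majority(a, b, c):
    """Reference Boolean majority function for 0/1 inputs."""
    return int(a + b + c >= 2)

# On Boolean inputs the ternary evaluation agrees with Boolean algebra.
agrees_on_boolean_inputs = all(
    majority(a, b, c) == boolean_majority(a, b, c)
    for a, b, c in product((0, 1), repeat=3)
)

masked = majority(1, X, 1)      # the two legal inputs decide the output
propagated = majority(1, X, 0)  # the output depends on the indeterminate input
```

In this sketch an indeterminate input is masked when the remaining
inputs already determine the output, and it propagates otherwise.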
In general, the indeterminate fault models should provide good ease 
of analysis. Both indeterminate fault models are comprehensive models 
and only require that a limited number of faults be considered. 
Although ternary algebra is required in order to analyze circuits with 
indeterminate faults, this should provide no real difficulty.
5.1.3. Cost of Fault Tolerance
The cost of fault tolerance for any fault model is highly dependent 
on the target system. Some systems naturally lend themselves more 
readily to concurrent error detection than others. Furthermore, there 
are a variety of costs involved in utilizing any fault-tolerance scheme. 
Such costs include: power cost, size cost, speed cost, and most impor­
tantly, monetary cost. Clearly, a variety of tradeoffs exist between 
each of these costs. Usually one is most concerned with the tradeoff 
between monetary cost and the other types of costs.
When attempting to implement a concurrent error detection scheme, 
one is faced with two basic choices: whether to implement the entire 
system as one single totally self-checking circuit or to divide the
system into several smaller totally self-checking circuits that are
interconnected to perform the same function.
A variety of tradeoffs are involved in this decision. All other 
things being equal, the smaller the blocks of logic checked by a 
checker, the better the logic block's observability, and hence, the
easier the block is to test. Therefore, breaking the system into
several smaller circuits is generally advantageous in regard to
increasing the system's testability. It is also usually easier to
analyze small circuits as opposed to large circuits. Thus, it is usually
easier to find totally self-checking implementations if the system is
broken into a number of smaller parts. It is not always obvious how to
partition a system into a number of smaller parts in order to maximize
testability and minimize the difficulty in finding a totally
self-checking implementation of the system. In general, it is often
desirable to partition the system into its functional parts such as
adders, register banks, busses, etc. Such a partition usually allows an
efficient implementation.
An alternative to partitioning is to implement the entire system as 
one totally self-checking circuit. In general, this approach results in 
poorer testability, possibly more logic (and thus higher power consump­
tion), and larger system size. However, for large and very large scale 
integrated circuits, this has an important advantage. By duplicating 
standard off-the-shelf circuits, totally self-checking circuits can be 
built quite cheaply. Due to the high development cost and relatively 
low manufacturing cost, the price of a very large scale integrated
circuit is a strong function of the number of identical circuits
manufactured [34]. Typically, the demand for systems with concurrent error
detection is smaller than the demand for the same or a similar system
without concurrent error detection. If custom integrated circuits must 
be designed, then the monetary cost of a system with concurrent error
detection is much greater than the monetary cost of a similar system
built by duplication with off-the-shelf parts. Even if a custom 
integrated circuit must be designed, duplication still simplifies the
design process since very little analysis is required. Unless power 
consumption and/or size is an overriding consideration, any time a
very large scale integrated circuit already exists that performs the 
desired function, the best way to gain concurrent error detection is 
simply to use two copies of the existing integrated circuit.
Intel's iAPX 432 family [66] uses this approach so that the same 
set of integrated circuits may be used for those applications that 
require concurrent error detection and those applications that do not 
require concurrent error detection (i.e., those where the benefits of 
concurrent error detection are outweighed by its cost). Each of the
integrated circuits in the iAPX 432 family is designed so that each
output pin may also serve as an equality checker. One pin is devoted to
"programming" the chip to be a master (circuit operates normally), or a 
checker. All pins on the master and checker integrated circuits except 
the programming pin and the error indication pin are wired together. 
Any discrepancy between the logical values of the integrated circuits'
outputs is indicated by the checker circuit. The only errors that are
not detected by this scheme are those caused by failures of certain
global signal lines. Many of the issues involved in protecting against global
signal failures are discussed in [67].
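The master/checker arrangement can be sketched as a toy model. This is
an illustration only and does not describe Intel's actual pin-level
interface: the 8-bit adder stands in for an arbitrary duplicated
circuit, a stuck-at-1 output pin stands in for a fault, and the names
chip and master_checker_pair are hypothetical.

```python
def chip(a, b, stuck_high_pin=None):
    """Hypothetical 8-bit adder standing in for one copy of the circuit.
    `stuck_high_pin` optionally forces one output bit to 1 (a modeled fault)."""
    out = (a + b) & 0xFF
    if stuck_high_pin is not None:
        out |= 1 << stuck_high_pin
    return out

def master_checker_pair(a, b, master_fault=None, checker_fault=None):
    """Return (value driven by the master, error flag raised by the checker)."""
    driven = chip(a, b, master_fault)      # master drives the shared pins
    expected = chip(a, b, checker_fault)   # checker computes the same function
    return driven, driven != expected      # any differing pin raises the flag

fault_free = master_checker_pair(3, 4)
master_faulty = master_checker_pair(3, 4, master_fault=7)
# The same fault in both copies is a crude stand-in for a common-mode
# failure such as the loss of a shared global signal line.
common_mode = master_checker_pair(3, 4, master_fault=7, checker_fault=7)
```

In this model a fault confined to one copy raises the error flag, while
an identical fault in both copies goes unnoticed, which mirrors the
exception noted above for global signal lines.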
In almost all cases, the cost of fault tolerance with an indeter­
minate fault model will be greater than or equal to the cost using a 
traditional fault model. This is due to the fact that except for the
unidirectional fault model (which is not applicable to logic constructed
from inverting circuits) and in some cases the unidirectional error
model, all the traditional fault models are only special cases of the
indeterminate fault models. Any implementation that provides concurrent
error detection for indeterminate fault models will also provide
concurrent error detection for the traditional fault models (except the
unidirectional fault models). Therefore, the cost of fault tolerance
with the traditional fault models will always be less than or equal to
the cost of fault tolerance for the indeterminate fault models.
As we have mentioned above, duplication (whether at the intra- 
integrated circuit level or the inter-integrated circuit level) has many 
advantages, especially for systems that are produced in low numbers. In 
many cases, duplication will be used regardless of the fault model. 
Therefore, as a practical matter, the cost of fault tolerance in most 
cases is roughly the same whether indeterminate fault models are used or 
one of the traditional fault models is used.
5.2. Summary
In Chapter 2, typical physical failure models are reviewed. Three 
broad classes of physical failures are considered: interconnect 
failures, transistor failures, and radiation-induced soft failures. 
Interconnect failures result in shorts and opens in the lines that link 
the transistors. Transistor failures are caused by a shift in device 
parameters and device breakdown. Radiation-induced soft failures are 
transient, non-recurring upsets of a node or nodes in the circuit caused 
by high energy radiation generating charge carriers in the integrated
circuit. Circuits become more susceptible to all three of these types 
of failures as devices are scaled.
In Chapter 3, the effects of these failures on integrated circuits 
are studied. It is found that nearly all of these failures may be 
modeled as resistive shorts or opens in a circuit. Models are developed 
for static NMOS, static CMOS, and dynamic NMOS inverters. These models 
are used to predict the behavior of inverters under failure. It is 
shown that when physical failures occur, the logic levels of the
inverter output may degrade, the inverter switching speed may decrease,
and, under some circumstances, the inverter output may oscillate.
Integrated circuits are also constantly exposed to the effects of random 
noise which may cause soft failures, similar in nature to radiation- 
induced soft failures. Thus, the analysis of Chapter 3 shows that phy­
sical failures, in general, may cause the output of a failed circuit to 
assume a value that is logically undefined.
The behavior of good circuits with logically undefined inputs is 
examined. It is also shown that a flip-flop may undergo metastable 
operation when its inputs are undefined logic values. When a flip-flop 
is in a metastable state, its outputs are generally illegal logic 
values. Since clocked flip-flops are commonly used to separate blocks 
of combinational logic, the effect of circuit parameters on the proba­
bility of entering a metastable state and average length of metastable 
operation is studied. It is shown that high gain and high bandwidth are 
important to minimize the effects of metastable operation.
 
In Chapter 4, concurrent error detection for errors caused by
physical failures is discussed. Indeterminate faults are used to represent
the undefined logic values that may occur due to physical failures. It 
is shown that indeterminate faults may also be used to represent the 
behavior of any failure that forces a single node of the circuit to a 
legal logic value. A ternary algebra is used to describe the behavior 
of logic gates with indeterminate fault inputs. By using the ternary 
algebra, it is shown that static hazards and p-variable logic hazards
will sensitize an output to an indeterminate fault, even when the output
is not a function of the faulted node.
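A small ternary simulation illustrates how a static hazard can expose an
output to an indeterminate fault on a node that the output does not
logically depend on. This is a sketch under assumptions rather than the
construction used in Chapter 4: the indeterminate value is encoded as
0.5, the gate operators are the usual minimum/maximum/complement
extension, and the example function f = a*b + (not a)*b is chosen here
for illustration. It does not cover p-variable logic hazards.

```python
X = 0.5  # assumed encoding of the indeterminate value

def t_not(a):
    return 1 - a

def t_and(a, b):
    return min(a, b)

def t_or(a, b):
    return max(a, b)

def redundant_form(a, b):
    """f = a*b + (not a)*b. In Boolean algebra this equals b, but the
    two-level realization has a static hazard on a when b = 1."""
    return t_or(t_and(a, b), t_and(t_not(a), b))

def hazard_free_form(a, b):
    """The same Boolean function realized without using node a at all."""
    return b

# Both realizations compute the same Boolean function.
boolean_equivalent = all(
    redundant_form(a, b) == hazard_free_form(a, b)
    for a in (0, 1) for b in (0, 1)
)

with_hazard = redundant_form(X, 1)       # indeterminate fault on node a
without_hazard = hazard_free_form(X, 1)  # same fault, hazard-free circuit
```

Under these assumptions the hazardous realization produces an
indeterminate output for an indeterminate a even though f does not
depend on a, while the hazard-free realization still produces a 1.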
The traditional definitions for fault-secure, self-testing, and
totally self-checking are discussed. It is shown that due to the
non-deterministic behavior of indeterminate faults, these definitions
are inappropriate for systems that are subject to physical failures that
may cause indeterminate faults. New definitions of fault-secure,
self-testing, totally self-checking, and strongly fault-secure are given
that are compatible with indeterminate faults.
Two fault models are introduced that are based on indeterminate 
faults. The concept of functional duplication is introduced. It is 
shown that a functional duplication implementation that satisfies the
totally self-checking goal exists for any switching function.
Procedures are also discussed for each of the fault models to find any
codes that may exist for a function that are less costly than functional 
duplication. The problem of generating the check output vectors when
the circuit in question is part of a larger totally self-checking system 
is also examined.
5.3. Suggestions for Future Research
One of the major detriments to systems that are totally self­
checking with respect to the indeterminate fault model is the testing 
problem. Further research into methods that generate efficient and 
effective tests for indeterminate faults is necessary in order to 
improve the concurrent error detection capabilities of such systems. As 
mentioned in Chapter 4, testability techniques for intermittent failures
appear to be a very promising foundation for developing such techniques
for indeterminate faults.
It would be desirable to extend the research of Chapter 4 to cover 
a broader range of possible circuits and implementations. Since sequen­
tial networks are such an important class of circuits, it is imperative 
to study them explicitly and develop the requirements for providing them 
with concurrent error detection capability. Non-separable codes should 
also be examined to determine if such codes might provide more economi­
cal implementations of certain functions than separable codes.
The algorithms presented in Chapter 4 to search for codes more 
economical than functional duplication are straightforward to apply. 
Unfortunately, for functions with a large number of inputs and outputs, 
these procedures may become quite unwieldy to apply. For this reason, 
new search algorithms should be developed to find such codes more effi­
ciently.
 
One of the major difficulties in using the general single-failure
indeterminate fault model is the problem of designing appropriate
checkers. Therefore, designs of checkers for the general single-failure
indeterminate fault model should be studied.
References
[1] T. W. Williams and K. P. Parker, "Design for testability - a
survey," Proc. IEEE, vol. 71, no. 1, Jan. 1983, pp. 98-112.
[2] S. R. McConnel, D. P. Siewiorek, and M. M. Tsao, "The measurement
and analysis of transient errors in digital computer systems,"
IEEE Int. Symp. Fault-Tolerant Computing, Los Angeles,
1979, pp. 67-70.
[3] J. H. Patel and L. Y. Fung, "Concurrent error detection in
ALU's by recomputing with shifted operands," IEEE Trans.
Computers, vol. C-31, no. 7, July 1982, pp. 589-595.
[4] P. Banerjee, "A model for simulating physical failures in MOS
VLSI circuits," Report CSG-13, Coordinated Science Laboratory,
University of Illinois, Urbana, IL, 1982.
[5] J. E. Smith, "The design of totally self-checking combinational
circuits," Report R-737, Coordinated Science Laboratory,
University of Illinois, Urbana, IL, 1976.
[6] D. F. Barbe, Very Large Scale Integration (VLSI) Fundamentals
and Applications. Berlin: Springer-Verlag, 1980.
[7] S. Vaidya, D. B. Fraser, and A. K. Shinha, "Electromigration
resistance of fine-line Al for VLSI applications," Proc. Int.
Reliability Physics, 1980, pp. 165-170.
[8] P. A. Gargini, C. Tseng, and M. H. Woods, "Elimination of
silicon electromigration in contacts by the use of an interposed
barrier metal," Proc. Int. Reliability Physics, 1982, pp. 66-76.
[9] G. DiGiacomo, "Metal migration (Ag, Cu, Pb) in encapsulated
modules and time-to-fail model as a function of the environment
and package properties," Proc. Int. Reliability Physics, 1982,
pp. 27-33.
[10] J. R. Lloyd, G. S. Hopper, and W. B. Roush, "In situ IR
observation of electromigration induced damage in heavily doped
polycrystalline silicon resistors," Proc. Int. Reliability
Physics, 1982, pp. 47-49.
[11] M. R. Polcari, J. R. Lloyd, and S. Cvikevich, "Electromigration
failure in heavily doped polycrystalline silicon," Proc. Int.
Reliability Physics, 1980, pp. 178-185.
[12] H. C. Potter and D. R. Reber, "A study of surface charge
induced inversion failure of junction isolated monolithic silicon
integrated circuits," Proc. Int. Reliability Physics, 1976, pp.
11-17.
[13] B. A. Unger, "Electrostatic discharge failures of semiconductor
devices," Proc. Int. Reliability Physics, 1981, pp. 193-199.
[14] A. R. Hart, J. Smyth, and Stan Gorski, "Predicting ESD related
reliability effects," Proc. Int. Reliability Physics, 1982, pp.
233-237.
[15] E. S. Anolick, "Screening of time-dependent dielectric
breakdowns," Proc. Int. Reliability Physics, 1982, pp. 238-243.
[16] B. Euzent, "Hot electron injection efficiency in IGFET
structures," Proc. Int. Reliability Physics, 1977, pp. 1-4.
[17] B. Eitan and D. Frohman-Bentchkowsky, "Hot-electron injection
into the oxide in n-channel MOS devices," IEEE Trans. Electron.
Devices, vol. ED-28, no. 3, pp. 328-340, March 1981.
[18] P. K. Chaudhari, "Leakage-induced hot carrier instability in
phosphorus-doped SiO2 gate IGFET devices," Proc. Int.
Reliability Physics, 1977, pp. 5-9.
[19] S. A. Abbas and R. C. Dockerty, "Hot electron induced
degradation of n-channel IGFETs," Proc. Int. Reliability Physics,
1976, pp. 38-41.
[20] M. Nojori and T. Ishihara, "Secondary slow trapping - a new
moisture induced instability phenomenon in scaled CMOS
devices," Proc. Int. Reliability Physics, 1982, pp. 113-121.
[21] J. P. Mitchell and D. K. Wilson, "Surface effects of radiation
on semiconductor devices," Bell System Technical Journal, vol.
XLVI, no. 1, Jan. 1967, pp. 1-80.
[22] E. H. Snow, A. S. Grove, and D. J. Fitzgerald, "Effects of
ionizing radiation on oxidized silicon surfaces and planar
devices," Proc. IEEE, vol. 55, no. 7, July 1967, pp. 1168-1185.
[23] I. N. Krishnan and T. M. Chen, "G-R noise and microscopic
defects in irradiated junction field effect transistors,"
Solid-State Electron., vol. 20, Nov. 1977, pp. 897-906.
[24] S. A. Abbas and E. E. Davidson, "Reliability implications of
hot electron generation and parasitic bipolar action in an
IGFET device," Proc. Int. Reliability Physics, 1976, pp. 18-22.
[25] R. R. Troutman and H. P. Zappe, "A transient analysis of
latch-up in bulk CMOS," IEEE Trans. Electron. Devices, vol. ED-30,
no. 2, Feb. 1983, pp. 170-179.
[26] C. M. Hsieh, P. C. Murley, and R. R. O'Brien, "Dynamics of
charge collection from alpha-particle tracks in integrated
circuits," Proc. Int. Reliability Physics, 1981, pp. 38-42.
[27] M. L. White, J. W. Serpiello, K. M. Striny, and W. Rosenzweig,
"The use of silicone RTV rubber for alpha particle protection
on silicon integrated circuits," Proc. Int. Reliability
Physics, 1981, pp. 43-47.
[28] J. Galiay, Y. Crouzet, and M. Vergniault, "Physical versus
logical fault models MOS LSI circuits: impact on their
testability," IEEE Trans. Computers, vol. C-29, no. 6, June 1980,
pp. 527-531.
[29] C. Mead and L. Conway, Introduction to VLSI systems. Reading:
Addison-Wesley Publishing, 1980.
[30] B. G. Streetman, Solid state electronic devices. Englewood 
Cliffs: Prentice-Hall, 1980.
[31] R. D. Davis, "Design and analysis of an NMOS operational
amplifier with depletion loads," Report R-857, Coordinated
Science Laboratory, University of Illinois, Urbana, IL, 1979.
[32] L. A. Glasser, "The analog behavior of digital integrated
circuits," Design Automation Conf., 1981, pp. 603-612.
[33] L. P. J. Hoyte, "Automated calculation of device sizes for
digital IC designs," M.S. Thesis, Massachusetts Institute of
Technology, Cambridge, MA, 1982.
[34] S. Muroga, VLSI system design. New York: John Wiley and Sons,
1982.
[35] D. J. Hamilton and W. G. Howard, Basic integrated circuit
engineering. Reading: Addison-Wesley, 1979.
[36] L. Strauss, Wave generation and shaping. New York:
McGraw-Hill, 1970.
[37] M. Karpovsky and S. Y. H. Su, "Detection and location of input
and feedback bridging faults among input and output lines,"
IEEE Trans. Computers, vol. C-29, no. 6, June 1980, pp.
523-527.
[38] T. Yamada and T. Nanya, "Comments on 'Detection and location of
input and feedback bridging faults among input and output
lines'," IEEE Trans. Computers, vol. C-32, no. 5, May 1983, pp.
511-512.
[39] R. E. Ziemer and W. H. Tranter, Principles of communications.
Boston: Houghton Mifflin, 1976.
[40] J. T. Wallmark, "Noise spikes in digital VLSI circuits," IEEE
Trans. Electron. Devices, vol. ED-29, no. 3, March 1982, pp.
451-458.
[41] S. H. Unger, "Asynchronous sequential switching circuits with 
unrestricted input changes," IEEE Trans. Computers, vol. C-20, 
no. 12, Dec. 1971, pp. 1437-1444.
[42] L. R. Marino, "The effect of asynchronous inputs on sequential 
network reliability," IEEE Trans. Computers, vol. C-26, no. 11, 
Nov. 1977, pp. 1082-1090.
[43] B. I. Strom, "Proof of the equivalent realizability of a
time-bound arbiter and a runt-free inertial delay," Sixth Annual
Symp. Computer Architecture, 1979, pp. 178-181.
[44] L. R. Marino, "General theory of metastable operation," IEEE 
Trans. Computers, vol. C-30, no. 2, Feb. 1981, pp. 107-115.
[45] T. J. Chaney, S. M. Ornstein, and W. M. Littlefield, "Beware
the synchronizer," IEEE Compcon, 1972, pp. 317-319.
[46] I. Catt, "Time loss through gating of asynchronous logic signal
pulses," IEEE Trans. Computers, vol. C-15, no. 2, Feb. 1966,
pp. 108-111.
[47] M. J. Stucki and J. R. Cox, "Synchronization strategies,"
Cal-Tech Conf. VLSI, 1979, pp. 375-393.
[48] P. A. Stoll, "How to avoid synchronization problems," VLSI
Design, Nov./Dec. 1982, pp. 56-59.
[49] T. J. Chaney and C. E. Molnar, "Anomalous behavior of
synchronizer and arbiter circuits," IEEE Trans. Computers, vol. C-22,
no. 4, April 1973, pp. 421-422.
[50] M. Pechoucek, "Anomalous response times of input
synchronizers," IEEE Trans. Computers, vol. C-25, no. 2, Feb. 1976, pp.
133-139.
[51] T. J. Chaney and F. U. Rosenberg, "Characterization and scaling 
of MOS flip flop performance in synchronizer applications," 
Cal-Tech Conf. VLSI, 1979, pp. 357-374.
[52] G. Lacroix, P. Marchegay, and G. Piel, "Comments on 'The 
Anomalous behavior of flip-flops in synchronizer circuits'," 
IEEE Trans. Computers, vol. C-31, no. 1, Jan. 1982, pp. 77-78.
[53] D. E. Muller, "Treatment of transition signals in electronic 
switching circuits by algebraic methods," IRE Trans. Electronic 
Computers, vol. EC-8, no. 3, Sept. 1959, p. 401.
[54] E. B. Eichelberger, "Hazard detection in combinational and 
sequential switching circuits," IBM Journal, vol. 9, no. 2, 
March 1965, pp. 90-99.
[55] D. B. Armstrong, "On finding a nearly minimal set of fault
detection tests for combinational logic nets," IEEE Trans. 
Electron. Computers, vol. EC-15, no. 1, Feb. 1966, pp. 66-73.
[56] Z. Kohavi, Switching and finite automata theory. New York:
McGraw-Hill, 1978.
[57] J. S. Jephson, R. P. McQuarrie, and R. E. Vogelsberg, "A
three-value computer design verification system," IBM Systems
Journal, vol. 8, no. 3, 1969, pp. 178-181.
[58] J. E. Smith and G. Metze, "Strongly fault secure logic
networks," IEEE Trans. Computers, vol. C-27, no. 6, June 1978, pp.
491-499.
[59] P. Duhamel and J. C. Rault, "Automatic test generation
techniques for analog circuits and systems: a review," IEEE Trans.
Circuits and Systems, vol. CAS-26, no. 7, July 1979, pp.
411-440.
[60] S. Kamal and C. V. Page, "Intermittent faults: a model and a
detection procedure," IEEE Trans. Computers, vol. C-23, no. 7,
July 1974, pp. 713-719.
[61] J. Savir, "Testing for single intermittent failures in
combinational circuits by maximizing the probability of fault
detection," Report 145, Center for Reliable Computing, Stanford
University, Palo Alto, CA, 1977.
[62] J. W. Beyers, L. J. Dohse, J. P. Fucetala, R. L. Kochis, C. G.
Lob, G. L. Taylor, and E. R. Zeller, "A 32-bit VLSI CPU chip,"
IEEE Journal Solid-State Circuits, vol. SC-16, no. 5, Oct.
1981, pp. 537-542.
[63] W. W. Lattin, J. A. Bayliss, D. L. Budde, J. R. Rattner, and W. 
S. Richardson, "A methodology for VLSI chip design," Lambda, 
Second Quarter 1981, pp. 34-44.
[64] M. A. Breuer and A. D. Friedman, Diagnosis and reliable design
of digital systems. Rockville: Computer Science Press, 1976.
[65] C. C. Beh, K. H. Arya, C. E. Radke, and K. E. Torku, "Do stuck
fault models reflect manufacturing defects?," Int. Test Conf., 
Philadelphia, 1982.
[66] R. Grappel and J. Hemenway, "Understand the newest processor to
avoid future shock," EDN, vol. 26, no. 9, April 29, 1981, pp.
129-136.
[67] R. M. Sedmak and H. L. Liebergot, "Fault tolerance of a general
purpose computer implemented by very large scale integration,"
IEEE Trans. Computers, vol. C-29, no. 6, June 1980, pp.
492-500.
Vita
Daniel Lee Halperin was born January 23, 1957 in Oak Ridge,
Tennessee. He graduated first in his class in the College of Engineering at
the University of Tennessee in 1978 with a B.S. in Electrical
Engineering. He received his M.S. and Ph.D. in Electrical Engineering in 1981
and 1984, respectively, from the University of Illinois. While at the
University of Tennessee, he was inducted into the Phi Eta Sigma, Eta
Kappa Nu, Tau Beta Pi, and Phi Kappa Phi honorary societies. While
pursuing his graduate studies at the University of Illinois, he was a
member of the Computer Systems Group of the Coordinated Science
Laboratory. He is currently employed by Hewlett-Packard as a member of the
engineering staff in the System Technology Operation at Fort Collins,
Colorado.
