Parallel-Concurrent Versus Concurrent Fault Simulation by Saab, Daniel Georges et al.
December 1987 UILU-ENG-87-2278
DAC-7
COORDINATED SCIENCE LABORATORY
College of Engineering
PARALLEL-CONCURRENT VERSUS CONCURRENT FAULT SIMULATION
Daniel G. Saab 
Joseph T. Rahmeh 
Ibrahim  N. Hajj
UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
Approved for Public Release. Distribution Unlimited.
REPORT DOCUMENTATION PAGE
11. TITLE: Parallel-Concurrent Versus Concurrent Fault Simulation
12. PERSONAL AUTHOR(S): Saab, Daniel G.; Rahmeh, Joseph T.; Hajj, Ibrahim N.
13a. TYPE OF REPORT: Technical
14. DATE OF REPORT: December 1987
15. PAGE COUNT: 43
18. SUBJECT TERMS: concurrent fault simulation, parallel fault simulation, 
fault grouping, switch-level models
Parallel-Concurrent Versus Concurrent Fault Simulation *
Daniel G. Saab, Joseph T. Rahmeh, and Ibrahim N. Hajj
Coordinated Science Laboratory 
and Department of Electrical and Computer Engineering 
University of Illinois at Urbana-Champaign 
1101 W. Springfield Avenue,
Urbana, Illinois 61801.
ABSTRACT
A fault simulation algorithm based on the partitioning of faults into groups, with the group 
size equal to the number of bits in the host computer word, is presented. The fault effects of a 
particular group are evaluated using parallel fault simulation techniques and propagated using 
concurrent fault simulation techniques. The speed of the algorithm depends on the circuit and on the 
fault-grouping criterion. Three static grouping criteria are examined and compared in terms of 
speed and memory requirements. A dynamic regrouping technique is developed and is shown to 
improve the performance of static grouping.
* This work was supported by the Semiconductor Research Corporation under Contract 86-12-109.
TABLE OF CONTENTS
1. Introduction
2. Symbolic Logic Representation of MOS Circuits and Fault Modeling
3. Parallel and Concurrent Fault Simulation
   3.1. Parallel fault simulation
   3.2. Concurrent fault simulation
4. Parallel-Concurrent Fault Simulation
   4.1. Introduction
   4.2. Parallel-Concurrent fault simulation algorithm
        4.2.1. Definitions and notations
        4.2.2. Algorithm
   4.3. Fault grouping criteria
   4.4. Fault regrouping
5. Implementation
   5.1. Examples
6. Discussion
7. References
LIST OF FIGURES
Fig. 2.1. A transistor circuit and its corresponding graph
Fig. 3.1. Data representation for parallel simulation
Fig. 4.1. A parallel-concurrent fault simulation algorithm example
Fig. 4.7. A circuit graph model; 1 is determined by the DFS algorithm
Fig. 5.1. Total fault effect list lengths versus number of tests for circuit A
Fig. 5.2. CPU time versus number of tests for circuit A
Fig. 5.3. Maximum memory usage versus number of tests for circuit A
Fig. 5.4. Number of faulty events versus number of tests for circuit A
Fig. 5.5. Number of faulty events versus number of tests for circuit B
Fig. 5.6. Maximum memory usage versus number of tests for circuit B
Fig. 5.7. CPU time versus number of tests for circuit B
Fig. 5.8. Total fault effect list lengths versus number of tests for circuit B
Fig. 5.9. Total fault list lengths with and without regrouping (circuit A)
Fig. 5.10. CPU time with and without regrouping (circuit A)
1. Introduction
Fault simulation is an integral part of logic circuit design. It involves the determination of 
the effects of physical failures on the behavior of digital circuits and the grading of the quality of 
test sets designed to detect the presence of such failures. A circuit model and a fault model map, 
respectively, the circuit and the failures that might occur in it into representations suitable for 
analysis and simulation. A gate-level representation is an example of a circuit model where the 
circuit components are idealized as Boolean gates, and the stuck-at representation is an example of a 
fault model where the failures are idealized as constant-value nodes in the circuit.
Fault simulation is used in test generation to determine the fault coverage of a test set. Given 
a circuit, a set of faults in the circuit, and a set of tests (or input stimuli), fault simulation 
is used to decide which faults will be detected by which tests. New test vectors are then generated 
to cover the undetected faults, and fault simulation is performed again to determine the 
fitness of the new vectors. The cycle of test generation and fault simulation is repeated until a 
satisfactory test set is obtained. In very large-scale integrated (VLSI) circuits, the set of faults is 
very large, which stresses the importance of fast and efficient fault simulators. This fact has led 
to the development of special-purpose architectures for fault simulation [1-3]. These accelerators 
are designed to take into account the concurrencies that exist in fault simulation. In fact, there are 
at least three types of concurrency in fault simulation: logic circuit concurrency, algorithm 
concurrency, and fault activity concurrency [2]. The first two types can be exploited through 
distributed processing and pipelined architectures. In this paper we will be concerned with the third 
type of concurrency and its implementation on a serial computer. The techniques may also be 
applicable in conjunction with distributed and parallel processing implementations.
Traditionally, circuits in fault simulation have been modeled at the logic gate level [1-2,4-9], 
and failures have been approximated using the classical stuck-at fault model. A number of 
algorithms have been developed for efficient gate-level fault simulation under the stuck-at fault 
model. Among these algorithms are the parallel [5], deductive [6], concurrent [7], and 
parallel-valued list [8-9] algorithms. However, it has been established that the stuck-at fault model 
is not sufficient to describe the effects of physical failures on the behavior of digital circuits 
[10-11]. This fact has spurred interest in performing fault simulation at the transistor level (as 
opposed to the Boolean gate level) and in using alternatives to the stuck-at fault model [12-18].
In this paper a method which combines parallel and concurrent techniques for fault simulation 
of Metal Oxide Semiconductor (MOS) transistor circuits is presented. The method uses 
switch-level transistor models [18] and utilizes the approach of fault modeling and logic expression 
generation described in [16]. Although the proposed method is similar in some respects to the 
parallel-valued list method described in [8-9], it differs in that the new method is 
applicable at the transistor level, while the method in [8-9] uses gate-level models. In addition, in 
[8-9] two different fault propagation techniques are used. The first technique handles the case 
when the fault-free logic state of a node changes by relying on set-union and set-intersection 
operations similar to those found in deductive fault simulation. As a result, each time there is a change 
in the fault-free circuit, the fault list on the node where the change occurred is dropped and a new 
list is constructed. The second technique applies when the change in the logic state does not involve 
the fault-free circuit but rather the faulty one; in this case the faulty circuit is evaluated in a way 
similar to concurrent fault simulation. The application of two different propagation techniques 
tends to reduce the efficiency of the algorithm. In our case a single propagation technique is used. 
Moreover, the speed and efficiency of the method described in [8-9], as well as of the one presented 
here, depend on the grouping of faults. However, the only grouping guideline given in [8-9] is based 
on the user's node numbering and amounts to random fault grouping. In this paper three different 
grouping criteria are investigated, namely vertical grouping, horizontal grouping, and random 
grouping, and their effects on the simulation speed and on the length of the propagated fault lists 
are studied. In addition, the horizontal regrouping criterion is shown to reduce the CPU time, the 
length of the fault effect lists, and the number of faulty events more than the other two.
The remainder of the paper is organized as follows. In section two the symbolic representation 
of MOS circuits is reviewed. In section three the parallel and the concurrent fault simulation 
techniques are outlined separately. In section four the parallel-concurrent technique is presented. 
Section five contains the implementation and performance details, and section six contains the 
conclusion.
2. Symbolic Logic Representation of MOS Circuits and Fault Modeling
In this section the representation of the circuit in terms of logic expressions is described. 
Circuits formed by the interconnection of MOS (both nMOS and pMOS) transistors are considered. A 
transistor is modeled as a three-node (source, drain, and gate) device. All transistors act as 
voltage-controlled switches which can be in one of three states: on, off, and undefined. The nodes 
of the circuit may assume one of three values: high (1), low (0), or undefined (X). An nMOS 
transistor is on when its gate is high, off when its gate is low, and undefined when its gate is 
undefined. A pMOS transistor assumes the opposite state of an nMOS transistor under the same 
gate state.
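The switch behavior described above can be sketched in a few lines. This is only an illustration: the function name `switch_state` and the string constants are hypothetical choices, not identifiers from the report.

```python
# Hypothetical helper illustrating the three-state switch model.
ON, OFF, UNDEFINED = "on", "off", "undefined"


def switch_state(kind, gate):
    """Return the state of a transistor.

    kind is "n" (nMOS) or "p" (pMOS); gate is 1 (high), 0 (low), or "X".
    """
    if gate == "X":
        # An undefined gate leaves the switch undefined for both types.
        return UNDEFINED
    # nMOS conducts on a high gate, pMOS on a low gate.
    conducting = (gate == 1) if kind == "n" else (gate == 0)
    return ON if conducting else OFF
```

For example, `switch_state("p", 0)` returns the "on" state, the opposite of what an nMOS transistor gives for the same gate value.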
All transistors are bidirectional elements (i.e., source and drain nodes are symmetrical). The 
circuit is partitioned into subcircuits, where a subcircuit is a maximal set of transistors and their 
source and drain nodes such that, if all the transistors in the subcircuit are conducting, there is at 
least one path (which has no power or input node as an inside vertex) between any two nodes in 
the subcircuit. Such subcircuits are also referred to as strongly connected or dc-connected 
components. Interactions between two subcircuits can occur only if a node in one is the gate of a 
transistor in the other. A labeled weighted graph $G_i(V_i, E_i)$ is constructed from subcircuit i as 
follows:
(1) Each node is represented by a vertex which is assigned a unique label. The vertex labels 
become the logic variables in the logic expressions representing the subcircuit. Hereafter, 
there will be no distinction between a vertex and its label. $V_i$ is the set of vertices.
(2) Each transistor is represented by an edge joining the vertices representing its source and 
drain nodes. $E_i$ is the set of edges.
(3) An edge representing a transistor is labeled $v(t_j)$ in the case of an nMOS transistor and 
$\bar{v}(t_j)$ in the case of a pMOS transistor, $v$ being the vertex representing the gate of that 
transistor and $t_j$ the name of the transistor. $t_j$ assumes one of three values, 0, 1, or N, 
indicating that transistor $t_j$ is stuck-open, stuck-on, or has no faults, respectively. $v(t_j)$ is 
defined as follows:
$$v(t_j) = \begin{cases} t_j & \text{if } t_j = 0 \text{ or } 1 \\ v & \text{otherwise} \end{cases}$$
(4) An extra edge is added to model a potential short between two lines or a potential open in a 
line. Edges corresponding to a short between two lines are labeled with $v_0(t_j)$, where $v_0$ is the 
vertex corresponding to Gnd. On the other hand, edges corresponding to open faults within 
lines are labeled with $v_1(t_j)$, where $v_1$ is the vertex corresponding to Vdd.
(5) The weight of an edge is the width-to-length ratio of the corresponding transistor, and it 
approximates the conductance of the transistor.
Definition: The path label of a simple path between vertices s and t is 
$L_{st} = L_1 \wedge L_2 \wedge \cdots \wedge L_e$, where e is the number of edges in the path and $L_i$ is the label of the 
ith edge in the path [19].
Definition: The path transmission expression $T_{st}$ between any two vertices s and t of the 
same subcircuit is $T_{st} = L_{st}^1 \vee L_{st}^2 \vee \cdots \vee L_{st}^p$, where p is the number of distinct simple paths 
between s and t and $L_{st}^i$ is the path label of the ith simple path. Note that if $T_{st} = 0$ then all paths 
between s and t are open; if $T_{st} = 1$, then there is at least one closed path connecting s to t.
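The two definitions can be illustrated with a minimal sketch that enumerates simple paths and combines edge labels, assuming every edge label has already been evaluated to 0 or 1. The function names and the tiny example graph are hypothetical, and the restriction that power and input nodes may not appear as inside vertices is not modeled here.

```python
def simple_paths(edges, s, t):
    """Yield each simple path from s to t as a list of edge indices.

    edges is a list of (u, v, label) tuples; edges are bidirectional.
    """
    def walk(node, visited, used):
        if node == t:
            yield list(used)
            return
        for idx, (u, v, _) in enumerate(edges):
            if u == node:
                nxt = v
            elif v == node:
                nxt = u
            else:
                continue
            if nxt in visited:
                continue  # a simple path never revisits a vertex
            visited.add(nxt)
            used.append(idx)
            yield from walk(nxt, visited, used)
            used.pop()
            visited.remove(nxt)

    yield from walk(s, {s}, [])


def path_transmission(edges, s, t):
    """OR over all simple paths of the AND of the edge labels on each path."""
    return int(any(all(edges[i][2] for i in path)
                   for path in simple_paths(edges, s, t)))


# Hypothetical example: two parallel transistors from P to n7, one from n7 to G.
example = [("P", "n7", 0), ("P", "n7", 1), ("n7", "G", 1)]
print(path_transmission(example, "P", "G"))
```

In the example, one of the two parallel edges conducts and so does the series edge, so at least one path is closed. A symbolic version would keep the labels as expressions instead of evaluating them up front.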
Figure 2.1 illustrates the above concepts. For example, the path transmission expression between 
node P and the ground is:
$$T_{v_p v_0} = [(v_b(t_2) \vee v_c(t_4)) \wedge v_a(t_3)] \vee [v_b(t_5) \wedge v_c(t_6)] \vee [(v_b(t_2) \vee v_c(t_4)) \wedge v_0(t_7) \wedge v_c(t_6)] \vee [v_b(t_5) \wedge v_c(t_6) \wedge v_0(t_7)] \quad (1)$$
Fig. 2.1. A transistor circuit and its corresponding graph.
Note that an extra branch $t_7$ has been added between nodes 7 and 8 to model a potential short 
between the nodes.
The method for generating logic expressions from transistor circuit descriptions is explained in 
[19] and [20] and is based on the switch-level transistor model proposed in [21]. The circuit is 
partitioned into an interconnection of subcircuits, and the nodes of each subcircuit are classified into 
different groups in the following descending order of strength: primary input nodes (including 
power and ground), and normal nodes, which are in turn classified into different strength groups.
Consider a normal node n in a subcircuit. To derive the logic expression associated with node 
n, first the nodes of the subcircuit are placed into strength sets $M_1 > M_2 > \cdots > M_r$, where $>$ 
denotes "stronger than" and where all the nodes of the same strength are placed in the same set. 
Then, the state of node n is computed by determining the interaction between n and the remaining 
nodes in the subcircuit, starting with the nodes in the strongest set, $M_1$. If a set of nodes has no 
effect on n, the next (weaker) set is considered; otherwise, the evaluation is completed without 
having to consider the effects of the remaining sets. If none of the sets has any effect on n, then 
n retains its state. The effect of the nodes in a set $M_i$ on n is computed as follows:
(1) Let $F_1 = [\alpha_1\ \alpha_2\ \cdots\ \alpha_p]^T$ be a vector of path transmission expressions, where p is the number 
of nodes in $M_i$ and $\alpha_k$ is the path transmission expression $T_{v_k n}$ between the kth node, $v_k$, in 
$M_i$ and n. $T_{v_k n}$ is computed in the subgraph $G(V - V_a, E - E_a)$, which is obtained from 
$G(V, E)$ (the graph of the subcircuit) by removing the subgraph induced by the vertex set $V_a$, 
$V_a$ being either the set of all input nodes except $v_k$, if $v_k$ is an input node, or the set of all 
nodes in $M_i$ stronger than $v_k$.
(2) Let $F_2 = [\beta_1\ \beta_2\ \cdots\ \beta_p]^T$ be the vector obtained from $F_1$ such that $\beta_k = \alpha_k v_k$. The effect of 
$M_i$ on n can be found by evaluating
$$f(M_i, n) = |F_1| / |F_2| \quad (2)$$
where $|F_1|$ and $|F_2|$ denote the number of nonzero elements in $F_1$ and $F_2$, counting any 
unknown (X) entry as nonzero, and '/' is a 'comparison' operator [19] defined as follows:
(i) If $|F_1| = |F_2| \neq 0$, then all nodes in $M_i$ which are connected by closed paths to n are 1 and 
$f(M_i, n) = 1$.
(ii) If $|F_1| \neq 0$ and $|F_2| = 0$, then the nodes in $M_i$ which are connected to n are all 0 and 
$f(M_i, n) = 0$.
(iii) If $|F_1| \neq 0$, $|F_2| \neq 0$, and $|F_1| \neq |F_2|$, then the nodes in $M_i$ which are connected to n are in 
conflict and $f(M_i, n) = X$.
(iv) If $|F_1| = 0$, then all paths from $M_i$ to n are open and the effects on n of the next nodes in the 
hierarchy are to be evaluated.
In order to take into account the effects of all the nodes in the subcircuit on n, a general function is 
defined:
$$f(n) = f(M_1, n) \,\nabla\, f(M_2, n) \,\nabla \cdots \nabla\, f(M_r, n) \quad (3)$$
where the symbol '$\nabla$' is called the resolve operator; it simply links the functions that describe the 
effects of the nodes at the different strength levels on n. The operator '$\nabla$' is activated only when 
condition (iv) above is encountered, namely when $|F_1| = 0$.
For any given input sequence and initial conditions, expression (3) is evaluated from left 
to right in a hierarchical fashion. The evaluation moves to the right whenever an evaluation of (2) 
produces a result satisfying condition (iv); otherwise the evaluation stops. If at any point in the 
evaluation of $|F_1|$ one of its components $\alpha_k$ is in the unknown state X, the corresponding level may 
have to be evaluated twice [19]. First, the unknown $\alpha_k$'s are all set equal to 1; the value of (2) could 
then be equal to 0, 1, or X. If it is X, then f(n) = X; if it is 0 or 1, the expression is reevaluated with 
all unknown $\alpha_k$'s equal to 0. The result now may be 0, 1, or may satisfy condition (iv). If it satisfies 
condition (iv), the next expression level in the hierarchy is evaluated. If the subsequent results 
match the first, the state of n is found; otherwise f(n) = X. This evaluation procedure 
is guaranteed to terminate at each iteration since $T_{nn} = 1$.
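The comparison of (2) and the left-to-right resolution of (3) can be sketched as follows. This is a simplified illustration that assumes every path transmission value and node state is a known 0 or 1; the two-pass treatment of unknown (X) entries described above is deliberately left out, and the names `compare` and `resolve` are hypothetical.

```python
def compare(alphas, values):
    """Comparison operator of (2) for one strength set.

    alphas[k] is the path transmission (0 or 1) from the k-th node of the set
    to n, and values[k] is that node's logic state (0 or 1).
    Returns 1, 0, "X", or None when condition (iv) holds (all paths open).
    """
    f1 = sum(1 for a in alphas if a)                          # |F1|
    f2 = sum(1 for a, v in zip(alphas, values) if a and v)    # |F2|
    if f1 == 0:
        return None   # (iv): move on to the next, weaker set
    if f2 == f1:
        return 1      # (i): every connected node is 1
    if f2 == 0:
        return 0      # (ii): every connected node is 0
    return "X"        # (iii): connected nodes disagree


def resolve(levels):
    """Evaluate (3) from the strongest set to the weakest.

    levels is a list of (alphas, values) pairs ordered by decreasing strength.
    The last level is expected to be the node itself, with transmission 1.
    """
    for alphas, values in levels:
        result = compare(alphas, values)
        if result is not None:
            return result
    return None  # only reached if no level, not even the node itself, is connected
```

For instance, `resolve([([1], [0]), ([1], [1]), ([1], [1])])` stops at the first level and returns 0, which mirrors a closed pull-down path to ground overriding a weaker pull-up.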
Consider the nMOS circuit shown in Figure 2.1, in which an extra branch labeled $t_7$ is added to 
model a potential short between nodes 7 and 8. Assuming that P is stronger than nodes 7 and 8, the 
logic expression at P is given by
$$f(v_p) = [T_{v_0 v_p} / 0] \,\nabla\, [v_1(t_1) / v_1(t_1)] \,\nabla\, [1 / v_p]$$
where $T_{v_0 v_p}$ is given in (1), $\nabla$ is the resolve operator of (3), and the last term follows from 
$T_{nn} = 1$.
To deduce the state of P in the fault-free circuit, set $t_j = N$, $j = 1, \ldots, 7$. When simulating the 
effect of any one of the faults, the corresponding $t_j$ is set to either 0 or 1. For example, if the load 
transistor is stuck open, then $t_1 = 0$ and the above expression treats P as a normal node.
3. Parallel and Concurrent Fault Simulation
In this section, the parallel and the concurrent fault simulation algorithms are briefly 
reviewed. In the description of these algorithms, a "faulty machine" refers to the result of injecting 
one fault into the fault-free circuit. A faulty machine is then compared against the fault-free 
circuit, or "good machine", for each input test vector. If the output of the faulty machine differs 
from that of the good machine for a particular test vector, then that test vector detects the 
corresponding fault. In serial fault simulation, the faulty machines are simulated one at a time, 
thus requiring f simulation passes for f faulty machines, where a simulation pass consists of 
applying all the test vectors to the machine being simulated. In parallel fault simulation, described 
next, a group of n faulty machines is simulated during each pass, thus requiring $\lceil f/n \rceil$ passes.
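As a quick arithmetic check of the pass counts, the sketch below computes the number of passes for f faults simulated n at a time, rounding up so that a partially filled last group still costs one pass. The function name and the sample numbers are illustrative and do not come from the report.

```python
def passes_needed(num_faults, faults_per_pass):
    """Number of simulation passes, rounding up for a partially filled group."""
    return (num_faults + faults_per_pass - 1) // faults_per_pass


# Serial simulation is the special case of one faulty machine per pass.
print(passes_needed(1000, 1))
# A 32-bit word holding the good machine plus 31 faulty machines.
print(passes_needed(1000, 31))
```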
Although the parallel and concurrent simulation algorithms were originally developed for gate-level 
representations with stuck-at fault models, the techniques can be applied as well to transistor 
switch-level fault simulation. In this case, a logic gate in the following stands for a dc-connected 
subcircuit.
3.1. Parallel fault simulation
Parallel fault simulation is the oldest high-performance algorithm for fault simulation and the 
first algorithm to simulate a number of faulty machines simultaneously by taking advantage of 
word-oriented operations on the host computer [5]. It consists of the following steps:
(1) The faults are partitioned into groups of size n. The faults in one group are simulated 
simultaneously.
(2) Each node has associated with it n+1 logic values, one value for the good machine and n 
values for the faulty machines. The n+1 logic values are stored in consecutive bits in m 
consecutive computer words.
(3) When a logic gate is evaluated, logical word operations are performed on its inputs (m words 
each), thus evaluating the gate in the good machine and in the n faulty machines all at the 
same time.
To perform a multi-valued simulation, two (or more) bits per value are needed; they are usually 
stored at the same bit position in two (or more) groups of m words each, as illustrated in Figure 3.1. 
It is not clear how large n should be for optimal use of this algorithm, and it is not evident 
that larger values of n would result in a faster simulation. Increasing n would increase the 
number of faulty machines represented at each node; however, not all of the represented faults 
need affect a particular node. Therefore, increasing n would increase the likelihood of useless 
faulty machine evaluations. Hereafter, it is assumed that n+1 is equal to the number of bits in one 
computer word (i.e., m = 1). It is also assumed that three-valued simulation (zero, one, and 
undefined) is performed.
m words
Bit:   n   n-1   n-2   ...   2   1   0
       0   0     0     ...   0   1   0
       1   1     0     ...   0   1   1
Fig. 3.1. Data representation for parallel simulation.
Since three logic states are used, two bits are needed per logical value, resulting in two computer 
words per group of faults. A coding method is then needed to implement parallel simulation, 
as follows:
(1) The states of a node in the n+1 machines are represented by two words, $A_1$ and $A_2$. The fault-free 
machine is labeled machine zero and the faulty machines are labeled machines 1 to n, respectively. 
The state of the node in machine i is encoded by a pair of bits $(A_1[i], A_2[i])$, where $R[i]$ 
denotes the ith bit of word R. The codes (0,1), (1,0), and (0,0) denote the logical states of zero, 
one, and undefined, respectively [22].
(2) The logical AND operation $C = A \wedge B$, where $C = (C_1, C_2)$, $A = (A_1, A_2)$, and $B = (B_1, B_2)$, is 
defined in terms of the components of A and B as follows:
$$C_1 = A_1 \wedge B_1, \qquad C_2 = A_2 \vee B_2$$
so that, with the codes above, a machine is one only if both operands are one and zero if either 
operand is zero.
(3) The OR operation $C = A \vee B$ is similarly defined:
$$C_1 = A_1 \vee B_1, \qquad C_2 = A_2 \wedge B_2$$
(4) And, finally, the complement operation $C = \bar{A}$ is defined by:
$$C_1 = A_2, \qquad C_2 = A_1.$$
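The word-level coding can be tried out with a short sketch. It assumes that, with the codes (1,0) for one and (0,1) for zero, the first word of each pair flags the machines at logic one and the second word flags those at logic zero; with the opposite bit assignment the roles of the two word operations in AND and OR would be exchanged. The helper names (`pack`, `unpack`, `and3`, `or3`, `not3`) are hypothetical.

```python
def pack(states):
    """Pack per-machine states (0, 1, or "X") into a (one_word, zero_word) pair.

    Bit i of one_word is set when machine i is at logic one, bit i of
    zero_word when it is at logic zero; neither bit set means undefined.
    """
    one = zero = 0
    for i, s in enumerate(states):
        if s == 1:
            one |= 1 << i
        elif s == 0:
            zero |= 1 << i
    return one, zero


def unpack(word, count):
    """Inverse of pack for the first `count` machines."""
    one, zero = word
    out = []
    for i in range(count):
        if (one >> i) & 1:
            out.append(1)
        elif (zero >> i) & 1:
            out.append(0)
        else:
            out.append("X")
    return out


def and3(a, b):
    # One only where both are one; zero where either is zero.
    return (a[0] & b[0], a[1] | b[1])


def or3(a, b):
    # One where either is one; zero only where both are zero.
    return (a[0] | b[0], a[1] & b[1])


def not3(a):
    # Complement swaps the roles of the two words.
    return (a[1], a[0])


a = pack([0, 1, "X", 1])
b = pack([1, 1, 1, "X"])
print(unpack(and3(a, b), 4), unpack(or3(a, b), 4), unpack(not3(a), 4))
```

A single pair of word operations evaluates the gate in every machine of the group at once, which is the point of the parallel technique.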
The parallel fault simulation algorithm is very efficient in computer memory usage and is 
simple and relatively easy to implement. However, when the number of faults is large, many 
simulation runs are needed to simulate all the faults, resulting in a significant increase in the 
computational requirements. This fact led to the development of concurrent fault simulation, which is 
described next.
3.2. Concurrent fault simulation
In concurrent fault simulation [23], all the faulty machines are simulated in one pass together 
with the good machine. To avoid duplicating the circuit description, only the difference between a 
particular faulty machine and the good one is recorded at any particular time. This is achieved by 
associating with each node in the circuit the state of the good machine at that node and a list of 
fault effects (states of faulty machines) for those faulty machines in which the state of the node 
differs from its state in the good machine.
Concurrent fault simulation is based on the event driven simulation paradigm [4] where a 
change in the logic value of a node constitutes an event and causes that node to be placed in an 
"event queue". The simulation progresses through discrete time steps by handling all the events at 
the "current time" and then advancing the simulation clock. The simulation starts by applying a 
vector to the primary input nodes of the circuit which causes a subset of these nodes to be placed 
on the event queue. Events removed from the event queue are processed as follows:
(1) If the event results from a change in the state of a node in the good machine (good event), 
then all the elements (gates or subcircuits) having that node as input are evaluated. A change 
in an output node of any such element causes that node to be scheduled at the appropriate 
time (the current time plus the delay of the element).
(2) Events from a faulty machine (faulty events) are handled similarly with the state of the 
node taken from the fault effect list.
(3) When evaluating an element activated by a good event, any fault effect on the input nodes of 
the element is propagated to the output if the fault causes the state of output to differ from 
its fault-free value.
(4) If the state of a node in the good machine becomes identical to that in a faulty machine, then 
the corresponding fault effect is dropped from the fault effect list on that node.
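The divergence and convergence rules in steps (3) and (4) can be illustrated with a minimal gate-level sketch. It assumes each fault effect list is a dictionary from a fault identifier to the faulty state of that node, ignores event scheduling and delays, and does not model faults located inside the element itself; `evaluate_element` and the NAND example are hypothetical.

```python
def evaluate_element(fn, good_inputs, fault_lists):
    """Evaluate one element in the good machine and in every active faulty machine.

    fn maps a list of input values to an output value.
    good_inputs[k] is the fault-free state of input k.
    fault_lists[k] maps fault id -> state of input k in that faulty machine.
    Returns (good_output, output_fault_list).
    """
    good_out = fn(good_inputs)
    active = set().union(*fault_lists)  # faults present on at least one input
    out_list = {}
    for fault in sorted(active):
        # An input without an entry for this fault behaves as in the good machine.
        faulty_inputs = [fel.get(fault, good)
                         for fel, good in zip(fault_lists, good_inputs)]
        faulty_out = fn(faulty_inputs)
        if faulty_out != good_out:
            out_list[fault] = faulty_out  # effect propagates to the output
        # otherwise the faulty machine matches the good one and is dropped
    return good_out, out_list


def nand(values):
    return int(not all(values))


print(evaluate_element(nand, [1, 1], [{"f1": 0}, {"f2": 0}]))
print(evaluate_element(nand, [0, 1], [{"f1": 1}, {"f2": 0}]))
```

In the second call the effect of the hypothetical fault `f2` no longer changes the output, so it does not appear in the output list, while `f1` still does.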
The advantage of concurrent fault simulation is its speed which results from considering only 
the active faults in the circuit. However, if the number of active faults is relatively large then the 
speed degrades due to the overhead incurred from the maintenance of the fault effect lists [20]. 
Another drawback of concurrent fault simulation is its unpredictable memory requirement.
4. Parallel-Concurrent Fault Simulation
4.1. Introduction
There are at least three types of concurrency in fault simulation: logic circuit concurrency, 
algorithm concurrency, and fault activity concurrency [2]. The first two types can be exploited 
through distributed processing and pipelined parallel architectures. The third is useful in 
software-based simulators where a number of faults are simulated simultaneously. We will be 
concerned with the third type. We combine the best features of the parallel and concurrent fault 
simulation algorithms described in Section 3 into an algorithm, which we call parallel-concurrent, 
that runs on a serial computer. In addition, we extend the circuit models to include switch-level 
circuit and fault models [16,18].
The parallel-concurrent fault simulation algorithm is based on collecting faults into groups of 
size w each, where w is the size of the host computer word. The fault effects of a particular group are 
evaluated using parallel fault simulation techniques and propagated using concurrent fault 
simulation techniques. If at least one fault of a particular group causes a given node to differ from its 
fault-free state, then the group effect is included in the fault effect list at that node. Note that 
some faults in a group may not be active; these faults are marked with a special code.
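The rule for including a group effect can be sketched as below. The sketch assumes the two-word coding of Section 3.1, with one word flagging machines at logic one and the other flagging machines at logic zero, and uses the code (1,1) that the report reserves for inactive faults. The function names and the four-fault group are hypothetical.

```python
def group_effect(good_state, faulty_states):
    """Build the group effect of one node, or return None if no fault is active.

    faulty_states[i] is the node state (0, 1, or "X") under fault i of the group.
    The result is a (one_word, zero_word) pair; a fault whose state equals the
    fault-free state is marked inactive by setting both of its bits.
    """
    one = zero = 0
    any_active = False
    for i, state in enumerate(faulty_states):
        if state == good_state:
            one |= 1 << i      # inactive fault: code (1,1)
            zero |= 1 << i
            continue
        any_active = True
        if state == 1:
            one |= 1 << i      # code (1,0)
        elif state == 0:
            zero |= 1 << i     # code (0,1)
        # "X" leaves both bits clear: code (0,0)
    return (one, zero) if any_active else None


def state_under_fault(effect, i, good_state):
    """Read back the node state for fault i, falling back to the good state."""
    bits = ((effect[0] >> i) & 1, (effect[1] >> i) & 1)
    if bits == (1, 1):
        return good_state
    return {(1, 0): 1, (0, 1): 0, (0, 0): "X"}[bits]


effect = group_effect(0, [1, 0, 0, "X"])
print(effect, [state_under_fault(effect, i, 0) for i in range(4)])
```

Returning `None` when every fault in the group agrees with the fault-free state corresponds to leaving the group out of the fault effect list at that node.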
To illustrate the parallel-concurrent fault simulation algorithm, consider the example in 
Figure 4.1. In this example a three-valued logic is used. Figure 4.1(a) is a portion of a circuit in 
steady state with faults present in the circuit as shown. Note that each fault effect is a group of 
four faults. In each group there is at least one faulty state which is different from the fault-free 
state. We consider each gate in the figure to be represented by a switch-level circuit model, and the 
fault groups at the input/output nodes of the gates to include switch-level faults internal to the 
gates as well as faults in the interconnections between the gates. A code of (1,1) indicates an 
inactive fault, as shown on the list of input 'a' (faults 3 and 4 of group g1). The code (1,1) is used 
during the evaluation of the gate to decide whether to take the state from the faulty record or from 
the fault-free state of the line.
Fig. 4.1. A parallel-concurrent fault simulation algorithm example: (a) steady state; (b) after the input changes.
Figure 4.1(b) shows the same circuit with the state of input 'a' changed from logic low to logic 
high and the state of input 'd' changed from logic high to logic low. This change causes gates G1 and G2 
to be evaluated. When evaluating G1, all faults at the inputs are propagated to the output if they 
cause the state of the output node to be different from its fault-free state. Note that groups g1, g3, 
and g4 cause the logical state of the output node to be different from the fault-free state. Thus g1, 
g3, and g4 are present at the output node. On the other hand, node 'f' has not changed its fault-free 
state. However, the state of node 'f' under the faults in groups g1 and g3 is the same as its logical 
state in the fault-free circuit. Therefore, both groups are deleted from the fault effect list on node 'f'. 
Now gate G3 will be scheduled and will settle to a steady state as shown. In the following we describe 
in detail the parallel-concurrent fault simulation algorithm.
4.2. Parallel-Concurrent fault simulation algorithm
4.2.1. Definitions and notations
To explain the implementation of the parallel-concurrent fault simulation algorithm, the 
following definitions are given:
group id(f):   The fault group to which fault f belongs.
fanout(n):     The set of nodes which n fans in to.
fanin(n):      The set of nodes which fan in to n.
group effect:  The status of a node in the faulty machines corresponding to a group of faults.
fel(n):        The fault effect list (list of group effects at node n).
4.2.2. Algorithm
The algorithm is based on scheduling and evaluating the extracted logic expressions [19] 
according to the flow of signals and the activity of the faults. Other switch-level evaluation
techniques [18,24-26] can also be used. Fault lists are constructed at the input and output nodes of
the subcircuits and propagated during the simulation process. The fault simulation algorithm is 
presented as a set of procedures written in a PASCAL-like high level language.
Procedure FAULT_SIM, shown in Figure 4.2, describes the fault simulation. It takes as
input a list L of nodes that have changed logical states. First, it updates the logical state of every
node in L and schedules their fanout. All the scheduled nodes are marked as being generated by
changes in the fault-free circuit so as to distinguish faulty events from good ones. Each event on
the queue is tagged with a group id which identifies the group where activities take place. The
group id can refer either to the fault-free machine or to some group of faulty machines.
procedure FAULT_SIM( L )
    reset event queue;
    for each <n, s> ∈ L do
        STATE( n ) ← s
        schedule fanout( n )
    end for
    while queue not empty do
        remove event <n, m> from queue
        if event m is fault-free
            s ← EVALUATE_FAULT_FREE( n )
            PROPAGATE_FAULTS( n, s )
        else
            FAULT_EVALUATE( n, m )
        end if
    end while
end procedure
Fig. 4.2. Top level fault simulation procedure
Events are removed from the queue and processed as follows:
(1) A good event causes the state of its node to be computed by direct evaluation of the
corresponding symbolic expression (procedure EVALUATE_FAULT_FREE). Next, the
effects of any group at the fanin of the node are propagated (procedure
PROPAGATE_FAULTS) if the state of at least one faulty machine in that group is different
from the good machine.
(2) A faulty event results in the evaluation of the effect of the corresponding group
(procedure FAULT_EVALUATE) and in the propagation of that effect to the fanout of the node.
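The dispatch between good and faulty events can be sketched as a small Python loop. This is only an illustration under stated assumptions: the three evaluation routines are passed in as callables that return the further events to schedule (the report's procedures schedule events themselves), and the reserved id FAULT_FREE and the argument names are hypothetical.

```python
from collections import deque

FAULT_FREE = 0  # hypothetical group id reserved for the good machine


def fault_sim(changes, state, fanout, eval_fault_free, propagate_faults,
              fault_evaluate):
    """Event loop in the spirit of FAULT_SIM.

    changes: list of (node, new_state) pairs for nodes that changed.
    The callables stand in for EVALUATE_FAULT_FREE, PROPAGATE_FAULTS and
    FAULT_EVALUATE; the last two return (node, group_id) events to schedule.
    """
    queue = deque()
    for node, s in changes:
        state[node] = s
        # Fanout of a changed node is scheduled as a fault-free event.
        queue.extend((w, FAULT_FREE) for w in fanout.get(node, ()))
    while queue:
        node, group = queue.popleft()
        if group == FAULT_FREE:
            s = eval_fault_free(node)
            queue.extend(propagate_faults(node, s))
        else:
            queue.extend(fault_evaluate(node, group))
    return state
```

Tagging every queue entry with a group id is what allows a single queue to serve both the good machine and all groups of faulty machines.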
Procedure PROPAGATE_FAULTS (Figure 4.3) takes as input a node n and its logical fault-free
state s. If the fault-free state is undefined, then propagation of fault effects will not be done
because faults are propagated only when they cause the state at a node to be the complement of its
fault-free value. On the other hand, when the state of the node is well defined, which means that
the logical state of the node is either 0 or 1, then any fault that causes the states of the inputs and
outputs to be different from their fault-free values results in the addition of the corresponding
group of faults to the fault effect list at node n. If the state of a group becomes identical to the
good machine, then the group effect is deleted.
procedure PROPAGATE_FAULTS( n, s )
    if s <> U
        for each w in fanin(n) do
            for each fault effect f in fel(w) do
                if f is not detected
                    faulty_state(n) ← EVAL_SYMBOLIC( n, group_id(f) );
                    if faulty_state(n) ≠ good_state(n)
                        schedule( fanout(n), group_id(f) )
                        add f to fel(n)
                    else
                        remove f from fel(n)
                    end if
                else
                    remove f from fel(n)
                end if
            end for
        end for
    end if
    if s ≠ good_state(n)
        good_state(n) ← s
        schedule( fanout(n), fault_free_id )
    end if
end procedure
Fig. 4.3. One step fault propagation procedure.
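The add-or-delete rule for group effects can be illustrated with a short Python sketch. Several choices here are assumptions rather than the report's implementation: fault effect lists are held as sets of group ids, a group counts as detected only when the supplied predicate says so, the group's state is compared against the new fault-free state s, and events are returned instead of being scheduled in place.

```python
U = "U"          # undefined value of the three-valued logic
FAULT_FREE = 0   # hypothetical group id reserved for the good machine


def propagate_faults(n, s, fanin, fanout, fel, good_state,
                     group_detected, eval_symbolic):
    """One propagation step for node n with new fault-free state s.

    fel maps a node to the set of group ids on its fault effect list.
    eval_symbolic(n, g) returns the states of n under each fault of group g.
    Returns the (node, group_id) events to schedule.
    """
    events = []
    if s != U:
        for w in fanin[n]:
            for g in sorted(fel[w]):
                if group_detected(g):
                    fel[n].discard(g)        # group needs no more simulation
                elif any(v != s for v in eval_symbolic(n, g)):
                    fel[n].add(g)            # at least one machine differs
                    events += [(x, g) for x in fanout[n]]
                else:
                    fel[n].discard(g)        # group matches the good machine
    if s != good_state[n]:
        good_state[n] = s
        events += [(x, FAULT_FREE) for x in fanout[n]]
    return events
```

A group is kept as soon as one of its faulty machines differs from the good machine, so the remaining slots of the record ride along whether or not they are active.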
Procedure FAULT_EVALUATE shown in Figure 4.4 takes as input a node n and a group id 
m. The logical state of node n is then evaluated in the presence of any active faults in group m. If 
after the evaluation, all the faults in the group are either inactive or cause the node to assume a 
state that is identical to the previous state of node n under group m, then the group effect is not 
propagated; otherwise, the group effect at node n is updated and the fanout of n is scheduled for 
later evaluation in group m.
procedure FAULT_EVALUATE( n, m )
    if any fault in group m is not detected
        new_faulty_state ← EVAL_SYMBOLIC( n, m )
        if new_faulty_state ≠ fault_state( n )
            schedule( fanout(n), m )
            add m to fel(n)
        end if
    end if
end procedure
Fig. 4.4. Fault evaluation procedure.
4.3. Fault grouping criteria
The computational efficiency of the above algorithm depends on the way the faults are
grouped together. Three different grouping techniques are investigated. The first criterion levelizes
the circuit by assigning a non-negative integer to each circuit element according to its distance from
the primary circuit inputs. Procedure LEVELIZE, shown in Figure 4.5, assigns levels starting from
the primary inputs at level zero; the level of the primary input fanout will be one. The level of a
non-primary input node is one higher than the minimum level of its fanin nodes. After levelizing,
faults are partitioned into groups where each group consists of faults from the same level unless
the faults at a given level are depleted while a group is not completely filled; in that case, faults
from the next level are assigned to the partially filled group. This fault grouping is called levelized
grouping.
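Levelized grouping can be sketched in Python as a breadth-first level assignment followed by chunking of the faults in level order. This is a minimal sketch under assumptions: the start set plays the role of E in procedure LEVELIZE (Fig. 4.5), faults are given as hypothetical (name, node) pairs, and the group size of four is taken from the example of Figure 4.1.

```python
def levelize(fanout, start_nodes):
    """Breadth-first level assignment; a node keeps its first (lowest) level."""
    level, frontier, lvl = {}, list(start_nodes), 0
    while frontier:
        lvl += 1
        next_frontier = []
        for n in frontier:
            if n not in level:
                level[n] = lvl
                next_frontier.extend(fanout.get(n, ()))
        frontier = next_frontier
    return level


def group_by_level(faults, level, group_size=4):
    """Sort faults by the level of their node, then cut into fixed-size groups.

    A group that cannot be filled from one level is completed with faults
    from the next level, because the chunks are cut from the sorted list.
    """
    ordered = sorted(faults, key=lambda fault: level[fault[1]])
    return [ordered[i:i + group_size]
            for i in range(0, len(ordered), group_size)]


fanout = {"a": ["c"], "b": ["c", "d"], "c": ["e"], "d": ["e"]}
levels = levelize(fanout, ["a", "b"])
faults = [("f1", "e"), ("f2", "a"), ("f3", "c"), ("f4", "b"), ("f5", "d")]
groups = group_by_level(faults, levels, group_size=2)
```

Because a node is levelled only on its first visit, the traversal also terminates on circuits with feedback.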
procedure LEVELIZE( network, E )
    LVL ← 0
    while ( E ≠ empty ) do
        D ← empty
        LVL ← LVL + 1
        for ( each node n in E ) do
            if n is never assigned a level
                level(n) ← LVL
                E ← E - { n }
                D ← D ∪ fanout(n)
            end if
        end for
        E ← D
    end while
end procedure
Fig. 4.5. Levelize grouping
The second grouping technique levelizes the circuit horizontally and is called horizontal
grouping. Starting from a primary input node, a depth-first search (DFS) [27] is applied to the
circuit. During the DFS, a node is assigned level k if it was visited at the kth step of the traversal.
This partitioning can be achieved by the recursive procedure shown in Figure 4.6, which is called
once for every primary input. Faults are then gathered into groups as described above.
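The depth-first labeling can be sketched in Python as follows. This is a sketch only: the dictionaries fanin and fanout, the counter starting at one, and the small test graph are assumptions made for the example, and the graph is not the one of Figure 4.7.

```python
def horizontal_levels(fanin, fanout, primary_inputs):
    """Depth-first labeling in the spirit of procedure HORIZONTAL (Fig. 4.6)."""
    level, marked = {}, set()
    next_level = 1

    def assign(node):
        # Give the node the current level unless it already has one.
        nonlocal next_level
        if node not in level:
            level[node] = next_level
            next_level += 1

    def visit(node):
        if node in marked:
            return
        assign(node)
        marked.add(node)
        for w in fanout.get(node, ()):
            if w not in marked:
                # Label the other inputs of w before descending into w, so
                # that nodes feeding the same element get adjacent labels.
                for e in fanin.get(w, ()):
                    assign(e)
                visit(w)

    for pi in primary_inputs:
        visit(pi)
    return level


fanout = {1: [4], 2: [4, 5], 3: [5], 4: [6], 5: [6]}
fanin = {4: [1, 2], 5: [2, 3], 6: [4, 5]}
labels = horizontal_levels(fanin, fanout, [1, 2, 3])
```

The recursion mirrors the pseudocode; for deep circuits an explicit stack would avoid Python's recursion limit.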
To illustrate the above partitioning procedure consider Figure 4.7; the 'v' label is the original
node number and the 'l' label is the resulting labeling of procedure HORIZONTAL. This procedure
was called three times with 'v=1', 'v=2' and 'v=3' as primary input nodes, respectively.
The third grouping criterion assigns faults to groups randomly, i.e. faults are assigned to
groups as they are read from the input file.
procedure HORIZONTAL( node )
    if node is not marked
        if node never assigned a level
            assign current level to node
            increment current level
        end if
        mark node
        for each w in fanout( node ) do
            if w is not marked
                for each e in fanin(w) do
                    if e never assigned a level
                        assign current level to e
                        increment current level
                    end if
                end for
                HORIZONTAL( w )
            end if
        end for
    end if
end procedure
Fig. 4.6. Horizontal partitioning
Fig. 4.7. A circuit graph model; l is determined by the DFS algorithm. [Legible node labels:
v=1, l=1; v=4, l=2; v=6, l=3; v=10, l=4; v=11, l=5; v=12, l=7.]
4.4. Fault regrouping
When a fault causes the state of a primary output to be different from its fault-free state, the
fault is said to be detected. In concurrent fault simulation, faults are dropped as soon as they are
detected. However, in the parallel-concurrent technique detected faults create empty places or
holes in the particular group they are assigned to, and they can be eliminated from further
consideration if and only if all the other faults in that group are detected. In addition, an undetected
fault forces the remaining faults in the group, detected or otherwise, to be represented every time
the undetected fault is activated and placed in the fault effect list. In the worst case, a group may
have a single undetected fault. Obviously, extra computation is required to identify the detected
faults and extra memory is needed to represent them. To overcome this problem, regrouping of
undetected faults during the simulation can be performed by reassigning the undetected faults to
groups such that every group consists entirely of undetected faults. During the regrouping, we
follow the original grouping criterion as closely as possible. Namely, faults in a group are taken from
one level or the next closest level. With this in mind, the regrouping results in even shorter fault
lists, fewer faulty events and less CPU time than parallel-concurrent with no regrouping, as we show
in the next section.
Assume that information about faults is stored in a table T where each entry of T is a record
containing the fault name and a flag to indicate whether or not the fault has been detected. Table T
is indexed by the fault id. Procedure REGROUP, shown in Figure 4.8, takes as input a fault table T
and its size and regroups the faults as follows. Starting from the lowest index in the table, every
detected fault is swapped with the next higher indexed undetected fault. This process is repeated
until the highest index of the table is reached. Note that with this compacting procedure the
grouping criteria stated above are obeyed.
procedure REGROUP( T, size )
    offset ← 0 ;  /* set the offset to 0 */
    for old_id = 0 to size do
        if ( T[ old_id ].detected )
            offset ← offset + 1 ;
        else
            T[ old_id - offset ] ← T[ old_id ] ;
            cross_ref[ old_id ] ← old_id - offset ;
        end if
    end for
end procedure
Fig. 4.8. Fault regrouping algorithm
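The compaction of Fig. 4.8 translates almost line for line into Python. The sketch below makes a few assumptions: each table entry is a dictionary with hypothetical keys "name" and "detected", the table is a zero-indexed list, and the vacated tail is truncated at the end, a step the figure does not show.

```python
def regroup(table):
    """Compact undetected faults to the front of the table, in place.

    Returns cross_ref, mapping each surviving old fault id to its new id.
    """
    cross_ref = {}
    offset = 0  # number of detected faults seen so far
    for old_id, entry in enumerate(table):
        if entry["detected"]:
            offset += 1
        else:
            table[old_id - offset] = entry
            cross_ref[old_id] = old_id - offset
    # Drop the vacated tail (an addition of this sketch).
    del table[len(table) - offset:]
    return cross_ref


table = [
    {"name": "f0", "detected": False},
    {"name": "f1", "detected": True},
    {"name": "f2", "detected": True},
    {"name": "f3", "detected": False},
    {"name": "f4", "detected": False},
]
cross_ref = regroup(table)
```

Since the relative order of the undetected faults is preserved, faults that were neighbours in the original levelized or horizontal ordering remain neighbours after compaction.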
5. Implementation
The above approach has been implemented in a computer program for logic and fault
simulation of MOS circuits. The program constructs a switch-graph model of the circuit from the circuit
description. The program then performs circuit partitioning, identifies strongly-connected
components using a depth-first search algorithm [27], and sets up an analysis sequencing procedure
which implicitly exploits the latency properties of the subcircuits during the simulation. Symbolic
logic expressions are then generated at the outputs of the subcircuits in terms of subcircuit inputs
and initial conditions. Faults are then injected into the expressions and grouped according to one
of the three criteria described above. Fault simulation is then performed using the logic expressions
and an event scheduler. The scheduler takes advantage of the sequencing to exploit latency.
5.1. Examples
To illustrate the effect of the above grouping on various simulation variables, the simulation
of two typical circuits is presented. The first circuit, referred to as circuit A, consists
of two 16 bit counters feeding into a 32 bit carry-lookahead adder. The circuit contains 4000
transistors and 1991 faults and is subjected to 875 test vectors. The second circuit, referred to as
circuit B, is a 32 bit carry-lookahead adder consisting of 1350 transistors and 897 faults and is
tested with 675 test vectors. Concurrent and parallel-concurrent fault simulation is performed on
each circuit. Parallel-concurrent fault simulation is performed with three different fault list
groupings: levelized, horizontal, and random.
Figure 5.1 shows the total size of the fault effect list versus the number of test vectors applied
to circuit A. Note that the random grouping creates a number of faulty copies almost halfway
between the levelized and the horizontal grouping, with the horizontal grouping creating fewer faulty
copies than the other two.
Fig. 5.1. Total fault effect list lengths versus number of tests for circuit A. [Horizontal axis:
number of test vectors; curves: horizontal, random, levelize, concurrent.]
In comparison, the concurrent algorithm creates much longer fault effect
lists than parallel-concurrent. This is expected since in parallel-concurrent fault simulation, if
more than one fault from a particular group is active, their activities will be propagated in a single
record, while in concurrent fault simulation their activities will be propagated in as many records
as the number of active faults. It is clear that a reduction in the number of faulty copies reduces the
time for traversing them; this in turn reduces the evaluation time and consequently the overall
simulation time, as shown in Figure 5.2. In addition, the reduction in the number of faulty copies reduces
the memory requirement, as shown in Figure 5.3. The memory required for concurrent fault
simulation is 66 blocks, while the parallel-concurrent with horizontal grouping requires only 48 blocks.
In addition, the number of faulty events for horizontal grouping is much less than for any of the other
groupings and considerably less than the faulty events created by concurrent fault simulation, as
shown in Figure 5.4.
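The difference in list length at a single node can be made concrete with a short Python sketch. The fault ids, the mapping group_of and the group size of four are hypothetical values chosen for the illustration; they are not data from the experiments above.

```python
def record_counts(active_faults, group_of):
    """Compare fault effect list lengths at one node for the two schemes.

    active_faults: ids of faults whose state differs from the good machine.
    group_of: maps a fault id to its group id.
    """
    # Concurrent simulation keeps one record per active fault.
    concurrent_records = len(set(active_faults))
    # Parallel-concurrent keeps one record per group with an active fault.
    parallel_concurrent_records = len({group_of[f] for f in active_faults})
    return concurrent_records, parallel_concurrent_records


group_of = {f: f // 4 for f in range(12)}   # three groups of four faults
counts = record_counts([0, 1, 2, 9], group_of)
```

The saving therefore depends on how often faults of one group are active together, which is the property the grouping criteria try to encourage.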
Figure 5.5 shows the number of faulty events created as a function of the number of test
vectors applied to circuit B. Note that the results are similar to the ones obtained from circuit A in
terms of memory requirements (Figure 5.6) and overall CPU time (Figure 5.7), and in terms of
the number of faulty copies created during the simulation (Figure 5.8).
From these examples one can conclude that parallel-concurrent saves both time and memory
over concurrent fault simulation and that the best grouping is the horizontal grouping. This is
because horizontal grouping tends to include the most likely equivalent faults in one group, as can
be deduced from the number of faulty events shown in Figures 5.4 and 5.5. In addition, by
regrouping the faults, the performance can be further improved.
Fig. 5.2. CPU time versus number of tests for circuit A. [Vertical axis: CPU time (msec.)]
Fig. 5.3. Maximum memory usage versus number of tests for circuit A. [Vertical axis: maximum
memory usage (Kbytes); horizontal axis: number of test vectors.]
Fig. 5.4. Number of faulty events versus number of tests for circuit A.
Fig. 5.5. Number of faulty events versus number of tests for circuit B. [Horizontal axis: number of
test vectors; curves: horizontal, random, levelize, concurrent.]
Fig. 5.6. Maximum memory usage versus number of tests for circuit B. [Vertical axis: maximum
memory usage (Kbytes); horizontal axis: number of test vectors.]
Fig. 5.7. CPU time versus number of tests for circuit B. [Vertical axis: CPU time (msec.);
horizontal axis: number of test vectors; curves: horizontal, random, levelize, concurrent.]
Fig. 5.8. Total fault effect list lengths versus number of tests for circuit B. [Vertical axis: list
size; horizontal axis: number of test vectors.]
Figure 5.9 shows the effect of regrouping on the fault effect list length for circuit A. The
unmarked curve shows the length of the fault effect list with horizontal grouping, while the curves
marked with squares, circles, and triangles show the fault effect list lengths for horizontal grouping with
regrouping after every 30 percent, 20 percent and 10 percent detection, respectively. Regrouping
after 10 percent detection means that compaction of the fault table is initiated after 10 percent of
the faults remaining in the circuit since the last compaction are detected. Compaction is carried out
as long as the number of remaining faults is larger than a certain threshold. Note that the algorithm
performs best at the 10 percent regrouping. This saves some CPU time, as shown in Figure 5.10.
Fig. 5.9. Total fault list lengths with and without regrouping (circuit A). [Vertical axis: list size;
horizontal axis: number of test vectors.]
Fig. 5.10. CPU time with and without regrouping (circuit A). [Vertical axis: CPU time (msec.)]
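The compaction trigger described above can be written as a small predicate. This is a sketch under assumptions: the function name, its parameters and the default threshold of 32 remaining faults are hypothetical, since the report only speaks of "a certain threshold".

```python
def should_regroup(remaining_at_last_compaction, detected_since,
                   percent=10, min_remaining=32):
    """Decide whether to compact the fault table now.

    remaining_at_last_compaction: undetected faults right after the last
        compaction (or at the start of the simulation).
    detected_since: faults detected since then.
    percent: regrouping interval, e.g. 10 for "10 percent detection".
    min_remaining: hypothetical threshold below which compaction stops.
    """
    remaining = remaining_at_last_compaction - detected_since
    enough_detected = (detected_since * 100
                       >= percent * remaining_at_last_compaction)
    return remaining > min_remaining and enough_detected
```

Integer arithmetic is used for the percentage test so that the boundary case (exactly 10 percent detected) is not affected by floating-point rounding.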
6. Discussion
The speed of the parallel-concurrent fault simulation algorithm depends on the circuit being
simulated and on the way the faults are grouped. Three different grouping criteria are investigated:
levelized grouping, random grouping and horizontal grouping. It is found that the best grouping is
the horizontal partitioning, which minimizes both the number of faulty copies and the memory
required. Regrouping of faults does enhance the simulation speed and reduces the number of
faulty copies created during the simulation. However, as a guideline, the faults chosen to be
grouped together should be 'close' so that they are active when the same test vectors are applied.
The grouping which best fits this criterion is the horizontal grouping. It should be mentioned that
faults corresponding to one dc-connected block should be grouped together, and since equivalent
faults in the same group will cause no added overhead to the simulation one could ignore fault
collapsing and simulate all the faults. In any case, the parallel-concurrent fault simulation algorithm
gives better performance than concurrent fault simulation alone. This is due to the dense storage
scheme used, which reduces the overall memory required. The speed advantage is also due to the
evaluation of active faults only.
7. References
[1] G. F. Pfister, “The Yorktown Simulation Engine: Introduction,” Proc. 19th Design
Automation Conf., pp. 51-54, June 1982.
[2] Y. H. Levendel, P. R. Menon, and S. H. Patel, “Parallel fault simulation using distributed
processing,” The Bell Syst. Tech. J., vol. 62, pp. 3107-3137, December 1983.
[3] T. Blank, “A Survey of hardware accelerators used in computer-aided design,” IEEE Design
& Test, Aug. 1984.
[4] M. A. Breuer and A. D. Friedman, Diagnosis and Reliable Design of Digital Systems. Potomac,
MD: Computer Science Press, 1976.
[5] E. W. Thompson and S. A. Szygenda, “Digital Logic Simulation in a Time-Based, Table-Driven
Environment: Part 2. Parallel Fault Simulation,” Computer, vol. 8, pp. 38-49, March 1975.
[6] D. B. Armstrong, “A Deductive Method for Simulating Faults in Logic Circuits,” IEEE
Trans. Comp., vol. C-21, pp. 464-71, May 1972.
[7] E. G. Ulrich et al., “High-speed concurrent fault simulation with vectors and scalars,” Proc.
17th Design Automation Conf., pp. 374-380, June 1980.
[8] Kyushik. S., “Fault simulation with the parallel valued list algorithm,” VLSI Systems
Design, pp. 36-43, Dec. 1985.
[9] P. R. Moorby, “Fault simulation using parallel valued lists,” in IEEE Int. Conf. on
Computer-Aided Design, Santa Clara, CA, pp. 101-102, 1983.
[10] M. A. Breuer, “The effects of races, delays, and delay faults on test generation,” IEEE Trans.
Comput., vol. C-23, pp. 1078-1092, Oct. 1974.
[11] J. A. Abraham and H.-C. Shih, “Testing of MOS VLSI circuits,” in Proc. 1985 Int. Symp.
Circuits and Systems, Kyoto, Japan, pp. 1297-1300.
[12] M. R. Lightner and G. D. Hachtel, “Implication algorithms for MOS switch-level function
macromodeling, implication and testing,” Proc. of the 19th Design Automation Conf., pp.
691-698, June 1982.
[13] A. K. Bose et al., “A fault simulator for MOS LSI circuits,” Proc. of the 19th Design
Automation Conf., pp. 400-408, June 1982.
[14] G. Ditlow, W. Donath, and A. Ruehli, “Logic equations for MOSFET circuits,” in IEEE Int.
Symp. on Circuits and Systems, Newport Beach, CA, pp. 752-755, May 1983.
[15] P. Banerjee and J. A. Abraham, “Fault characterization of VLSI MOS circuits,” in IEEE Int.
Conf. on Circuits and Computers, New York, New York, pp. 564-568, Sept. 1983.
[16] I. N. Hajj and D. G. Saab, “Fault modeling and logic simulation of MOS VLSI circuits based
on logic expression extraction,” in IEEE Int. Conf. on Computer Aided Design, Santa Clara,
CA, pp. 99-100, Sept. 1983.
[17] Y. M. El-Zig, “Failure analysis and test generation for VLSI physical effects,” in 1983
Custom Integrated Circuit Conference, Rochester, N. Y., pp. 300-303, May 1983.
[18] R. E. Bryant and M. D. Schuster, “Fault simulation of MOS digital circuits,” VLSI Design,
pp. 24-30, Oct. 1983.
[19] D. G. Saab, “Symbolic Switch-Level Logic and Fault Simulation of MOS VLSI Circuits,”
CSL report UILU-ENG-85-2231, University of Illinois at Urbana, September 1985.
[20] I. N. Hajj and D. Saab, “Symbolic logic representation and logic and fault simulation of MOS
circuits,” submitted for publication.
[21] R. E. Bryant, “An Algorithm for MOS Logic Simulation,” LAMBDA (now VLSI) Magazine,
vol. 1, pp. 46-53, 1980.
[22] Y. H. Levendel and P. R. Menon, “Fault-Simulation Methods - Extensions and Comparison,”
Bell System Technical Journal, vol. 60, pp. 2235-2259, November 1981.
[23] E. G. Ulrich and T. Baker, “Concurrent Simulation of Nearly Identical Digital Networks,”
Computer, vol. 7, pp. 39-44, April 1974.
[24] J. P. Hayes, “Fault Modeling for Digital Integrated Circuits,” IEEE Trans. on Computer-Aided
Design, vol. 2, pp. 202-208, July 1984.
[25] J. P. Hayes, “Fault Modeling,” IEEE Design & Test, vol. 2, pp. 88-95, April 1985.
[26] R. H. Byrd, G. D. Hachtel, M. R. Lightner, and M. H. Heydemann, “Switch level simulation:
models, theory, and algorithms,” in Advances in Computer-Aided Engineering Design, ed.
A. L. Sangiovanni-Vincentelli, JAI Press Inc., pp. 93-148, 1985.
[27] R. Tarjan, “Depth-First Search and Linear Graph Algorithms,” SIAM J. on Computing, vol.
1, pp. 146-160, 1972.
