I. INTRODUCTION
Fault Detection Probability (FDP) is an important testability measure that is useful not only for generating test patterns but also for vetting designs for random-input testability. Traditionally, FDP has been used for test point insertion; however, it can also be used as a random single-event-transient (SET) sensitivity measure, which is important for characterizing the impact of soft errors in logic blocks. We model single stuck-at faults (errors) in large combinational circuits using a Logic Induced Fault Encoded Directed Acyclic Graph (LIFE-DAG) structure. We prove that such a DAG is a Bayesian Network. Bayesian Networks are graphical probabilistic models representing the joint probability function over a set of random variables. A Bayesian Network is a directed acyclic graph (DAG) whose nodes describe random variables and whose node-to-node arcs denote direct causal dependencies. A directed link captures the direct cause-and-effect relationship between two random variables. In a LIFE-DAG, each node describes the state of a line, and a directed edge quantifies the conditional probability of the state of a node given the states of its parents, or its direct causes. Figure 1(a) gives the conditional probability table of an AND gate. The attractive feature of this graphical representation of the joint probability distribution is that it not only makes conditional dependency relationships among the nodes explicit but also serves as a computational mechanism for efficient probabilistic updating. Bayesian Networks can be used not only to infer effects due to known causes (the predictive problem) but also to infer possible causes for known effects (the backtracking or diagnosis problem). The diagnostic aspect of BNs makes them suitable for further use in test-pattern generators.
Fault Detection Probability (FDP) of a stuck-at fault f
We used two stochastic algorithms based on importance sampling to compute the fault/error detection probability using the LIFE-DAG model. An importance sampling algorithm generates sample instantiations of the whole DAG network, i.e., for all lines in our case. These samples are then used to form the final estimates. At each node, this sampling is done according to an importance function that is chosen to be close to the actual joint probability function. Specifically, we explore two stochastic inference strategies: the Probabilistic Logic Sampling (PLS) algorithm [7] and the Evidence Pre-propagated Importance Sampling (EPIS) algorithm [6], which are discussed in Section V. It is worth pointing out that, unlike simulative approaches that sample only the inputs, importance-sampling-based procedures generate instantiations for the whole network.
We advance the state-of-the-art in fault analysis in terms of space and time requirements and in providing a uniform model.
A detailed discussion about time and space complexity is given in section V.
Most of the analysis of FDP was performed in the 1980s (as proposed by Seth et al. [14], Wunderlich [17], etc.). In a later development (1988), Bayesian networks were introduced for probabilistic reasoning and belief propagation, and they have since been applied in artificial intelligence and image analysis. Recently, in [8], [9], switching probabilities and signal arrivals in VLSI circuits have been modeled using a Bayesian Network; however, their use in the estimation of error detection probabilities in digital logic is new. In this paper, we take a fresh look at an old problem by adopting a novel and efficient scheme. Moreover, this probabilistic FDP analysis technique can be applied to measure the Single Event Transient (SET) sensitivity and soft error susceptibility [1] of logic circuits, which are challenges for future nanotechnology. We discuss this further in Section II.
II. MOTIVATION
When high-energy neutrons present in cosmic radiation and alpha particles from impurities in packaging materials hit a semiconductor device, they generate electron-hole pairs and cause a current pulse of very short duration, termed a Single Event Transient (SET). The effect of an SET may propagate to an output latch and cause a bit flip in the latch, resulting in a soft error. As the stored charge in each logic node decreases with decreasing dimensions and voltages in nanometer technology, even weak radiation can disturb the signals, which results in an increased soft error failure rate.
The SET sensitivity of a node depends on three quantities: the probability that there is a functionally sensitized path from the node to an output latch (which is the same as the fault detection probability of the node), the probability that the generated SET is propagated to the latch, and the probability that the latch captures the transitions arriving at its input [1]. If the last two probabilities are assumed to be one, the FDP of a node is an accurate measure of its SET sensitivity. In nano-domain circuits, these probabilities are very close to one due to very high operating frequencies,
P(x
In this BN, the random variable $X_6$ is independent of $X_1$, $X_2$ and $X_3$ given the states of its parent nodes, $X_4$ and $X_5$. This conditional independence can be expressed by Eq. 2.
Mathematically, this is denoted as I
In general, in a Bayesian network, given the parents of a node X, X is conditionally independent of all other nodes that are not its descendants. Using the conditional independencies in Eq. 2, we can arrive at the minimal factored representation shown in Eq. 3.
In general, if $x_i$ denotes some value of the variable $X_i$ and $pa(x_i)$ denotes some set of values for $X_i$'s parents, the minimal factored representation of the exact joint probability distribution over m random variables can be expressed as in Eq. 4.
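Under the definitions just given, the factorization referred to as Eq. 4 would take the standard chain-rule form for Bayesian networks. The rendering below is a reconstruction from those definitions, not a copy of the original typesetting:

```latex
P(x_1, \ldots, x_m) = \prod_{i=1}^{m} P\big(x_i \mid pa(x_i)\big)
```

For instance, in the six-variable network discussed above, the factor contributed by $X_6$ is $P(x_6 \mid x_4, x_5)$, since $X_4$ and $X_5$ are its parents.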
IV. LIFE-BN: FAULT/ERROR MODEL
We first discuss the basics of fault/error detection probabilities for random-pattern testability analysis. Note that the probabilistic modeling does not require any assumption on the input patterns and can be extended to biased target workload patterns.
1. Z: Primary inputs.
2. F: Nodes representing injected faults.
3. X: Internal nodes in the fault-free circuit C. Example:
4. Edges from the primary inputs to the duplicate gates in $S_f$. Note that this edge indicates that there must be at least one parent of $X_f$ that is in
5. Edges between the nodes representing the internal signals (edges from the input of a gate to the corresponding output).

Theorem: The LIFE-DAG structure corresponding to the combinational circuit C and a fault set F is a minimal I-map of the underlying dependency model and hence is a Bayesian network.
Proof: The Markov boundary of a variable v in a probabilistic framework is the minimal set of variables that makes the variable v conditionally independent of all the remaining variables in the network.
Let us order the random variables in the node set, "
With respect to this ordering, the Markov boundary of any
is given as follows. If v represents an input signal line, then its Markov boundary is the null set. If v is a fault node f in the fault set F, then also its Markov boundary is the null set (since this is treated as a primary input with a particular value). And, since the logic value of an output line is just dependent on the inputs of the corresponding gate (whether v is in 
where $Pa(X_k)$ is the set of nodes that have directed edges to $X_k$. A complete specification of the conditional probability of a two-input AND gate output will have $2^3$ entries, with 2 states for each variable. These conditional probability specifications are determined by the gate type. By specifying the appropriate conditional probability, we ensure that the spatial dependencies among sets of nodes (not limited to just pairwise dependencies) are effectively modeled.
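As an illustration of the $2^3$-entry table mentioned above, the following minimal sketch enumerates the conditional probability table of a two-input AND gate. The function name `and_gate_cpt` and the dictionary layout are assumptions made for this example; they are not taken from the tool used in this work.

```python
from itertools import product


def and_gate_cpt():
    """Return P(out | a, b) for a two-input AND gate.

    Keys are (a, b, out) triples. The gate is deterministic, so every
    entry is either 0.0 or 1.0.
    """
    cpt = {}
    for a, b, out in product((0, 1), repeat=3):
        cpt[(a, b, out)] = 1.0 if out == (a & b) else 0.0
    return cpt


cpt = and_gate_cpt()
print(len(cpt))        # 2 states for each of 3 variables
print(cpt[(1, 1, 1)])  # P(out = 1 | a = 1, b = 1)
```

Other gate types would differ only in the Boolean expression compared against `out`.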
V. BAYESIAN INFERENCE
We explore two stochastic sampling algorithms, namely Probabilistic Logic Sampling (PLS) and Evidence Pre-propagated Importance Sampling (EPIS). These methods have been proven to converge to the correct probability estimates [6], [7], without the added baggage of high space complexity.
A. Probabilistic Logic Sampling (PLS)
Probabilistic logic sampling is one of the earliest and simplest stochastic sampling algorithms proposed for Bayesian Networks [7]. Probabilities are inferred from a complete set of samples, or instantiations, generated for each node in the network.

The above scheme is efficient for predictive inference, when there is no evidence for any node, but it is not efficient for diagnostic reasoning because it must generate, and then discard, samples that do not satisfy the given evidence. It would be more efficient not to generate such samples. We discuss such a method next.
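The forward-sampling idea behind PLS can be sketched on a toy network as follows. This is an illustrative sketch only: the three-input circuit, the node names, and the `detect` comparison node are hypothetical, and the stuck line is modeled as a root node with a fixed value, in the spirit of treating a fault node as a primary input with a particular value.

```python
import random

# Each entry: (name, parents, p_one), where p_one(*parent_values)
# returns P(node = 1 | parents). Nodes are listed in topological order.
# Hypothetical toy circuit: n1 = AND(a, b), y = OR(n1, c).
# Fault: line n1 stuck-at-0, modeled by the root node n1_sa0.
NETWORK = [
    ("a", (), lambda: 0.5),
    ("b", (), lambda: 0.5),
    ("c", (), lambda: 0.5),
    ("n1_sa0", (), lambda: 0.0),  # fault node, always 0
    ("n1", ("a", "b"), lambda a, b: float(a & b)),
    ("y", ("n1", "c"), lambda n1, c: float(n1 | c)),
    ("y_f", ("n1_sa0", "c"), lambda f, c: float(f | c)),  # faulty copy
    ("detect", ("y", "y_f"), lambda y, y_f: float(y ^ y_f)),
]


def sample_once(rng):
    """Draw one instantiation of every node, parents before children."""
    values = {}
    for name, parents, p_one in NETWORK:
        p = p_one(*(values[q] for q in parents))
        values[name] = 1 if rng.random() < p else 0
    return values


def estimate_detection_probability(num_samples, seed=0):
    """Estimate P(detect = 1) as the fraction of samples that detect."""
    rng = random.Random(seed)
    hits = sum(sample_once(rng)["detect"] for _ in range(num_samples))
    return hits / num_samples


estimate = estimate_detection_probability(20000, seed=1)
print(estimate)
```

In this toy case the fault is detected only when a = b = 1 and c = 0, so the estimate should approach 1/8 as the number of samples grows.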
B. Evidence Pre-propagated Importance Sampling (EPIS)
The evidence pre-propagated importance sampling (EPIS) algorithm [5], [6] uses local message passing and stochastic sampling. This method scales well with circuit size and is proven to converge to correct estimates. It is also an anytime algorithm, since it can be stopped at any point to produce estimates. Of course, the accuracy of the estimates increases with time.
Like PLS, EPIS is based on importance sampling that generates sample instantiations of the whole DAG network, i.e., for all line states in our case. These samples are then used to form the final estimates. The difference lies in the importance function used for sampling, which for EPIS takes into account any available evidence. In a Bayesian network, the product of the conditional probability functions at all nodes forms the optimal importance function. Let $X = \{X_1, X_2, \ldots, X_m\}$ be the set of variables in a Bayesian network, $Pa(X_k)$ be the parents of $X_k$, and $E$ be the evidence set. Then, the optimal importance function is given by

$$P(X \mid E) = \prod_{k=1}^{m} P\big(x_k \mid Pa(X_k), E\big).$$

This importance function can be approximated as

$$P(X \mid E) \approx \prod_{k=1}^{m} \alpha\big(Pa(X_k)\big)\, P\big(x_k \mid Pa(X_k)\big)\, \lambda(X_k),$$

where $\alpha(Pa(X_k)) = 1 / \sum_{x_k} P(x_k \mid Pa(X_k))\, \lambda(X_k)$ is a normalizing constant dependent on $Pa(X_k)$, and $\lambda(X_k) = P(E^- \mid x_k)$, with $E = E^+ \cup E^-$, $E^+$ and $E^-$ being the evidence from parents and children, respectively, as defined by the directed link structure. Calculation of $\lambda$ is computationally expensive, and for this, Loopy Belief Propagation (LBP) [21] over the Markov blanket of the node is used.
Yuan et al. [6] proved that for a poly-tree, the local loopy belief propagation is optimal. The importance function can be further approximated by replacing small probabilities with a specific cutoff value [5] .
This EPIS stochastic sampling strategy works because, in a Bayesian Network, the product of the conditional probability functions for all nodes is the optimal importance function. Because of this optimality, the demand on samples is low. We have found that just a thousand samples are sufficient to arrive at good estimates for the ISCAS'85 benchmark circuits. The ability of EPIS to handle both diagnostic and predictive inference comes at the cost of a somewhat increased time per iteration, needed for the calculation of the λ messages. We quantify this increase in our experiments.
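To make the role of an evidence-aware importance function concrete, the sketch below performs self-normalized importance sampling on a single OR gate with evidence on its output. It is not the EPIS algorithm itself: the λ values are obtained by direct enumeration instead of loopy belief propagation, no probability cutoff is applied, and the network, priors, and function names are all assumptions chosen for illustration.

```python
import random

# Assumed toy network: A -> Y <- B with Y = OR(A, B); evidence Y = 1.
P_A = {0: 0.5, 1: 0.5}
P_B = {0: 0.5, 1: 0.5}


def p_y1(a, b):
    """P(Y = 1 | a, b) for a deterministic OR gate."""
    return 1.0 if (a | b) else 0.0


def normalize(weights):
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}


# lambda(x) = P(evidence | x), computed here by enumeration.
LAM_A = {a: sum(P_B[b] * p_y1(a, b) for b in (0, 1)) for a in (0, 1)}
LAM_B = {b: sum(P_A[a] * p_y1(a, b) for a in (0, 1)) for b in (0, 1)}

# Importance functions proportional to prior times lambda.
RHO_A = normalize({a: P_A[a] * LAM_A[a] for a in (0, 1)})
RHO_B = normalize({b: P_B[b] * LAM_B[b] for b in (0, 1)})


def posterior_a_given_y1(num_samples, seed=0):
    """Estimate P(A = 1 | Y = 1) with weighted samples drawn from RHO."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(num_samples):
        a = 1 if rng.random() < RHO_A[1] else 0
        b = 1 if rng.random() < RHO_B[1] else 0
        # Weight = joint probability with evidence / sampling probability.
        w = P_A[a] * P_B[b] * p_y1(a, b) / (RHO_A[a] * RHO_B[b])
        num += w * a
        den += w
    return num / den


posterior = posterior_a_given_y1(20000, seed=1)
print(posterior)
```

With these priors the exact posterior is 2/3. Sampling from `RHO_A` and `RHO_B` produces the evidence-violating instantiation a = b = 0 less often than sampling from the priors would, which is the effect that motivates pre-propagating evidence.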
C. Time and Space Complexity
The space requirement of the Bayesian network representation is determined by the space required to store the conditional probability tables at each node. For a node with $n_p$ parents, the size of the table is $2^{n_p+1}$. N is the number of samples, which, from our experience with the tested circuits, is on the order of thousands for circuits with tens of thousands of signals and 50,000 nodes in the fault-detecting logic.

TABLE I
c432     524    4    5
c499     758    8    9
c880     942    0   32
c1355   1574    8   32
c1908   1879    9  114
c2670   2747  117  435
c3540   3428  137  218
c5315   5350   59   78
c6288   7744   34   34
c7552   7550  131  586
VI. EXPERIMENTAL RESULTS
We demonstrate the ideas using ISCAS benchmark circuits.
The logical relationship between the inputs and the output of a gate determines the conditional probability of a child node, given the states of its parents, in the LIFE-BN. Gates with more than two inputs are reduced to two-input gates by introducing additional dummy nodes, without changing the logic structure and accuracy.
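The reduction to two-input gates described above can be sketched as a simple chain decomposition. The sketch assumes an associative gate type (AND, OR, XOR); inverting gates such as NAND would need non-inverting intermediate stages, which are not shown. The function name and the `_d<k>` naming of dummy nodes are hypothetical.

```python
def decompose_gate(output, inputs, gate="AND"):
    """Split an n-input associative gate into a chain of two-input gates.

    Returns a list of (node, gate, (in1, in2)) tuples. Intermediate
    dummy nodes are named '<output>_d<k>'.
    """
    if len(inputs) < 2:
        raise ValueError("need at least two inputs")
    gates = []
    prev = inputs[0]
    for k, nxt in enumerate(inputs[1:], start=1):
        is_last = k == len(inputs) - 1
        node = output if is_last else f"{output}_d{k}"
        gates.append((node, gate, (prev, nxt)))
        prev = node
    return gates


gates = decompose_gate("y", ["a", "b", "c", "d"])
for g in gates:
    print(g)

# Table entries needed if every node stores a full CPT over binary states.
cpt_entries = sum(2 ** (len(ins) + 1) for _, _, ins in gates)
print(cpt_entries)
```

For the four-input example, the chain uses three tables of $2^3$ entries each, compared with $2^5$ entries for a single four-input node, which is why bounding the fan-in keeps the per-node storage constant.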
The LIFE-BN based model is capable of detecting stuck-at faults as well as the soft errors caused by single-event transients. In this work, we present results for a set of stuck-at faults (hard faults), which show the efficiency of our model, in time and space requirements, compared to the BDD-based model. First, we determine the hard faults using 1024 random input vectors [4]. Faults that are not detected by these vectors are hard faults. We tabulate the hard faults in Table I for all the benchmark circuits. Accurate detection probabilities are needed for these hard faults.
We performed experiments using both PLS and EPIS inference schemes. Recall that the Probabilistic Logic Sampling (PLS) scheme is simple and time-efficient, but cannot handle diagnostic evidence efficiently. In contrast, the Evidence Pre-propagated Importance Sampling (EPIS) algorithm [6] can handle diagnostic evidence efficiently, but at the cost of increased time when evidence is present. We performed an in-house logic simulation with 500,000 random vectors to determine the exact fault detection probability of all the faults and used these probabilities to check the accuracy of our model.
The detection probabilities computed by Probabilistic Logic Sampling (PLS) [7] and Evidence Pre-propagated Importance Sampling (EPIS) [6] are shown in Table II. We partitioned the faults in circuits c3540, c6288 and c7552 into three subsets and determined the detection probabilities in each set by running the circuits for all the fault sets in parallel.
We empirically demonstrate the linear dependence of estimation time on the number of samples in Figure 3. In Table III, we compare LIFE-BN fault modeling with the performance of approaches based on the Binary Decision Diagram (BDD) model, as reported by Krieger et al. [4] for these same circuits. They reported results using four types of fault partitioning. We compare our time (column 4) with the time taken by their two best methods, namely PSG (column 2) and Supergate SG (column 3). In column 5, we report the ratio be-

Even though [4] explains the BDD-based method of FDP computation, the complete algorithm for the different types of decomposition techniques could not be obtained from that paper. Hence, we could not give a direct comparison of CPU time between the BDD-based model and the LIFE-DAG model by reimplementing their algorithm on the same computer we used for our experiments.
VII. RELATED WORK
Due to the high computational complexity involved in computing signal and fault detection probabilities, several approximation strategies have been developed in the past [12], [17], [19], [20]. The cutting algorithm [20] computes lower bounds of fault detection probabilities by propagating signal probability values. This algorithm delivers loose bounds, which may lead to unacceptable test lengths. Also, the computational complexity of this algorithm is $O(n^2)$. Lower bounds of fault detection probability were also derived from controllability and observability measures [19]. This method does not account for the component of fault detection probability due to multiple path sensitizations.

The above-mentioned methods are satisfactory only for faults that have a single sensitizing path for fault propagation to an output and hence will not give good results for highly reconvergent fan-out circuits that have multiple path sensitizations.
PREDICT [14] is a probabilistic graphical method to estimate circuit testability by computing node controllabilities and observabilities using Shannon's expansion. The time complexity of exact analysis by this method is exponential in the circuit size. PROTEST [17], which is a tool for probabilistic testability analysis, calculates fault detection probabilities and optimum input signal probabilities for random test patterns by modeling the signal flow. Fault detection probabilities, which are computed from signal probability values, are underestimated because the algorithm does not take into account multiple path sensitization. Another method (CACOP) [12] is a compromise between the full-range cutting algorithm and linear-time testability analysis, like the controllability and observability program. This is also an approximate scheme.
The method in [18] uses supergate decomposition to compute exact fault detection probabilities of large circuits. PLATO (Probabilistic Logic Analyzing Tool) [4] is a tool to compute exact fault detection probabilities using reduced ordered binary decision diagrams (ROBDDs). The space requirement for constructing the ROBDD of a large circuit is very large. Shannon decomposition and divide-and-conquer strategies are used to reduce large circuits into small sub-circuits. The computational complexity of these decomposition methods is quite high. Another BDD-based algorithm [16] computes exact random pattern detection probabilities. However, this is not scalable for large circuits.
VIII. CONCLUSIONS AND ONGOING WORK
We present a non-simulative probabilistic method for estimating fault/error detection probability for testability and soft error sensitivity analysis. Given a circuit and a fault/error set F, we model the faults by a Logic-Induced-Fault-Encoded (LIFE) DAG, which is a Bayesian Network, exactly capturing all high-order dependencies among the signals. We explored two close-to-exact inference schemes for evaluating the detection probabilities. We find that:
The estimates are almost error-free.
The LIFE-BN approach appears to be 500% more time efficient than a BDD based one.
The LIFE-BN approach handles all the benchmark circuits without the need for special case handling, unlike BDD based approaches.
The LIFE-BN approach does not rely on approximations to handle large circuits. BDD-based methods, however, rely on several decomposition techniques to reduce large circuits into smaller components. Also, the pre-processing time required for the circuit decomposition is usually not reported.
BDD based methods have been reported to have trouble with some benchmark circuits such as c6288, whereas our model can handle such circuits.
The space requirement of the LIFE-BN is $O(n)$, whereas the space requirement of the exact BDD approach is exponential in the worst case.
We use an exact probabilistic model for the detection probability of errors that can be uniformly applied to permanent stuck-at faults as well as the soft transient errors predominant in the nano-domain. Note that the probabilistic modeling does not require any assumption on the input patterns and can be extended to biased target workload patterns. Existing estimation techniques for SET [2], [3] rely on simulation and hence model the pattern dependence of SET using estimation methods that are themselves pattern-sensitive.
We are currently experimenting with Bayesian Networks to backtrack probabilistically for ATPG and exploring SET sensitivity measures that incorporate circuit delays.
