We develop a new statistical technique for estimating delay fault coverage in combinational circuits. True value simulation is performed for a sample of vector pairs chosen randomly from the test set. Transition probabilities and observabilities are estimated from the simulation data.
Introduction
Statistical methods reduce the complexity of fault analysis [8]. Whether we use statistical fault analysis [6] or statistical sampling of faults [2], all vectors must be simulated. In certain applications, the total number of vectors can be very large. We propose a new technique for estimating fault coverage in combinational circuits by fault-free simulation of a random sample of the test vector set. Preliminary results for this work were presented earlier [5, 6]. We consider delay faults [12], although the technique may also be applicable to stuck-at faults.
The main idea is to randomly select a subset of vectors (vector pairs for delay faults). A set of (M + 1) vectors contains M ordered pairs of consecutive vectors. Once a random subset of N pairs is selected, the pairs can be simulated in any order as long as the order of the two vectors within each pair is not changed. During simulation, statistics on transition probabilities are obtained and analyzed as explained in Section 2 [6]. This allows us to estimate detection probabilities for any type of delay fault. However, since the probabilities are not determined by simulating the entire vector set, they contain a sampling error.
This error is statistically estimated to find a high-confidence lower bound for detection probabilities.
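The pair selection described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `sample_vector_pairs`, the `seed` parameter, and the toy test set are assumptions made for the example, and sampling is done without replacement.

```python
import random

def sample_vector_pairs(vectors, n_pairs, seed=None):
    """Return n_pairs ordered pairs of consecutive vectors, chosen without replacement."""
    rng = random.Random(seed)
    m_pairs = len(vectors) - 1            # (M + 1) vectors give M ordered pairs
    if not 0 <= n_pairs <= m_pairs:
        raise ValueError("sample size must not exceed the number of pairs")
    starts = rng.sample(range(m_pairs), n_pairs)
    # Within each pair, the first vector always precedes the second.
    return [(vectors[i], vectors[i + 1]) for i in starts]

# Hypothetical toy test set: 10 four-bit vectors, hence 9 ordered pairs.
test_set = [format(v, "04b") for v in range(10)]
pairs = sample_vector_pairs(test_set, n_pairs=4, seed=1)
```

The selected pairs may be simulated in any order, since each pair carries its own initialization and activation vectors.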
We use the lower-bound detection probabilities to obtain a pessimistic estimate of the coverage for M vector pairs. The pessimism decreases as the sample size is increased. The factor L appears because we perform good machine simulation. The factor M does not appear in the vector sampling complexity expression because we sample a fixed number of vector pairs, which depends on our accuracy requirements and is independent of the total number of test vectors.
The factor F does not appear in the complexity expression because the method requires only fault-free simulation. The reduction in the complexity of fault coverage estimation becomes very significant in applications like BIST, where the number of vectors is typically very large.
Pomeranz and Reddy proposed a non-enumerative, non-statistical method [10] to estimate a lower bound on the path delay fault coverage. They approximately compute the number of new path delay faults detected by a test set by flagging the new circuit lines tested by the test set. This estimate is a lower bound on the actual path delay fault coverage. They introduce cut sets into the circuit to improve the accuracy of the estimate, at the expense of increased computational complexity. Heragu et al. [7] have recently improved on their method to obtain accurate path delay fault coverage without exponential computational complexity.
Statistical Fault Analysis
The vector sampling technique applies to statistical fault analysis, which determines detection probabilities of faults. Since the present discussion is for delay faults, we will require the following parameters [6]:
Cr(l) (Cf(l)), or rising (falling) controllability of a line l, is the probability of line l having a rising (falling) transition on a randomly selected vector pair.
O^L_r(l) (O^L_f(l)), or local rising (falling) observability of a line l, is the conditional probability of observing a rising (falling) transition on line l at the gate output (l being one of the gate inputs) on a randomly selected vector pair, given that l has a rising (falling) transition.
Sr(l) (Sf(l)), or rising (falling) sensitization probability of a line l, is the joint probability of line l having a rising (falling) transition and having the transition propagated to the gate output (l being one of the inputs of the gate) on a randomly selected vector pair.
We define the following parameter for each fault:
The detection probability of a fault is the probability of detecting that fault on a randomly selected vector pair. For example, if 10 out of 1000 vector pairs detect a fault, then the detection probability of that fault is 10/1000 = 0.01.
Statistical Data Collection
Tests for delay faults consist of two vectors applied consecutively, the first to set up the required initial conditions and the second to activate the fault. We simulate the circuit for pairs of vectors.
We use a thirteen-valued algebra [4] for simulation. The algebra can differentiate between robust and non-robust tests. Each of the 13 values is represented by a triplet, in which the first and last values are the signal states for two consecutive time frames. The middle value, h or nh, represents the presence or absence of a hazard. We have a set of counters for each line. The rising (falling) transition counter of line l is incremented whenever the line has a rising (falling) transition on a vector pair. The rising (falling) sensitization counter of l is incremented whenever the output of the gate with l as an input has a robustly propagated transition on it and l has a rising (falling) transition. Non-robust coverages are computed by considering non-robustly propagated transitions.
The counters add only a small overhead to fault-free simulation. After the simulation of N vector pairs, we calculate:
Cr(l) = (rising transition counter) / N
Sr(l) = (rising sensitization counter) / N
A similar calculation is done for falling transitions. The above definitions apply to both transition and path delay faults. However, observabilities must be computed separately for each type of fault.
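The counter-based estimates above can be illustrated with a small sketch. It assumes, purely for the example, that the simulator reports for each vector pair the two logic values of a line and a Boolean telling whether the transition was propagated to the gate output; the function name `estimate_rising` and this record format are hypothetical and much simpler than the thirteen-valued algebra.

```python
def estimate_rising(observations):
    """Estimate Cr(l) and Sr(l) for one line from per-pair simulation records.

    observations: list of (first_value, second_value, propagated) tuples,
    one per simulated vector pair (hypothetical record format).
    """
    n = len(observations)
    transition_counter = 0       # rising transition counter of the line
    sensitization_counter = 0    # rising sensitization counter of the line
    for first, second, propagated in observations:
        if (first, second) == (0, 1):
            transition_counter += 1
            if propagated:
                sensitization_counter += 1
    return transition_counter / n, sensitization_counter / n

records = [(0, 1, True), (0, 1, False), (1, 1, False), (1, 0, True)]
cr, sr = estimate_rising(records)
local_observability = sr / cr    # conditional probability Sr(l) / Cr(l)
```

For these four records, two pairs produce a rising transition and one of them is propagated, so Cr(l) = 0.5 and Sr(l) = 0.25.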
Detection Probabilities of Transition Faults
For transition faults, we deal with signal correlation as in STAFAN [8]. Consider the observability computation for a line l, which is an input of an AND gate with output o. Sr(l) is the joint probability of two events: (a) line l has a rising transition, and (b) its value is observable at o.
Since the probability of line l having a rising transition is Cr(l), the probability of observing the state of l at o, given that l has a rising transition, equals Sr(l) / Cr(l). At a fanout stem l with branches a and b, a transition on l can be observed through either a or b. If the paths from a and b to POs were independent, the observability of line l would be the probability of the union of the two events whose probabilities are the observabilities of a and b. If they were totally dependent, we would assume the other extreme and take the observability of line l to be the maximum of the observabilities of a and b. Experiments show that we can assume either of the two cases at fanout stems without significantly affecting the estimation of fault coverage, so we use the maximum observability. A similar analysis holds for falling transitions.
We consider both rising and falling transition faults. The rising (falling) detection probability of a transition fault on a line is the product of the rising (falling) observability and rising (falling) controllability of that line.
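The two fanout assumptions and the final product can be sketched as follows. The function names and the numeric values are hypothetical; the sketch only restates the rules given above (maximum for totally dependent branches, union for independent ones).

```python
def stem_observability(branch_observabilities, independent=False):
    """Observability of a fanout stem from the observabilities of its branches."""
    if independent:
        # Union of independent events: 1 - product of the miss probabilities.
        miss = 1.0
        for obs in branch_observabilities:
            miss *= 1.0 - obs
        return 1.0 - miss
    # Totally dependent branches: take the maximum (the choice used in the text).
    return max(branch_observabilities)

def transition_detection_probability(controllability, observability):
    """Per-pair detection probability of a rising (or falling) transition fault."""
    return controllability * observability

obs_max = stem_observability([0.5, 0.25])
obs_union = stem_observability([0.5, 0.25], independent=True)
detection = transition_detection_probability(0.2, obs_max)
```

With branch observabilities 0.5 and 0.25, the maximum rule gives 0.5 and the union rule gives 0.625, which bounds how much the choice can matter at this stem.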
Detection Probabilities of Sampled Path Delay Faults
Since the number of paths in a circuit can be very large, we devise an implicit sampling procedure and consider a random sample of all paths for fault coverage estimation. Implicit sampling here means that we do not sample from a list of all paths; instead, the random sampling process is included in the observability calculation. The sample size can be fixed depending on the desired accuracy of the estimate [2].
For every fanout branch, we have a flag that indicates whether or not the branch has been included in a path fault considered for the fault coverage computation. All flags are initially set to 1, indicating that the branches have not been part of any path fault. We also maintain an indicator, new-path, for every line. This indicates whether or not the path segment being sampled from the line to a PO is part of some path previously considered for the fault coverage computation. All new-path indicators are initially set to false. At a fanout stem, the new-path indicator of the stem is set to true if either of the following holds:
The flag of the fanout branch picked is 1, or
The new-path indicator of the fanout branch selected for observability propagation is true.
Once the observabilities are computed at the fanout stem, the flag of the fanout branch selected for stem observabilities is set to 0. The observability computation for a line l that is an input of a gate is similar to that for transition faults. The new-path indicator of l is set to true if the new-path indicator of the gate output is true.
After a backward pass, at every primary input (PI) p (or fanout branch p of a PI, if the PI fans out), we have controllabilities and observabilities of transitions from p to a PO through a randomly selected path. We consider the path fault for fault coverage computation only if the new-path indicator of p is true. The product of the rising (falling) controllability and rising (falling) observability of line p gives the rising (falling) fault detection probability of a path fault originating at p. We initially set the number of paths, F, to be sampled for computation of the fault coverage. This depends on our accuracy requirements. We make repeated backward passes over the circuit for observability calculation until the number of faults considered is equal to or greater than F. Note that at a PI, we could have determined the observability of a path previously considered for fault coverage computation. However, such cases are avoided since the new-path indicators on all lines are set to false after each backward pass. For computing the fault coverages, we consider rising and falling transitions on PIs, or on their fanout branches if they fan out.
Detection Probabilities of Longest Delay Path Faults
We propose a method to compute fault coverages with respect to a subset of all paths. We choose a minimum set of paths, such that each signal lead is included in at least one target path whose propagation delay is no less than the delay of any path containing the lead [9]. Note that a path can be the longest path through more than one line in a circuit. Hence, we need to consider the longest paths through only a few lines to cover the entire set of longest paths satisfying the above conditions. To avoid duplication of longest paths picked for fault coverage computation, we use the following result:
Longest Path Theorem: In a circuit, the minimum set of paths from inputs to outputs such that each circuit lead is included in at least one path whose propagation delay is no less than the delay of any path containing the lead consists of the longest paths through (a) all PIs and (b) all fanout branches except the ones having the highest level number (with respect to a PO) at each fanout stem.
Proof: In a combinational circuit, a path is uniquely represented by specifying the PI where it originates and the fanout branches, if any, along the path. Hence, the longest path through the lines that are not fanout branches will also be one of the following:
The longest path through some PI, or
The longest path through some fanout branch.
Consider the analysis of fanout branches. We illustrate with the circuit of Figure 1, where lines are given level numbers (in parentheses) to indicate their maximum distance from POs. Consider the analysis at fanout stem r. The longest path through fanout branch o having the highest level number is also the longest path through PIs k and a. This is because at fanout stems, the branch having the highest level number is picked for determining longest paths through PIs. Therefore, we must avoid such paths when considering longest paths through fanout branches. At all fanout stems, the longest path through the fanout branch having the highest level number has already been picked as the longest path through some PI and hence we avoid choosing the path corresponding to that fanout branch for the set of longest paths. We consider all the longest paths through other fanout branches to meet the requirement that every lead should be included in at least one selected path. Thus, the longest paths through all PIs and a selected set of fanout branches comprise the entire set of longest paths considered for coverage computation. □
Although the set of circuit nodes considered is the set of checkpoints, this theorem should not be confused with the checkpoint theorem that deals with fault equivalence of stuck-at faults by considering faults at all PIs and fanout branches [1].
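The selection stated in the Longest Path Theorem can be sketched on a small netlist. Everything concrete here is an assumption made for illustration: the circuit is a hypothetical one (not Figure 1), the dictionaries `FEEDS` and `BRANCHES` are an ad hoc netlist encoding, unit gate delays are used, and POs are assigned level 1.

```python
def dist_po(line, feeds, branches, pos):
    """Level number of a line: its maximum distance from a primary output."""
    if line in pos:
        return 1                                   # assumed base level for POs
    if line in branches:                           # fanout stem: maximum over branches
        return max(dist_po(b, feeds, branches, pos) for b in branches[line])
    return dist_po(feeds[line], feeds, branches, pos) + 1   # gate input: output level + 1

def longest_path_targets(pis, feeds, branches, pos):
    """Lines whose longest paths form the target set: all PIs, plus every
    fanout branch except one highest-level branch at each stem."""
    targets = list(pis)
    for stem, stem_branches in branches.items():
        # On ties, max() keeps the first branch, so exactly one branch is excluded.
        top = max(stem_branches, key=lambda b: dist_po(b, feeds, branches, pos))
        targets.extend(b for b in stem_branches if b != top)
    return targets

# Hypothetical circuit: s = g1(a, b); s fans out to s1, s2; t = g2(s1, c); z = g3(t, s2).
FEEDS = {"a": "s", "b": "s", "s1": "t", "c": "t", "t": "z", "s2": "z"}
BRANCHES = {"s": ["s1", "s2"]}
POS = {"z"}
targets = longest_path_targets(["a", "b", "c"], FEEDS, BRANCHES, POS)
```

In this example branch s1 has level 3 and branch s2 has level 2, so s1 is covered by the longest paths through the PIs a and b, and only s2 is added to the target list.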
The selection of paths is based on the assumption that the paths most likely to fail are those with the longest delay. We define additional parameters for each line:
O^I_r(l) (O^I_f(l)), or rising (falling) PI-observability of a line l, is the conditional probability of observing a rising (falling) transition on a PI at line l through the longest segment from a PI to line l on a randomly selected vector pair, given that the PI has a rising (falling) transition.
O^O_r(l) (O^O_f(l)), or rising (falling) PO-observability of a line l, is the conditional probability of observing a rising (falling) transition on line l at a PO through the longest segment from line l to a PO on a randomly selected vector pair, given that l has a rising (falling) transition.
Following fault-free simulation of a set of N vector pairs, we make two passes over the circuit.
Observability propagation, as accomplished by these passes, is illustrated next by an example.
In Figure 1, we make a backward pass over the circuit, where lines are given level numbers (in parentheses) to indicate their maximum distance from POs. For example, the label l(4) means that the maximum distance of line l from POs is 4. We write dist-PO(l) = 4. For a line i which is the output of a gate, we label all inputs of that gate as (dist-PO(i) + 1). For every fanout stem i with branches i_1, i_2, ..., i_k, we label line i as the maximum over 1 ≤ j ≤ k of dist-PO(i_j). In practice, the distance can be weighted according to any given delays of circuit elements. The rising (falling) observability of a line l computed during the backward pass is denoted by O^O_r(l) (O^O_f(l)). The observability computation for a line l, which is not a fanout stem, is similar to that for the transition model.
But at a fanout point, the stem assumes the observability of the branch having the highest level number (with respect to a PO). If there are two or more branches with identical level numbers, only one is selected to minimize the set of longest paths. In Figure 1, the rising PO-observability of line l with respect to a PO through the longest segment, O^O_r(l), is given by:
Since at fanout stem r, the fanout branch o has a higher weight (level number in this case) than branch p, the rising PO-observability of r is given by:
After the backward pass, the rising PO-observability of line l will be:
The backward pass is then followed by a forward pass over the circuit during which local line observabilities are multiplied. Referring to Figure 
The observability of line b is chosen over that of d for propagation due to its higher level number (with respect to a PI).
During the forward pass, for every line, we obtain the final rising (falling) observability by multiplying the rising (falling) PI-observability and PO-observability and dividing this product by the local observability of the line. From equations (1) and (2), for line l, we have the final rising observability of the longest path through it as:
This exactly equals the product of observabilities of lines on the longest path through l. Since the controllability information carried along to l is the controllability of a (a PI), the product of the rising (falling) controllability and rising (falling) observability calculated at l gives the rising (falling) detection probability of the fault corresponding to the longest path through l. After the two passes, every line will have the rising and falling path delay fault detection probabilities of the longest path through it. To compute the fault coverage, we consider rising (falling) detection probabilities of longest paths through all PIs and a selected set of fanout branches, as determined by the longest path theorem.
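A short numeric check of the combination rule may help. It assumes, as the division step suggests, that both the PI-observability and the PO-observability computed at a line already include that line's own local observability; the function name and the observability values are hypothetical.

```python
from math import isclose, prod

def final_observability(local_obs, i):
    """Observability of the whole path, computed at the i-th line on it.

    local_obs: local observabilities of the lines along one longest path,
    ordered from the PI to the PO (hypothetical values).
    """
    pi_obs = prod(local_obs[: i + 1])   # forward pass: PI up to and including line i
    po_obs = prod(local_obs[i:])        # backward pass: line i down to the PO
    # Line i is counted in both products, so divide its local observability out once.
    return pi_obs * po_obs / local_obs[i]

path = [0.5, 0.25, 0.5]
at_middle = final_observability(path, 1)
whole_path = prod(path)
```

Here both quantities equal 0.0625, and the same value is obtained at every line on the path, which is why a single pair of passes suffices to label all lines.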
Results for Statistical Analysis
Once detection probabilities are computed for any fault model, the fault coverage is obtained as in STAFAN [8]. Table 1 gives the results for transition and all path delay faults. All execution times are for a SUN 4/280 workstation. Fault simulation data are provided for comparison. The main observation made here is that the error of the statistical estimates is within 2% and the time of computation is smaller than that for fault simulators. As the circuit size, and hence the number of faults, increases, the accuracy of the statistical fault analysis will improve.
Test Vector Sampling
The complexity of statistical fault analysis can be further reduced by vector sampling without affecting its accuracy. We simulate the circuit for pairs of adjacent vectors, which are chosen randomly from the complete test vector set. Once we determine fault detection probabilities us-
The exact detection probability x_M(f) could be computed if all vector pairs were simulated. However, we must estimate x_M(f) from the data obtained by simulation of only a subset of vector pairs. In our previous work [5], we used the sample probability x_N(f) as an estimate for x_M(f). For small samples, this can lead to a large error in coverage estimation. We, therefore, estimate x_min(f), a lower bound estimate for x_M(f). The random variable n is known to have a hypergeometric density function. For large M and N, however, the probability of having n vector pairs that do not detect a given fault in the sampled set is Gaussian [2] with the following mean and variance:
Since N < M, the exact number of vector pairs, m, that detect the fault is not known. We estimate a lower bound on the value of x_M(f) from the data we gain by simulating N vector pairs. For small , we require and assuming a Gaussian density function for n, we obtain: 1 2 ? erf n?a(n)
Notice that the detection probability of a fault is the probability of its detection by a randomly selected vector pair. We can define two coverage estimates. The first is the sample coverage FC [5] and the second is a lower bound on the statistical coverage, FC_min. FC_min is not necessarily a lower bound on the exact fault coverage. In order to determine FC, the detection probability of a fault f is taken as x_N(f). The probability of detection by the entire test set having M vector pairs is then:
In general, X(f) contains a biased error and its unbiased value is given by [8]:
where W(x N (f )) = 1 + N ?1 To compute the sample fault coverage FC, we take the average of the detection probabilities of all faults:
where X(f_i) is given by Equation 4. Figure 3 shows this coverage for transition faults for the s1494 circuit. As shown in Table 1, the statistical fault analysis coverage for 3,852 vectors is 98.1%. The set of 3,852 vectors has 3,851 ordered pairs. We generate 50 trials, each consisting of 600 vector pairs randomly taken from these ordered pairs. The frequency distribution of sample coverages computed from Equation 5 is shown in Figure 3. We notice that the sample coverage varies between 97.3% and 98.7%. By increasing the sample size, the spread of the estimate can be reduced. However, the error will be two-sided. In order to find a more robust estimate, which will be highly unlikely to exceed the statistical fault analysis coverage, we derive the second type of coverage, FC_min, which is a lower bound estimate.
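The averaging step behind the sample coverage can be sketched as follows. The sketch assumes that vector pairs detect a fault independently, so that the probability of detection by M pairs is 1 - (1 - x)^M; it deliberately omits the bias correction factor W, so it is an uncorrected approximation rather than the exact expression used for the reported results. The function names and input values are hypothetical.

```python
def cumulative_detection(x, m_pairs):
    """Probability that at least one of m_pairs vector pairs detects the fault,
    assuming independent detection with per-pair probability x (no bias correction)."""
    return 1.0 - (1.0 - x) ** m_pairs

def sample_coverage(per_pair_probabilities, m_pairs):
    """Average cumulative detection probability over all faults."""
    total = sum(cumulative_detection(x, m_pairs) for x in per_pair_probabilities)
    return total / len(per_pair_probabilities)

# Two hypothetical faults: one detected by half of the pairs, one never detected.
fc = sample_coverage([0.5, 0.0], m_pairs=2)
```

Substituting the lower-bound probabilities x_min(f) for the sample probabilities in the same average would give the corresponding pessimistic estimate.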
Equation 3 gives a lower bound on the detection probabilities per vector pair. The corresponding lower bound on the probability of detection by the complete vector set is:
where W(x min (f )) = 1 + N ?1 6 2 x min (f )
1?x min (f ) . The cumulative fault coverage, FC min , for F faults is computed as the average of the detection probabilities of all faults. We have:
two existing fault simulators [3, 11]. All execution times are for a SUN 4/280 workstation. Although the tables show non-robust coverages for the fault models considered, our method can also be used to find robust fault coverages by considering robustly propagated transitions for statistical estimation. Table 2 gives the non-robust coverage of transition faults, for deterministic vectors, as estimated by the vector sampling method for the same examples as used in Table 1. A sample size of 600 was used in every case. While the sample coverage FC is quite close to the statistical coverage, as explained earlier, it can exceed that coverage. The lower bound FC_min, however, is always pessimistic.
Savings in CPU time by vector sampling are significant over the statistical method. Savings over fault simulation are even greater. Table 3 shows non-robust coverage of transition faults for 30,000 random vectors. A sample size of 2000 was used in every case. Although there is a variation in the test set size for circuits in Tables 1 and 2, the number of vector pairs to be sampled can be constant for large test sets irrespective of the total number of vectors. The variation in the sample size was due to the relatively small size of the test set. Tables 2 and 3 show non-robust coverages for path faults for deterministic and random vectors, respectively. We fixed the number of backward passes following the true-value simulation at five (see Section 2.3). In practice, the number of passes can be chosen dynamically depending on the number of faults sampled in each pass and the total number of faults we wish to sample. Results for the circuits in Table 3 were not available from the fault simulator [3]. Notice that the time taken by vector sampling and statistical analysis for path faults is only slightly higher than that taken for transition faults since the complexity of these methods is linear for any fault model. Tables 4 and 5 show non-robust coverages for the longest path faults for the deterministic and random vectors, respectively. We have not given a comparison with fault simulation for longest paths due to the unavailability of a longest path fault simulator. Based on our results for the transition and all path delay faults, we expect the coverage estimates for longest paths to be good.
We estimated the transition fault coverage with different sample sizes for the circuit s38417 for a test set of 30,000 random vectors. Results are given in Figure 4.
In addition to determining fault coverages, the vector sampling estimator can also be used for fault dropping by a test pattern generator. After the simulation of a sample of vector pairs, we determine the fault coverage with respect to the entire test set. We grade all faults and order them in terms of decreasing detection probabilities. As an illustration, we analyzed the s382 circuit for transition fault coverage using 1408 vectors. The total number of faults considered was 666. We graded these faults and ordered them in terms of decreasing detection probabilities. Our estimator reported the fault coverage as 98.7%, and hence we took the first 666 × 0.987 ≈ 657 faults in the graded list as detected and 9 as undetected. The actual fault simulator reported 8 faults as undetected. Of the 9 faults that were considered as undetected by our estimator, 7 were actually undetected. One fault, which was actually undetected, was reported as detected by the estimator.
This is a very small error in comparison to the total number of faults. In this case, if the faults are graded in terms of decreasing detection probabilities, we can drop the rst 657 faults (having the highest detection probabilities) from the fault list. Thus, one fault will be incorrectly dropped as it is actually not detected.
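The grading and dropping step can be sketched as below. The function name `grade_and_split` and the synthetic detection probabilities are assumptions for illustration; only the fault count (666) and the estimated coverage (98.7%) are taken from the s382 example.

```python
def grade_and_split(detection_prob, coverage):
    """Order faults by decreasing detection probability and split the list
    at the estimated coverage."""
    graded = sorted(detection_prob, key=detection_prob.get, reverse=True)
    n_detected = round(len(graded) * coverage)
    return graded[:n_detected], graded[n_detected:]

# Synthetic probabilities for 666 hypothetical faults named f0 .. f665.
probabilities = {f"f{i}": i / 666 for i in range(666)}
dropped, kept = grade_and_split(probabilities, coverage=0.987)
```

With 666 faults and an estimated coverage of 0.987, the split point is 657, so 657 faults are dropped and the 9 faults with the lowest detection probabilities remain as targets for the test pattern generator.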
Application to BIST
The sampling technique will be very useful for applications with a large number of test vectors. We consider a typical application [11] which makes use of a pattern generator to generate a large number of patterns for built-in self-test (BIST) for delay faults. The typical number of test patterns for the BIST technique may run into millions, but we will only consider two small circuits with a moderate number of vectors for illustration. Table 6 shows FC, the sample coverage, and FC_min, the lower bound, as estimated by vector sampling. Coverages from a transition delay fault simulator [11] are also reported. Note that we obtained good results with a sample size of 5000 in both cases, although the size of the entire set was much larger and different for the two cases. The sample size can be kept constant for large test sets, as the accuracy of estimation depends on the sample size and not on the size of the test set.
Conclusion
The new technique of estimating fault coverages in digital circuits by fault-free simulation of a random sample of test vector pairs is a practical method. The robustness criteria are embodied in the algebra used for good machine simulation and in the criteria used to increment the sensitization counters for each gate. By changing the algebra and sensitization criteria, we can perhaps extend the method to non-robust delay testing. A large speedup can be obtained for applications like BIST where large test vector sets are involved. We are currently investigating methods to extend our technique to sequential circuits.
Fig. 3. Distribution of sample transition fault coverage for s1494 (sample size 600) - 50 trials.
Table 2. Non-robust coverage of transition and path delay faults - Deterministic Vectors.
Table 3. Non-robust coverage of transition and path delay faults - Random Vectors.
Table 4. Non-robust coverage of longest path delay faults - Deterministic Vectors.
Table 5. Non-robust coverage of longest path delay faults - Random Vectors.
Figure 2: Forward pass over the circuit.
