When using Built-In Self Test (BIST) for testing VLSI circuits, a major concern is the generation of proper test patterns that detect the faults of interest. Usually a linear feedback shift register (LFSR) is used to generate test patterns.
1 Introduction
Built-In Self-Test is the capability of a circuit to test itself. The idea behind BIST is to create pattern generators (PGs) that generate test patterns for the circuit and response analyzers (RAs) that compact the circuit's responses to the applied test patterns. The PGs and RAs are usually implemented from existing registers; some registers are used as both a PG and an RA.
In this paper we deal with the design of efficient and effective pattern generators based on linear feedback shift registers (LFSRs). Their effectiveness is measured in terms of generating test patterns for all the faults of interest, and their efficiency in terms of minimum test length (time). Both ends are accomplished by proper selection of the feedback polynomial (configuration) and the initial seed (state) of the LFSR. In a complementary work [13] we show how to select feedback polynomials to achieve effective (zero-aliasing) compaction. When a register is to function both as a PG and an RA, selecting the feedback polynomial that achieves the best PG results from a set of zero-aliasing polynomials achieves both ends simultaneously.
An n-stage LFSR with a primitive feedback polynomial generates a permutation of all the non-zero binary n-tuples. Changing the polynomial changes the permutation. The seed specifies the starting position for scanning the permutation. To achieve 100% detection of the faults of interest the LFSR must generate patterns until the set of patterns contains at least one test pattern for every fault. The problem is that the permutation might not contain a relatively short subsequence of test patterns for all the faults, and even if it does, it is usually not known where this short subsequence begins.
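The full-period property is easy to check in simulation. The following minimal sketch (not from the paper; the function name and the example polynomial $x^4 + x + 1$ are only illustrative) steps an LFSR by multiplying its state by x modulo the feedback polynomial and enumerates the whole non-zero cycle.

```python
def lfsr_states(poly, n, seed=1):
    """Yield the 2**n - 1 states of an n-stage LFSR with primitive feedback
    polynomial `poly` (bit i of `poly` is the coefficient of x**i, including
    the x**n term), starting from the non-zero state `seed`."""
    assert seed != 0
    state = seed
    for _ in range(2**n - 1):
        yield state
        state <<= 1              # one shift = multiplication by x
        if state >> n:           # the bit shifted out of the last cell ...
            state ^= poly        # ... feeds back h(x) in place of x**n

# For a primitive polynomial every non-zero 4-tuple appears exactly once:
states = list(lfsr_states(0b10011, 4))   # f(x) = x^4 + x + 1
assert len(set(states)) == 2**4 - 1
```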
There are two approaches to dealing with the problem of generating all the required test patterns. The first is to estimate the number of test patterns required to achieve 100% fault coverage (fc). The second is to add hardware to guarantee 100% detection with a small test set. While the first approach keeps the hardware at a minimum, it significantly adds to the required test time. The second approach results in a very short test time, but requires significant hardware overhead. We first expand on the above approaches and then discuss our ideas for the design of LFSR-based PGs.
To estimate the number of patterns required to achieve 100% fc, a number of authors [14], [18], [22] asked how many test vectors need be applied to achieve a given fc with a specified degree of confidence. The answers depend on the detectability profile of the circuit, i.e., the number of test patterns for each fault. For example, to achieve 100% fc when the probability of detection of the hardest fault is p, Savir and Bardell [18] suggest using a test sequence of length 11/p. The main motivation behind this question is that it eliminates the need for test generation and for fault simulation. With the advances in test generation and fault simulation tools (not to mention platform speed), we believe this is not as important as it used to be. On the other hand, shorter test sequences for combinational subcircuits allow for shorter test sessions for the whole circuit and for easier synthesis of zero-aliasing response analyzers, as we show in [13].
As opposed to trying to detect all the faults of interest with one pseudo-random sequence, a second approach is to either use special hardware to generate small test sets, or to use a short pseudo-random sequence to detect most of the faults and then use special hardware to detect the remaining faults. The use of non-linear feedback functions is suggested in [6] and [7]. A related approach is that of weighted random patterns [23], [3], [15]. A store-and-generate approach is suggested in [2]. A different implementation of this idea is suggested in [1], although the authors note that it is very unlikely that their scheme will work for realistic test sets. Related schemes that use special linear functions to generate tests for random resistant faults are discussed in [11], [8] and [21]. A different type of ROM-based scheme is suggested in [9].
All the above PG schemes cause additional area and delay overhead compared to an LFSR-based PG, although they reduce the size of the test set, in some cases considerably. Another drawback of some of these schemes is that changing the feedback function of the LFSR makes the compaction properties of the resulting circuits extremely difficult to analyze.
In this paper we try to find short one-seed pseudo-random test sequences that achieve 100% fc. The term short is relative to the probability of detecting the hardest fault of a circuit. If this probability is p, then short will mean a test length of L = 1/p. By using just one seed the BIST control circuitry is kept at a minimum. We first show that a pseudo-random test sequence of length at most 2L has a high probability of achieving 100% fc. Thus, a random process of selecting a random primitive polynomial and a random seed for an LFSR-based PG is likely to produce the desired test length. By simulating up to 2L test patterns, one knows whether the desired sequence has been found. If not, another random selection is made. This scheme is best suited for randomly testable circuits, i.e., those circuits with high values of p. For random resistant circuits we suggest a more sophisticated way of selecting the feedback polynomials and seeds that produces shorter test sequences and requires less computation time. We use the theory of discrete logarithms to embed a subset of test patterns in an LFSR sequence, from which we produce the test sequence for all faults. The applicability of these schemes depends on the computational effort one is willing to expend and on the time a test sequence is allowed to run.
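As a rough illustration of the random-selection scheme just described (a sketch, not the paper's code; covers_all_faults is a hypothetical hook standing for the fault-simulation step, which the paper performs with ATALANTA):

```python
import random

def random_selection_pg(primitive_polys, n, L, covers_all_faults):
    """Repeatedly pick a random primitive polynomial and non-zero seed and
    fault-simulate up to 2L patterns; stop when 100% fc is reached.
    covers_all_faults(poly, seed, limit) is an assumed hook returning the
    shortest prefix length (<= limit) that detects all faults, or None."""
    while True:
        poly = random.choice(primitive_polys)
        seed = random.randrange(1, 2**n)          # any non-zero n-tuple
        length = covers_all_faults(poly, seed, 2 * L)
        if length is not None:
            return poly, seed, length             # a sequence of length <= 2L
```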
The tests we embed can be either one-pattern tests for non-sequential faults or, using schemes such as the one in [20], two-pattern tests for sequential faults.
Since the pattern sequences of LFSRs and of one-dimensional Cellular Automata (CA) with the same primitive characteristic function are isomorphic [19], our algorithms also work for CAs.
The rest of this paper is organized as follows. Throughout the paper, n denotes the number of inputs to the circuit under test (CUT). In Section 2 we analyze the probability of detecting a fault with $2^{k+i}$ test patterns, given a test sequence of length $L = 2^{n-k}$, and we give lower and upper bounds on the probability of detecting all the faults of interest. In Section 3 we assume a detectability profile model and derive the probability bound values for 100% fc for circuits that abide by this model. In Section 4 we find the actual detectability profile for some example circuits, derive the probability of drawing short test sequences and conduct random experiments which validate our analytic results. These sections provide the basis for our claim that short test sequences can be found within acceptable time constraints. We then proceed to introduce our procedures for selecting primitive feedback polynomials and seeds for the LFSRs. In Section 5 we introduce the notion of discrete logarithms and show its relation to the sequencing of patterns by an LFSR. In Section 6 we state the test embedding problem and propose our solution. Section 7 defines and characterizes faults we classify as hard faults, which are of major importance to our algorithm. In Section 8 we present experimental results. We conclude with Section 9.
2 The probability of 100% fault coverage with a random test sequence of length L

Consider a combinational circuit with n inputs. In this section we address the following two questions. Combining Equations (3) and (4) under the assumption $t = wK < L$ results in bounds that become tighter as K increases.
We defined the sequence length to be $L = N/K$. If the sequence length is doubled, the probability of missing a fault with t test patterns is squared; by Equation (2), $p_{nd}(t, 2L) < (p_{nd}(t, L))^2$. Similarly, if the test sequence length is halved (L/2), the probability is the square root of its value for a sequence of length L.
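A sketch of the reasoning (using the approximation that the L patterns are drawn independently and uniformly from the $N = 2^n - 1$ non-zero patterns; with draws without replacement the inequality is strict, as stated above):
$$p_{nd}(t, L) \approx \Bigl(1 - \tfrac{t}{N}\Bigr)^{L}
\quad\Longrightarrow\quad
p_{nd}(t, 2L) \approx \Bigl(1 - \tfrac{t}{N}\Bigr)^{2L} = \bigl(p_{nd}(t, L)\bigr)^{2},
\qquad
p_{nd}(t, L/2) \approx \bigl(p_{nd}(t, L)\bigr)^{1/2}.$$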
Using the values of $p_{nd}(2^{k+i}, L)$, we would like to bound from above and below the probability that a sequence of length L detects all the faults of interest. We do this by taking, for each value of t, the closest power of 2 greater than or equal to t and the closest power of 2 less than or equal to t. By considering the power of 2 greater than t we derive the upper bounds, and by considering the power of 2 less than t we derive the lower bounds.
Let $F = \{f_1, f_2, \ldots, f_s\}$ be the set of all the faults of interest in the CUT. Let $t_i$ be the number of tests for fault $f_i$ and let $k_i = \lceil \log t_i \rceil$. Let $k_{min} = \min_i \{k_i\}$ and $k_{max} = \max_i \{k_i\}$. Group the faults into subgroups $C_{k_{min}}, C_{k_{min}+1}, \ldots, C_{k_{max}}$, where $f_i \in C_j$ iff $k_i = j$.
Let $k = k_{min}$. We are interested in finding the probability that a random sequence of length L detects all the faults of F.
Let $p_j$ be the probability that a fault in $C_j$ is not detected by the sequence. Since a fault in $C_j$ has between $2^{j-1}$ and $2^j$ test patterns, $p_j$ is bounded by $p_{nd}(2^j, L) \le p_j \le p_{nd}(2^{j-1}, L)$, and the probability q that the sequence detects all the faults of F is bounded by
$$\prod_{j=k}^{k_{max}} \bigl(1 - p_{nd}(2^{j-1}, L)\bigr)^{|C_j|} \;\le\; q \;\le\; \prod_{j=k}^{k_{max}} \bigl(1 - p_{nd}(2^{j}, L)\bigr)^{|C_j|}. \qquad (5)$$
The actual value of q will be closer to the upper (lower) bound when, for most of the harder faults, the number of test patterns is closer to the power of 2 from above (below).
The super-exponential decrease in the probability of not detecting a fault, as we move from $C_j$ to $C_{j+1}$, allows us to consider only the first few subgroups of faults when estimating the probability q. This is an analytic explanation of a similar observation made in [18]. To see this, consider the following question: how many faults, each with $2^{j+1}$ test patterns, are needed so that the product of their detection probabilities equals the detection probability of a single fault with $2^j$ test patterns? For $j = k+4$ the answer is $8886114$ ($> e^{2^4}$). This means that when computing the bounds on q, the significance of one fault with $2^{k+4}$ test patterns to the probability of success is greater than the contribution of $e^{16}$ faults with $2^{k+5}$ test patterns. Equivalently, the probability of detecting a single fault with $2^{k+4}$ test patterns is less than that of detecting $e^{16}$ faults with $2^{k+5}$ test patterns. Thus, the faults in the subgroups $C_{k+5}, \ldots, C_{k_{max}}$ and $F_{d+5}, \ldots, F_{d_{max}}$ do not have to be taken into account when analyzing the probability of success.
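The super-exponential effect can be made concrete with the same independent-drawing approximation (a sketch, not the paper's derivation): with $L = 2^{n-k}$ and $N = 2^n - 1 \approx 2^n$,
$$p_{nd}(2^{k+i}, L) \approx \Bigl(1 - \tfrac{2^{k+i}}{2^{n}}\Bigr)^{2^{n-k}} \approx e^{-2^{i}},$$
so each move from $C_{k+i}$ to $C_{k+i+1}$ squares the miss probability, and roughly $e^{2^{i}}$ faults with $2^{k+i+1}$ test patterns are needed to contribute as much to the failure probability as a single fault with $2^{k+i}$ test patterns ($e^{2^4} = e^{16} \approx 8.9 \times 10^{6}$ for $i = 4$, matching the figure above).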
3 Detectability profile models
The actual value of the probability q depends on the distribution of faults in the different subgroups. To get an idea of this value as a function of the number of faults in $C_k$ and $F_d$, we assume a detectability profile model placing an emphasis on the first few subgroups. This model lets us approximate bounds on the probability q without considering the full detectability profile, but only the number of faults in $F_d$ and $C_k$. We later use the profiles of actual circuits, and the results demonstrate that our model is very pessimistic, i.e., we should expect better results from actual circuit distributions than from the model approximations. Let $u = |F_d|$ and $v = |C_k|$. The model sets
$$|C_{k+1}| = e\,v, \qquad |F_{d+2}| = e^{1/2}\,|F_{d+1}|, \qquad |C_{k+2}| = e\,|C_{k+1}|,$$
$$|F_{d+j+1}| = e\,|F_{d+j}|, \quad |C_{k+j+1}| = e\,|C_{k+j}| \qquad \text{for } 2 \le j \le 4.$$
With this distribution, the product of the probabilities of detecting all the faults in $F_{d+2+i}$ ($0 \le i \le 3$) is (much) greater than the probability of detecting one fault in $F_{d+1+i}$. The probability of detecting all the faults in $F_{d+1}$ is greater than the probability of detecting all the faults in $F_d$.
The same applies to the faults in the subgroups $\{C_j\}$. Thus, we can approximate the bounds on the probability of detecting all the faults (Equation (6)). The upper bound on the probability that a random test sequence of length 2L achieves 100% fc is better than 1/50. This result, of course, depends on our choice of values for u and v. Independent of these values is the effect that doubling the test length has on the probability of 100% fc; this effect is a result of the exponential decrease in the probability of not detecting a fault. While this detectability profile seems artificial, experiments on actual circuits show it to be very pessimistic, i.e., actual profiles give rise to detection probabilities that are better than the model's bounds.

4 Experimental results on finding short pseudo-random test sequences
We synthesized 13 circuits from the Berkeley benchmarks [4] as multilevel circuits. For each circuit we found its k value. Using a modified version of the ATALANTA [10] test generation system, we generated a list of all the non-equivalent faults of the circuit and proceeded to generate all possible test patterns for each fault. If the pattern count exceeded $2^{k+5}$, we discarded the fault and stopped the test generation procedure for that fault. Otherwise, we recorded the number of test patterns for the fault. Having iterated through all the faults, we found the respective sizes of the subgroups $\{F_j\}$ and $\{C_j\}$. We then used Equation (5) to compute upper and lower bounds on the probability of finding a test sequence of length $L = 2^{n-k}$ which detects all the faults. These results are presented in Table 1. The first row for each circuit gives the number of faults in the subgroups $F_{k-1}$ through $F_{k+5}$ and the second row gives the number of faults in the subgroups $C_{k-1}$ through $C_{k+5}$. The column labeled q in each row is the bound derived from that row using Equation (5): the first row gives the lower bound on q and the second the upper bound. The column labeled lin. bnd. gives the bounds derived using the linear detectability profile model (Equation (6)). The values of u and v, the number of faults in $F_{k-1}$ and $C_k$, respectively, are taken from their entries in the table. When u = 0, as is the case for the first eight circuits, we cannot calculate the lower bound, hence the entry is left blank. Notice that only for circuit in5, where the number of faults in $C_k$ was only 1, did the model give a bound that was higher than the one given by Equation (5). We also computed the bounds on the probability that a sequence of length 2L will detect all the faults; the results are in Table 2.
Having computed the probability bounds, we conducted 100 experiments (for circuit chkn only 50 experiments were conducted) of random selections of polynomials and seeds to produce 100 test sequences. We ran each sequence for at most 2L patterns (for circuit chkn at most 1.2L), stopping whenever 100% detection was achieved. We recorded the number of random sequences of length at most L and of length between L and 2L that detected all the faults. The results are in Table 3. The first column shows the expected number of sequences of length at most L that detect all faults; these numbers are based on the probability values from Table 1. The second column shows the actual number of such sequences. Column three shows the number of sequences of length greater than L and at most 2L that detected all faults. Column four gives the total number of sequences of length at most 2L that detected all faults, and column five gives the expected number of such sequences, based on the probability values from Table 2. For 19 of 24 cases (omitting circuit chkn), we have both expected and actual results. For 9 of the 19 cases, the actual results were in the expected range. Omitting the 5 cases in which both expected and actual results were 0, of the remaining 4 cases, in 3 the actual result was closer to the higher expected value and in 1 it was closer to the lower expected value. Of the 10 cases in which the actual value was not in the expected range, in 8 the actual results were higher than the high end of the expected range.
Our first conclusion, based on the empirical results, is that the probability bounds given by Equation (5) are fairly accurate. The fact that the actual results were usually at the high end of the expected range can be explained by the facts that (1) we used pessimistic assumptions to derive Equation (5), and (2) certain correlations between the detectabilities of some faults may exist but were not considered.
Our second conclusion from both the analytic and empirical results is that the bounds given by the linear distribution model are overly pessimistic and Equation (5) will give more optimistic results.
Given that the actual results tended to be in the high end of the expected range, we conclude that with only a few random selections sequences of length at most 2L can be found that produce 100% fc.
5 Discrete logarithms and LFSR sequences
Up to this point we have shown that the probability that a pseudo-random test sequence of length at most 2L achieves 100% fc is typically greater than 1/50. In the sequel we show a more sophisticated way of selecting the primitive feedback polynomial and seed of the LFSR-based PG which finds shorter sequences in less or comparable time. Given a specific feedback polynomial, we guide the search for the optimal seed using the theory of discrete logarithms.
When initialized to a non-zero state, an n-stage LFSR with a primitive feedback polynomial cycles through all $2^n - 1$ non-zero binary n-tuples. When the feedback polynomial is changed from one primitive polynomial to another, the order in which the patterns appear changes. Hence, an n-stage LFSR with a primitive feedback polynomial defines a permutation over the non-zero binary n-tuples, with each polynomial corresponding to a different permutation.
Given a subset of binary n-tuples, the minimum subsequence of an LFSR needed to cover all the tuples of the subset varies depending on the permutation defined by the feedback polynomial. If the position of the tuples in the permutation were known, it would be straightforward to find the minimum covering subsequence. The position of a tuple in the sequence can be obtained from the theory of discrete logarithms.
Let $f(x) = \sum_{i=0}^{n} f_i x^i = x^n + h(x)$ be a primitive polynomial of degree n over GF[2] and let $\alpha$ be a root of f. The non-zero elements of GF[$2^n$] can all be expressed as distinct powers of $\alpha$, i.e.,
$$\forall\, \beta \ne 0 \in GF[2^n],\ \exists\, 0 \le j \le 2^n - 2 \ \text{s.t.}\ \beta = \alpha^j. \qquad (7)$$
Since $\alpha$ is a root of f we have $\alpha^n = h(\alpha)$, hence every $\beta$ can be represented as a unique non-zero polynomial in $\alpha$ of degree less than n.
If $\beta$ is the j-th power of $\alpha$, j is said to be the discrete logarithm of $\beta$ to the base $\alpha$. The discrete logarithm problem can be stated as follows: given $\alpha$ (equivalently, given f) and $\beta$, find j.
Discrete logarithms relate to the order in which patterns are generated by an LFSR as follows. Consider an LFSR with f as its feedback polynomial, as in Figure 1. We number the cells $D_0, D_1, \ldots, D_{n-1}$, with the feedback value coming out of cell $D_{n-1}$ and feeding $D_i$ iff $f_i = 1$. Consider the following mapping between non-zero patterns of the register and non-zero polynomials in $\alpha$ of degree less than n: a "1" in cell $D_i$ represents $\alpha^i$. Since every non-zero element of GF[$2^n$] corresponds to a unique polynomial in $\alpha$ of degree less than n, the non-zero elements of GF[$2^n$] correspond to the patterns of the register. Assume the initial state has a "1" in the least significant cell and 0 in all other cells. A shift moves the "1" to the second cell, corresponding to multiplication by $\alpha$. After the n-th shift, the "1" is fed back, corresponding to substituting $h(\alpha)$ for $\alpha^n$. Thus, the j-th pattern, interpreted as a polynomial in $\alpha$, represents the element $\alpha^j$. If we want to know when a pattern will appear, we need to find the discrete logarithm of the element corresponding to the pattern.
Example 1: Assume we want to know when the pattern 0011 will appear in a 4-stage LFSR with feedback polynomial $f(x) = x^4 + x + 1$. Written with cell $D_0$ first, the pattern 0011 corresponds to the element $\alpha^2 + \alpha^3$, and $\alpha^2 + \alpha^3 = \alpha^6$, so its discrete logarithm is 6. Thus, if we seed the LFSR with the pattern 1000 and let it run for 6 cycles we will get the pattern 0011.
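The correspondence can be checked by brute force for small n (an illustrative sketch only; for the field sizes of interest the paper uses the Pohlig-Hellman algorithm rather than exhaustive stepping). Patterns are encoded with bit i holding the contents of cell $D_i$:

```python
def discrete_log_bruteforce(poly, n, pattern):
    """Return j such that the LFSR state equals `pattern` j cycles after the
    seed 100...0 (i.e. alpha**0).  `poly` encodes f(x) as a bit mask and
    `pattern` encodes the target state, bit i <-> a "1" in cell D_i."""
    state = 1                      # the pattern 1000 represents alpha**0 = 1
    for j in range(2**n - 1):
        if state == pattern:
            return j
        state <<= 1                # one shift = multiplication by alpha
        if state >> n:
            state ^= poly          # substitute h(alpha) for alpha**n
    raise ValueError("pattern is not a non-zero n-tuple")

# Example 1: with f(x) = x^4 + x + 1, the pattern 0011 (cells D_2 and D_3 set,
# i.e. alpha^2 + alpha^3) appears 6 cycles after the seed 1000.
assert discrete_log_bruteforce(0b10011, 4, 0b1100) == 6
```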
We considered two algorithms for solving the discrete logarithm problem. The first is due to Pohlig and Hellman [17] and the second is due to Coppersmith [5]. An excellent exposition of these two algorithms and of the discrete logarithm problem can be found in [16].
To avoid a lengthy discussion of the exact computational complexity of these two algorithms we refer the reader to the above references. We just state the issues which influenced our decision to prefer the Pohlig-Hellman algorithm over Coppersmith's. While Coppersmith's algorithm has an asymptotic run time which is better than the Pohlig-Hellman algorithm's, it is also much more complex. The computational complexity of the Pohlig-Hellman algorithm depends on the size and multiplicity of the prime factors of $2^n - 1$ (the number of non-zero elements in the field of computation). When the number of circuit inputs is less than 64, which is true for the cases we are targeting, the prime factors of $2^n - 1$ are small enough (except for n = 62, 61, 59, 49, 41, 37, 31) for the Pohlig-Hellman algorithm to run faster than Coppersmith's.

6 The test embedding problem
In this section we define and present an algorithm to solve the test embedding problem. We analyze the computational complexity of our algorithm and discuss a strong limitation. We then present a way to overcome this limitation.
The test embedding problem is defined as follows. Given (1) a set of faults $F = \{f_1, \ldots, f_s\}$, (2) a set of test patterns $T = \bigcup_{i=1}^{s} T_i$, with $T_i = \{t_{i,1}, \ldots, t_{i,n_i}\}$ being the set of all tests for fault $f_i$ and $t_{i,j}$ being a binary n-tuple, and (3) a primitive polynomial p of degree n, find a minimum length subsequence of patterns generated by the corresponding LFSR that includes at least one test pattern for each $f_i$ in F.
When given a set $P = \{p_1(x), \ldots, p_u(x)\}$ of primitive polynomials, we would like to select the polynomial that generates the shortest such subsequence.
Having found the polynomial, the initial state of the LFSR and the sequence length, we have embedded a test set for F in the sequence that will be generated by the LFSR. The resulting test sequence is referred to as the embedded (test) sequence. The faults in F are referred to as embedded faults and the test patterns of T are the embedded test patterns.
To solve this problem we propose a two-stage procedure, referred to as the embedding procedure (EP). The first stage is referred to as the discrete logarithm stage and the second as the windowing stage.
In the discrete logarithm stage we first find the logarithm of every test pattern $t_{i,j}$ with respect to a root of the primitive polynomial p. We then sort the logarithms (i.e., sort the test patterns in order of appearance) and for each logarithm we create a list of the faults detected by the corresponding pattern.
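A minimal sketch of this stage (names are illustrative; dlog stands for the Pohlig-Hellman computation described in Section 5):

```python
def discrete_log_stage(test_sets, dlog):
    """test_sets: dict mapping each fault to its set of test patterns.
    dlog(pattern): assumed discrete-logarithm routine for the chosen
    primitive polynomial.  Returns the sorted logarithms and, for each
    logarithm, the list of faults detected by the corresponding pattern."""
    detected_by = {}                                  # logarithm -> fault list
    for fault, patterns in test_sets.items():
        for pat in patterns:
            detected_by.setdefault(dlog(pat), []).append(fault)
    log_table = sorted(detected_by)                   # order of appearance
    fault_lists = [detected_by[lg] for lg in log_table]
    return log_table, fault_lists
```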
In the windowing stage the idea is to slide a window over the cycle of logarithms, identifying the windows that detect all the faults and selecting the smallest one (or one of the windows of shortest length when there is more than one). The pattern corresponding to the first logarithm of the chosen window is the initial seed for the LFSR. The outline of the procedure for this stage is given in Figure 2.
The array covered[ ] keeps track of the number of patterns in the window that detect each fault. The counter not_covered keeps count of the number of faults that are not detected by the patterns in the window. The array log_table[ ] stores the ordered logarithms, while the array pattern_table[ ] stores the patterns corresponding to the ordered logarithms. The procedure begins with the window containing the first logarithm. The window is then extended until all the faults are detected. The size of the window is recorded, compared with the previous best, and the smaller of the two is kept. At the end of each iteration the tail is advanced. In the following iteration the head is adjusted, if necessary, so that the window again detects all the faults. The procedure terminates when all the logarithms have been considered as tails.
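Figure 2 is not reproduced in this text; the sketch below shows one possible realization of the windowing stage under the assumptions above (faults numbered 0..|F|-1, log_table and fault_lists as produced by the discrete logarithm stage):

```python
def windowing_stage(log_table, fault_lists, num_faults, cycle_len):
    """Return (best_tail_log, best_length): the seed position and test length
    of a smallest subsequence that detects all embedded faults.
    cycle_len is 2**n - 1, the length of the LFSR cycle."""
    m = len(log_table)
    covered = [0] * num_faults          # patterns in the window detecting each fault
    not_covered = num_faults            # faults with covered[] == 0
    best_len, best_tail = None, None
    head = 0                            # window is [tail, head), indices mod m

    def window_len(tail, head_idx):
        return (log_table[head_idx % m] - log_table[tail]) % cycle_len + 1

    for tail in range(m):               # every logarithm is considered as a tail
        # extend the head until the window detects all faults (if possible)
        while not_covered > 0 and head < tail + m:
            for f in fault_lists[head % m]:
                if covered[f] == 0:
                    not_covered -= 1
                covered[f] += 1
            head += 1
        if not_covered == 0:
            length = window_len(tail, head - 1)
            if best_len is None or length < best_len:
                best_len, best_tail = length, log_table[tail]
        # advance the tail: remove its patterns from the window
        for f in fault_lists[tail]:
            covered[f] -= 1
            if covered[f] == 0:
                not_covered += 1
        head = max(head, tail + 1)
    return best_tail, best_len
```

By Lemma 1 below, never moving the head backward loses no solutions, which is what keeps the sketch's work per tail small.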
Denote the ordered logarithm cycle by $lg_1, lg_2, \ldots, lg_{|T|}$, where $lg_i$ is followed by $lg_{i+1}$ and $lg_{|T|}$ is followed by $lg_1$. Denote the window whose tail is $lg_i$ by $w_i$ and denote the head of $w_i$ by $h_i$. We have the following lemma.
Lemma 1: The head of $w_{i+1}$, $h_{i+1}$, is in the subcycle $[h_i \ldots lg_{i+1})$, i.e., it is not in the subcycle $[lg_{i+1} \ldots h_i)$.
Proof: Assume $h_{i+1}$ is in the subcycle $[lg_{i+1} \ldots h_i)$. Then the window $[lg_i \ldots h_{i+1}]$, which contains $w_{i+1} = [lg_{i+1} \ldots h_{i+1}]$, detects all faults and is shorter than $w_i$, contradicting the minimality of $w_i$.
As a result of Lemma 1, Procedure window() finds the shortest window associated with each tail. Since the procedure considers windows beginning at all the logarithms, and the logarithms of all the test patterns for all the faults are considered, the procedure finds a smallest subsequence that detects all faults.
The complexity of Procedure window() is a function of the number of tail and head movements taken. There are at most |T| logarithms, hence at most |T| tail movements. Each window contains at most |T| logarithms, hence there are at most 2|T| head movements. For each tail or head movement at most |F| accounting operations are needed, hence the complexity is O(|T||F|).
The complexity of EP is given by O(PPDL + |T|(DL + log |T| + |F|)), where PPDL is the preprocessing effort required for the Pohlig-Hellman algorithm, DL is the time required for one logarithm computation, and |T| log |T| is the time required to sort the logarithms.
As mentioned in the previous section, for the cases we are targeting, PPDL and DL will not require much effort; the bulk of the work required by PPDL is the construction of a hash table. The major factor in the complexity expression is the number of test patterns, |T|, which might be overwhelming, rendering the algorithm impractical. We must, therefore, find a way to limit the size of T. This is done in two ways, based on a user-set limit on the effort allowed for embedding a fault. The first is to consider only a subset of all possible faults, those which are classified as hard. For some circuits the set of hard faults will be empty; these circuits are classified as randomly testable and no embedding is done for them. A short test sequence for these circuits is found by random selections. The second is to limit, for each hard fault, the number of test patterns that participate in EP (those whose logarithms we compute). This limit typically applies to only a small subset (if any) of the hard faults. In the next section we describe our heuristic for identifying hard faults.
7 Identifying hard faults
The amount of work needed for EP is strongly affected by the number of test patterns. We have to modify the procedure in a way that reduces the test set to a manageable size. The modification is based on partitioning the set of faults in a circuit into two sets, randomly testable faults and random resistant faults, with each set possibly being empty. By randomly testable faults we mean faults that can be detected by choosing a primitive feedback polynomial and an initial seed at random, and using a test sequence of acceptable length (this term will be defined later). By random resistant faults we mean faults that cause the test length (for 100% fault coverage) to increase dramatically beyond the acceptable length when the feedback polynomial and the initial seed are chosen at random. This partition is justified by the super-exponential decrease in the probability that a fault is missed by a random test sequence as the number of test patterns for the fault is doubled.
Our thesis is that by properly identifying the random resistant faults (referred to as hard faults in the sequel), and by embedding test patterns for each of these faults in an LFSR subsequence, the subsequence will also detect all the randomly testable faults.
As was shown in Section 2, the probability that a random sequence does not detect a fault in $C_k$ or $C_{k+1}$ is (much) greater than the probability that the sequence does not detect any other fault, hence any procedure that identifies hard faults must be able to identify the faults in both these subsets. We refer to the union of $C_k$ and $C_{k+1}$ as HF.
Before presenting our procedure for identifying hard faults, we briefly discuss the applicability of our algorithm. Every circuit can be associated with parameters (n, k). These parameters determine the work factor for EP and the predicted length of the embedded test sequence. We expect the embedded sequence to be of length between $L/2 = 2^{n-k-1}$ and $L = 2^{n-k}$. When embedding test patterns we ensure that the embedded faults will be detected by the resulting test sequence, and we rely on chance for the non-embedded faults to be detected as well. It follows that we must embed the faults in HF. Assuming little overlap between the test sets for the individual faults, the embedded test set is at least of order $|T_{emb}| = 2^k |C_k| + 2^{k+1} |C_{k+1}|$.
The size of $T_{emb}$ increases with k, whereas the predicted test length decreases. The applicability of our algorithm depends on the amount of preprocessing work (embedding) we are willing to do and on the amount of time we are willing to allow for the test session (test length).
The bound we place on the amount of preprocessing work determines what we consider a hard circuit and what we consider an easy circuit. If we allow each embedded fault at most $2^\gamma$ test patterns, then for a circuit for which $k = \gamma$ we expect to find an embedded sequence of length longer than $2^{n-\gamma-1}$. Thus, if for a given circuit we find a random test sequence of length less than $2^{n-\gamma-1}$, we consider the circuit to be randomly testable, i.e., easy. In our experiments we chose to allow each embedded fault between $2^{13}$ and $2^{14}$ test patterns. This led us to classify circuits as easy when we found a random test length that was less than $2^{n-15}$ (the acceptable length).
The bound we impose on the test length determines whether or not the resulting embedded sequence is applicable to a circuit. For example, if for a 32-input circuit we find that the hardest faults have $2^6$ test patterns, we expect to find a test sequence of length between $2^{25}$ and $2^{26}$. If the maximum allowed test length is $2^{20}$, then our scheme is not practical for this circuit. On the other hand, no other scheme that uses just one seed and no reconfiguration of the LFSR will detect 100% of the faults within the imposed time constraints. We denote the maximum allowed test length by $2^{max}$.
In the remainder of this section we present our heuristic for identifying hard faults, followed by a probabilistic analysis of its effectiveness. The experimental results validate the proposed process.
7.1 The heuristic
The process by which we identify hard faults (referred to as IHF) consists of three phases. In the first phase we determine which of the faults of interest are redundant; they are eliminated from the set. The second and third phases are based on simulation (sampling) experiments. The goal of the second phase is to determine an estimate l for k. The goal of the third phase is to identify the random resistant faults using our estimate for k. The second phase should be a fast process that comes as close as possible to identifying k. The outline of this phase is given in Figure 3. The procedure determines a cutoff value, undet, which we set to the minimum of 50 and 5% of the irredundant faults. It applies random test sequences to the circuit, with increasing lengths, until a sequence that misses at most undet faults is found. The initial sequence length is $2^j = 2^{n-15}$. It is understood that $(n - 15) \le max$; otherwise the procedure will not result in an embedded test sequence we are willing to use, hence there is no sense in carrying it out. If during this iteration a sequence is generated that detects all faults, then a satisfying embedding for all the faults has been found and the circuit is considered easy. Otherwise, j is incremented and another fault simulation is performed. This process is repeated (each time simulating all faults) until the number of undetected faults is less than undet or until j is greater than max. If j > max the procedure stops; it will not produce a satisfying test sequence. Otherwise, four more fault simulation experiments of length $2^j$ are conducted. For each of the five experiments the set of undetected faults is recorded, and at the end of the experiments the union of these sets is constructed. For each fault in the union all test cubes are generated in order to find the fault with the fewest test patterns. If the number of test patterns for this fault is t, then the estimate of k is $l = \lceil \log t \rceil$.
Remark 1: Although test generation is considered a hard problem, and, a fortiori, generation of all test patterns, this proved to be a very low cost task in our experiments, where the faults of interest were single stuck-at faults.
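Figure 3 is not reproduced in this text; the sketch below gives one way to realize the second phase under the assumptions stated (simulate_undetected and all_tests are hypothetical hooks standing for the ATALANTA-based fault simulation and test-cube generation):

```python
import math

def second_phase(n, max_exp, num_irredundant, simulate_undetected, all_tests):
    """Sketch of Procedure second_phase().
    simulate_undetected(length): fault-simulate a random LFSR sequence of the
        given length and return the set of faults it misses.
    all_tests(fault): all test patterns (cubes) for a fault.
    Returns ('easy', None), ('hard', l), or ('fail', None) if j exceeds max."""
    undet = min(50, int(0.05 * num_irredundant))    # cutoff on missed faults
    j = n - 15                                      # initial sequence length 2**j
    while True:
        missed = simulate_undetected(2**j)
        if not missed:
            return 'easy', None                     # one random sequence suffices
        if len(missed) <= undet:
            break
        j += 1
        if j > max_exp:
            return 'fail', None                     # no acceptable sequence
    union = set(missed)
    for _ in range(4):                              # four more experiments of length 2**j
        union |= simulate_undetected(2**j)
    t = min(len(all_tests(f)) for f in union)       # hardest sampled fault
    return 'hard', math.ceil(math.log2(t))          # l, the estimate for k
```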
After the execution of Procedure second_phase(), a circuit is classified as either easy or hard. It is classified as easy if one of two conditions is met: (1) a random test sequence of length less than $2^{n-15}$ was found that detects all faults; or (2) the estimate l is greater than or equal to 15. In all other instances the circuit is classified as hard.
For circuits classified as hard, we turn to the third phase. We run a set of fault simulation experiments (we ran 20) with $L/2 = 2^{n-l-1}$ test patterns each. The set of faults which are not detected by at least one experiment constitutes the set of embedded faults (EF). These are the faults for which we embed test patterns. It is essential that the faults in EF include all the faults of HF. The parameters used in selecting the number of simulation runs in phases two and three were chosen to ensure that the probability that EF includes HF is very close to 1 (this is shown in the next section). If a fault f is included in EF, we say it is classified as hard; otherwise, we say it is classified as easy. Note that a fault that is classified as hard is not necessarily in HF: EF will usually include faults outside of HF that will nonetheless be embedded. This is a price we pay for a fast classification procedure.
Remark 2: We conduct 5 fault simulation experiments in phase two in order to increase the probability that at least one fault of $C_k$ will be in the union of the undetected faults. It is important that l be as close as possible to k because this reduces the number of faults in EF (the smaller l is, the longer the simulation lengths in phase three are, decreasing the probability of missing a random fault, hence fewer faults are in EF), thus reducing the number of test patterns we need to embed. While this also reduces the probability of detection for some of the faults (as will be seen in the next section), it is a tradeoff that must be made.
In the sequel, whenever EP is mentioned, it is understood that IHF was used in a preprocessing step (stage 0 if you will) to create a reduced target fault set.
7.2 Probabilistic analysis of the heuristic
The purpose of IHF is twofold: first, to find all the faults in HF; second, by our thesis, embedding test patterns for these hard faults results in a test sequence that detects all the faults of interest. We must answer two questions regarding our heuristic.
1. What is the probability that a fault that should be classified as hard (i.e., has fewer than $2^{k+1}$ test patterns) is classified as easy?
2. What is the probability that an easy fault will not be detected by the embedded sequence?
In answering these questions we assume that the embedded sequence consists of a totally random set of patterns, although, given that the patterns are generated by an LFSR, this might not be completely accurate.
In answering the first question we compute the probability that a fault with t test patterns will not be classified as hard by IHF. We are interested in the probability value when t is less than or equal to $2^{l+1}$, where l is the IHF estimate for k.
Let $p_{miss}(2^{n-l-1}, t)$ be the probability that a test sequence of length $2^{n-l-1}$ does not detect a fault with t test patterns. This requires that the first pattern not detect the fault, the second not detect the fault, and so on. Assuming the patterns are completely random and uncorrelated, with selections done without replacement, and that the all-zero pattern does not participate in the drawing of patterns (and is not one of the test patterns), we have
$$p_{miss}(2^{n-l-1}, t) = \prod_{i=0}^{2^{n-l-1}-1} \frac{2^n - 1 - t - i}{2^n - 1 - i}.$$
Let #sim denote the number of simulation experiments. The probability, $p_{all\_det}(2^{n-l-1}, t)$, that a fault with t test patterns is detected by all the experiments is
$$p_{all\_det}(2^{n-l-1}, t) = \bigl(p_{detect}(2^{n-l-1}, t)\bigr)^{\#sim},$$
where $p_{detect} = 1 - p_{miss}$. $p_{all\_det}(2^{n-l-1}, t)$ is also the probability that a fault with t test patterns will not be classified as hard. This probability can be increased or decreased by changing the number of simulation experiments. In our experiments, $p_{all\_det}(2^{n-l-1}, 2^l)$ ranged from $7.9 \times 10^{-9}$ to $10^{-8}$, and $p_{all\_det}(2^{n-l-1}, 2^{l+1})$ ranged from $10^{-4}$ to $1.25 \times 10^{-4}$. This suggests that with very high probability EF will include all the faults in HF. EF will typically also include some other faults, those with higher detection probabilities than the faults in HF; thus, although EF might vary from one run of EP to another, all sets will share the same core of "hardest" faults.
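These probabilities are straightforward to evaluate numerically; a sketch (using log-gamma to keep the without-replacement product tractable for large sequence lengths; num_sim = 20 as in our phase-three experiments):

```python
from math import lgamma, exp

def p_miss(N, m, t):
    """Probability that m patterns drawn without replacement from the
    N = 2**n - 1 non-zero patterns contain no test for a fault with t tests:
    p_miss = C(N - t, m) / C(N, m)."""
    if t + m > N:
        return 0.0
    log_p = (lgamma(N - t + 1) - lgamma(N - t - m + 1)
             - lgamma(N + 1) + lgamma(N - m + 1))
    return exp(log_p)

def p_all_det(n, l, t, num_sim=20):
    """Probability that all num_sim experiments of length 2**(n-l-1) detect a
    fault with t test patterns, i.e. the fault is not classified as hard."""
    N = 2**n - 1
    return (1.0 - p_miss(N, 2**(n - l - 1), t)) ** num_sim
```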
We turn to the second question, namely, what is the probability, $p_{nd}(el, t)$, that an embedded sequence of length el does not detect a fault f that has t possible test patterns? If the fault is classified as hard, then the embedded sequence is guaranteed to include a test pattern for it. Hence, for a fault to be undetected it must be classified as easy and a random sequence of length el must fail to detect it. Thus,
$$p_{nd}(el, t) = p_{miss}(el, t) \cdot p_{all\_det}(2^{n-l-1}, t).$$
The probability $p_{nd}(el, t)$ is a product of two functions of t. Whereas one of these functions ($p_{all\_det}$) increases with t, the other ($p_{miss}$) decreases. As a function of t, $p_{nd}$ has a maximum value and no local maxima [12], i.e., as t increases, $p_{nd}$ rises, reaches a maximum value and decreases from there on. The overall probability that the embedded sequence detects all the faults depends on the probability of occurrence of faults whose number of test patterns is in the peak area. The values of $p_{nd}(el, t)$ for $t = 2^j$, $l \le j \le l+5$, from our experiments are tabulated in Table 9, where the peaks can clearly be seen.
8 Experimental results
Experiments were conducted using a modified version of the ATALANTA fault simulation and test generation system [10]. The modifications included fault simulating random patterns generated by LFSRs with primitive feedback polynomials, and generating all test cubes per fault (or as many as ATALANTA could find, depending on the allowed backtrack limit) instead of stopping once the first test pattern is found.
The ISCAS85 circuits did not provide a good test bed for our algorithm [12]. Circuits c2670 (n = 233), c5315 (n = 178), and c7552 (n = 207) had too many inputs to make test embedding practical: they will either be characterized as easy or, if characterized as hard, a one-seed test sequence will be too long to be a viable test option. The remaining circuits were classified as easy and short sequences were easily found. For example circuits we therefore turned to the Berkeley benchmarks [4]. We synthesized circuits with 20 to 40 inputs as multilevel circuits. Some circuits were eliminated for pathological reasons.
The characteristics of the synthesized circuits are given in columns two (#pi) to six (#ir. flts) of Table 4. Column eight (sp time) is the CPU time, in seconds, required to run the second phase on a SPARC 2 workstation. The second and third phases of IHF were combined for circuit xparc because of the long simulation time required for this circuit; thus, no time is reported for it in Table 4.
To separate the randomly testable circuits from the random resistant ones we ran each circuit through IHF. The simulation experiments of the second and third phases of IHF used a pool of 150 primitive polynomials. For a circuit with n inputs we used the first 150 primitive polynomials of degree n, when ordered lexicographically (i.e., $(x^n + x + 1) > (x^n + 1)$).
The results of Procedure second_phase() on these circuits are given in column seven (l) of Table 4. For each of the easy circuits we conducted a more thorough search for k. For each circuit we ran 10 random selection experiments (referred to as RS experiments in the sequel) to find a short pseudo-random test sequence. In each RS experiment a primitive feedback polynomial and an initial seed are chosen at random and a test sequence is generated. We allowed each experiment to generate at most $2^{n-l+1}$ test patterns, where l is the estimate for k. The results of these experiments are in Table 5. The value of $n - l$ is given in column two. The best (shortest sequence) and worst (longest sequence) results are given in columns three and four. None of the worst sequences achieved 100% fault coverage. The total CPU time needed for these experiments was 1 to 2 minutes, as reported by ATALANTA on a SPARC 1 workstation. The best random sequence that was found was always shorter than $2^{n-l+1}$ (2L). Also, $n - l + 1$ was always less than $n - 15$ (column five).
For the circuits classified as random resistant we tried to find short test sequences in two ways. The first was by RS experiments and the second was by continuing our embedding procedure (the first two phases of IHF having already been executed).
The purpose of finding a minimum sequence in two ways was to evaluate the quality of the result of EP in terms of test length and processing time.
We ran 100 RS experiments for each circuit, except for circuit chkn, for which we ran just 50 experiments, and circuit xparc, for which we ran just 3 experiments. The results of these experiments are in Table 6. Column rs time gives the total CPU time (read as hrs:min) needed for these experiments, as reported by ATALANTA on a SPARC 2 workstation. We allowed each RS experiment to run for at most $2^{n-l+1}$ test patterns. For all the circuits, the worst sequence lengths are the maximum allowed lengths per experiment, and all such sequences failed to detect all the faults. Comparing the best sequence length with the value $2^{n-l}$, we see that the best value was less than $2^{n-l}$ (L) for 8 of the circuits, while it was between L and 2L for the remaining 6 circuits.
For the random resistant circuits we completed the execution of IHF and proceeded to embed the test patterns for the hard faults. The results of IHF are shown in Table 7. For each circuit, we show the total number of irredundant faults, the number of faults that were classified as hard and the percentage of hard faults. The faults classified as hard, over all circuits, make up 7% of the total number of faults in the circuits. Thus, also considering that the hard faults are those with the fewest test patterns, several orders of magnitude of reduction in the work effort is achieved by embedding only the hard faults.
In Table 8 we show the results of EP for each of the random resistant circuits. The second column shows the total number of embedded patterns, i.e., the size of the union of the sets of test patterns for the hard faults. Column three shows the number of different primitive polynomials considered, i.e., the minimum length embedded sequence was computed for each of these polynomials. The polynomials we used were the first in the lexicographic ordering. Columns four and five give the best and worst embedding results for these polynomials, and column six gives the total CPU time (read as hrs:min) required for EP, using an HP-700 workstation. Comparing the best sequence length with the value $2^{n-l}$ for each circuit (see Table 6), we see that the length of the best embedded sequence was always less than $2^{n-l}$ except for circuit exep. For six of the circuits, the worst sequence length was also less than $2^{n-l}$. Comparing the worst embedding length with the best RS length, for 6 of the circuits the worst embedding length was shorter than the best RS length.
From the embedding length for each circuit we can calculate the probability $p_{nd}(el, t)$ that an embedded sequence of length el will not detect a fault that has t test patterns, for various values of t. This is shown in Table 9. The values in parentheses in each table entry are exponents of 10; for example, the value of $p_{nd}(el, 2^l)$ for circuit bc0 is $5.67 \times 10^{-9}$. Depending on the values, one might decide that the probability of not detecting a fault is small enough that verification by simulation is not necessary.
Having performed the RS experiments and the embedding experiments for the random resistant circuits, we proceeded to compare the results in terms of the shortest sequence length found and the processing time for the experiments. These comparisons are shown in Table 10 . Columns two and three give the best test length and the time required for the RS experiments.
Columns four and five give these values for the embedding experiments. In column six we give the ratio between the best RS sequence length and the best embedding sequence length. Column seven gives the ratio between the processing time of the RS experiments and that of the embedding experiments. These ratios were normalized so that they reflect the ratio if both experiments were run on a SPARC 2. To find the normalization factor we ran EP for circuit chkn on a SPARC 2 workstation and compared the run time with the run time for the same procedure and circuit on the HP-700; the normalization factor came out to be 2. Columns eight, nine and ten show the values of l, $n - l$, and $n - 2l$, respectively.
The first fact we notice from Table 10 is that the test length resulting from EP is always shorter than the test length resulting from the RS experiments. The second is that the ratio of the processing times required by the two procedures varies. This can be explained as follows. The factor that affects the processing time for EP is the number of embedded patterns, which is strongly affected by the value of l. EP will require less time for circuits with lower values of l (omitting the obvious influence of the number of embedded faults and the number of polynomials). The factor that affects the processing time for the RS experiments is the length of each experiment, i.e., the value of $n - l$. The higher the value of l, the lower the value of $n - l$, resulting in lower processing time for the RS experiments. Notice that for the eight circuits whose l-value is less than 10, the time ratio is greater than 1 for five and greater than 0.5 for all eight. In absolute time, EP required an hour or less for seven of the eight and an hour and a half for the eighth. Of the eight circuits for which the value $n - 2l$ was greater than 10, for six EP required less time than the RS experiments. Define the product of the test length ratio and the processing time ratio (columns six and seven) as a measure of EP effectiveness. When this measure is greater than 1, any relative advantage RS has over EP in either length or time is negated by the bigger advantage EP has in the other. Except for circuit bc0 (whose measure was 0.924), the circuits with $n - 2l > 10$ all had effectiveness measures greater than 1. Of those, only vg2 had a time ratio less than 1. Circuits in7 and in3 also had effectiveness measures greater than 1. Altogether, 9 of the 14 circuits had effectiveness measures greater than 1.
Remark 3: The time reported for EP is the time required for the logarithm computations. It does not include the time required for phase three of IHF, the time it took to generate the test patterns and sort them before computing the logarithms, or the time for Procedure window(), because these times were very small compared with the time required by the RS experiments or by the logarithm computations. The time required for phase three was always between 5% and 10% of the RS time. For circuits b3 and in3, generating and sorting the test patterns took just over 2:30 minutes on a SPARC 2, and Procedure window() took just over 4:00 minutes for b3 and less than that for in3 on an HP-700. These times are insignificant when compared with the RS and logarithm computation times.
One can ask whether we really needed all 100 RS experiments, or whether 20 or 50 would have been enough to produce a test sequence of length comparable with the one found by EP. A partial answer can be found in Table 11. The columns of Table 11 represent ratios between the sequence lengths from the RS experiments and the best embedding result. An integer entry in position (i, j) indicates the number of experiments for circuit i where the ratio was between the values of the headings of the j-th and (j+1)-th columns. For example, for circuit cps there was one RS experiment whose ratio was between 1.50 and 1.75. The last non-zero entry in each row includes all the experiments whose ratio was greater than or equal to the ratio of the column it sits in; e.g., for circuit in4, 70 experiments had a ratio greater than 2.50. These 70 experiments include all those that ran for the maximum allowed time and failed to detect all the faults. The results in this table support the notion that we did not simply get "lucky" with random selections; usually all the experiments were necessary to ensure that a short sequence length was found.
For those circuits whose time ratio was less than one, we conducted additional RS experiments to bring the ratio up to one. The results of these experiments are in Table 12. For four of the eight circuits the additional experiments did not find shorter sequences. For two circuits, the additional experiments found shorter sequences, but still longer than the embedded sequence. For the remaining two circuits the additional experiments produced two sequences that were shorter than the embedded sequences. Looking closer at these two circuits, the embeddings were done relative to only two and three polynomials, and their length ratios relative to the first 100 RS experiments were poor to begin with; thus, there was already an indication that the polynomials used for the embeddings were poor choices.
Remark 4: As mentioned in Section 7, our thesis is that the test sequence found for the hard faults will also detect the easy faults. This was true in all the embedding experiments except for one polynomial for the circuit in3, in which the embedded sequence missed nine irredundant faults, and three polynomials for circuit in7, of which two sequences missed one fault and one sequence missed eleven faults.
Remark 5: For circuits in3, exep, b3, in4, in5, in7 and xparc, some of the hard faults had more than the maximum allowed number of test patterns ($2^{14}$ for in3, $2^{12}$ for exep and $2^{13}$ for the others). For these faults we embedded only the maximum allowed number of tests. This caused the predicted length (the distance between the head and tail of the shortest window) for some of the embeddings to be greater than the actual length: because we embedded only a subset of the possible tests, additional test patterns were found in the embedded sequence, rendering some of the embedded patterns unnecessary, hence the shorter actual length. Thus, when embedding only subsets of test sets, the length of the embedded sequence is an upper bound on the actual test length; when the complete test sets are embedded, the length of the embedded sequence is the actual test length.
9 Conclusions
In this paper we address the question of finding short pseudo-random test sequences that achieve 100% fault coverage for LFSR-based PGs. We first show that if the probability of detecting the hardest fault in the circuit is p, then the probability that a pseudo-random test sequence of length 2/p will achieve 100% fault coverage is typically greater than 1/50. We then present an algorithm for embedding test patterns in test sequences generated by LFSRs. The algorithm is based on the theory of discrete logarithms. It produces a one-seed test sequence that detects all the faults of interest. The algorithm can also embed two-pattern tests, and it can embed test patterns in sequences generated by CAs. The advantage of our approach over existing schemes that achieve 100% fault coverage is the low overhead we incur in terms of both circuit area and delay.
The applicability of the embedding algorithm depends on two user-specified constraints. The first is the computational effort one is willing to expend (the parameter $\gamma$ in Section 7.1) and the second is the length of the test session one is willing to allow (the parameter max in Section 7.1). Thus, for circuits where the hardest faults have more than $2^\gamma$ test patterns the computational effort required will be too high, and for circuits with too many inputs, i.e., when $n - \gamma - 1 > max$, the resulting sequence will be too long to be useful.
We find short test sequences either by the embedding algorithm or with random selections. This is decided by classifying circuits as either randomly testable or random resistant. The classification is based on the number of test patterns for the hardest fault in the circuit. If the logarithm of this number, denoted by k, is greater than $\gamma$, or if during the classification process a random sequence of length shorter than $2^{n-\gamma-1}$ that achieves 100% fc is found, the circuit is classified as randomly testable. In all other cases it is classified as random resistant.
For randomly testable circuits our probabilistic and empirical results show that short test sequences can be found with few random selections.
For random resistant circuits we compare the results achieved by the embedding algorithm with those achieved by random selections of primitive polynomials and seeds for the LFSRs. In all cases the sequences found by our algorithm are shorter than those found by the random experiments. In almost half the cases, our algorithm is also faster than the random experiments. When the PG register is also to function as an RA, by considering only polynomials that achieve zero-aliasing [13] as candidates for EP, both objectives are satisfied with one polynomial.
The circuits which we use in our experiments are relatively small (only several hundred gates). Most of the effort of our algorithm is spent on logarithm computations, and only a small portion on simulation. As circuit sizes increase, the cost of our algorithm will be only slightly affected, whereas the cost of the random experiments will increase. Therefore, we expect the effectiveness of the algorithm to increase as circuit size increases.

Table 4: Characteristics of synthesized circuits.

Table 9: $p_{nd}(el, t)$ values for $t = 2^j$, $l \le j \le l + 5$.