Testing systems-on-chip involves applying huge amounts of test data, which is stored in the tester memory and then transferred to the chip under test during test application. Practical techniques, such as test compression and compaction, are therefore required to reduce the amount of test data and thereby reduce both the total testing time and the memory requirements of the tester. In this article, a new approach to static compaction for combinational circuits, referred to as test vector decomposition (TVD), is proposed. In addition, two new TVD-based static compaction algorithms are presented. Experimental results for benchmark circuits demonstrate the effectiveness of the two new static compaction algorithms.
1. INTRODUCTION
Advances in VLSI technology have paved the way for Systems-on-Chip (SoCs). Traditional IC design, in which every circuit is designed from scratch and reuse is limited to standard cell libraries, is increasingly being replaced by the SoC design methodology. However, this new design methodology has its own challenges. A major challenge is how to reduce the increasing volume of test data. Basically, there are two approaches: compression and compaction. In the first approach, test data is kept compressed while it is stored in the tester memory and transferred to the Chip Under Test (CUT). Then, it is decompressed
on the CUT. This reduces the memory and transfer time requirements. In the second approach, however, the objective is to reduce the size of a test set while maintaining the same fault coverage.
Test compaction techniques are classified into two categories. The first category includes algorithms that can be integrated into the test generation process. Such algorithms are referred to as dynamic compaction algorithms. On the other hand, the second category includes algorithms that are applied after the test sets are generated. Such algorithms are referred to as static compaction algorithms. There are several approaches to static compaction of a given test set as will be shown in the next section.
Since test application time is proportional to the length of the applied test set, it is desirable to apply shorter test sets that provide the same fault coverage. Although static compaction algorithms do not typically produce test sets as small as those generated using dynamic compaction, interest in developing more efficient static compaction algorithms has increased [Miyase et al. 2002]. Static compaction has the following advantages over dynamic compaction. First, generating smaller test sets using dynamic compaction is time consuming because many attempts to modify partially specified test vectors to detect additional faults fail [Miyase et al. 2002]. Second, dynamic compaction does not take advantage of random test pattern generation. Third, static compaction is independent of the ATPG used.
Given a test set T with single stuck-at fault coverage FC_T for a combinational circuit, the static compaction problem can be formulated as finding another test set T* for the same circuit such that FC_{T*} >= FC_T and |T*| < |T| [Chang and Lin 1995]. It should be pointed out that in the above definition there is no constraint on the individual fault coverage of each test vector or on the proximity between the test vectors of T and T*. That is, the fault coverage of each test vector need not remain intact, and T* need not be a subset of T. This article is structured as follows. First, we give a taxonomy of static compaction algorithms for combinational circuits, review the existing static compaction algorithms, and show how they fit in our taxonomy. We also introduce and motivate the new concept of test vector decomposition. Then, we describe two new static compaction algorithms based on test vector decomposition. After that, we present and discuss the experimental results. Finally, we conclude by summarizing the results of the article and their significance.
2. TAXONOMY OF STATIC COMPACTION ALGORITHMS
In this section, we give a taxonomy of static compaction algorithms for combinational circuits. We first start with an overview of the taxonomy. Then, we give a description of every class in the taxonomy with examples from the literature.
2.1 Overview
Static compaction algorithms for combinational circuits can be divided into three broad categories: (1) Redundant Vector Elimination, (2) Test Vector Modification, and (3) Test Vector Addition and Removal. Figure 1 shows our proposed taxonomy. In the first category, compaction is performed by dropping redundant test vectors. A redundant test vector is a vector whose faults are all detectable by other test vectors. Static compaction algorithms falling under this category can be further classified into two classes. The first class contains algorithms based on set covering in which faults are to be covered using the minimum possible number of test vectors. On the other hand, the second class contains algorithms based on test vector reordering in which reordering, fault simulation, fault distribution, and double detection are used to identify redundant test vectors and then drop them.
In the second category, compaction is performed by modifying test vectors. Algorithms belonging to this category can be further classified into three classes. The first class contains algorithms based on merging of compatible test cubes. A test cube is a partially specified test vector. A test vector is made partially specified by unspecifying the unnecessary primary inputs; this process is referred to as relaxation. Relaxation can be performed using an ATPG or a stand-alone algorithm, such as those of El-Maleh and Al-Suwaiyan [2002] and Kajihara and Miyase [2001]. In addition to relaxation, raising can be used to enhance the compatibility among relaxed test vectors. If two relaxed test vectors conflict at one or more bit positions, they can be made compatible by raising one of them at the conflicting bit positions.
The second class contains algorithms that employ essential fault pruning to make some test vectors redundant. A test vector becomes redundant if it detects no essential faults. A fault is essential if it is detected only by a single test vector. Essential faults of a test vector can be pruned, that is, made detected by some other test vectors, by reassigning values to those bits that are originally unspecified and have been randomly assigned values to detect additional faults.
The third class contains algorithms that are based on test vector decomposition. Test vector decomposition is the process of decomposing a test vector into its atomic components. An atomic component is a child test vector that is generated by relaxing its parent test vector for a single fault f. In this article, we propose test vector decomposition as a new class of static compaction algorithms that modify test vectors to perform compaction.
Finally, the third category of static compaction algorithms consists of compaction algorithms that add new test vectors to a given test set in order to remove some of the already existing test vectors. The number of the newly added test vectors must be less than the number of test vectors to be removed. An ATPG is used to generate the new test vectors.
2.2 Set Covering
Test compaction for combinational circuits can be modeled as a set covering problem. The set cover is set up as follows: Each column of the detection matrix corresponds to a test vector and each row corresponds to a fault. If a test vector j detects fault i, then the entry (i, j) is one; otherwise, it is zero. In this setup, the total amount of memory required for building the detection matrix is O(nf), where n is the number of test vectors and f is the number of faults.
Static compaction procedures based on set covering were described in Flores et al. [1999], Boateng et al. [2001], and Hochbaum [1996]. It should be pointed out that this approach has not been used much in the literature due to its large memory and CPU time requirements.
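To make the formulation concrete, the following sketch builds the detection sets of a small hypothetical instance (the vector and fault names are illustrative only) and applies a simple greedy covering heuristic; the cited procedures use considerably more sophisticated covering engines.

```python
# Greedy set covering over a detection matrix: repeatedly pick the test
# vector that covers the most still-uncovered faults.
def greedy_cover(detects):
    """detects: dict mapping test vector id -> set of fault ids it detects."""
    uncovered = set().union(*detects.values())
    chosen = []
    while uncovered:
        # Select the vector covering the largest number of uncovered faults.
        best = max(detects, key=lambda t: len(detects[t] & uncovered))
        if not detects[best] & uncovered:
            break  # remaining faults are not detectable by this test set
        chosen.append(best)
        uncovered -= detects[best]
    return chosen

# Hypothetical 4-vector, 5-fault instance (rows of the detection matrix).
detects = {
    "t1": {"f1", "f2"},
    "t2": {"f2", "f3", "f4"},
    "t3": {"f1", "f5"},
    "t4": {"f4"},
}
print(greedy_cover(detects))  # ['t2', 't3'] covers all five faults
```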
2.3 Test Vector Reordering
Identification of redundant test vectors in a test set is an order-dependent process. Given any order, redundant test vectors can be identified using fault simulation, fault distribution, or double detection. There are four variations of Test-Vector-Reordering (TVR)-based static compaction algorithms.
2.3.1 TVR with Fault Dropping Simulation. Fault simulation of a test set in an order different from the order of generation is used as a fast and effective method to drop redundant test vectors. Under Reverse Order Fault simulation (ROF) [Schulz et al. 1988; Pomeranz and Reddy 2001], a test set is fault simulated with dropping in reverse order of generation. That is, a test vector that was generated later is fault simulated earlier. A test vector that does not detect any new faults when it is simulated is removed from the test set.
The intuition behind the effectiveness of ROF is that test vectors generated later in the ATPG process target faults that are harder to detect. Therefore, if we first fault simulate a test vector from the end of the list, it not only detects a hard fault right away but also detects many easier faults by chance. In this way, hard faults are taken care of early.
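The following sketch illustrates ROF under the assumption that each vector's detected-fault set is already available (in practice this information comes from fault simulation with dropping); the function name and data layout are illustrative only.

```python
# Reverse Order Fault simulation (ROF) sketch. Detection sets stand in for
# an actual fault-dropping simulator: a vector is kept only if it detects
# at least one fault not yet detected by later-generated vectors.
def rof(test_set, detects):
    """test_set: vectors in order of generation.
    detects: dict vector -> set of faults it detects."""
    kept, detected = [], set()
    for t in reversed(test_set):      # later-generated vectors first
        new = detects[t] - detected
        if new:                       # vector detects a new fault: keep it
            kept.append(t)
            detected |= new
        # otherwise the vector is redundant and is dropped
    kept.reverse()                    # restore generation order
    return kept
```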
2.3.2 TVR with Forward-Looking Fault Simulation. Forward-looking fault simulation is an improved version of ROF. It is based on the idea that information about the first test vector that detects every fault can be used to drop test vectors that would not be dropped by ROF: if every yet-undetected fault detected by the current vector has a lower-indexed test vector that detects it, the current vector can be skipped over and dropped from the test set. Let us consider the following example. Let the test set T be {t_1, t_2, t_3} and the fault set F be {f_1, f_2, f_3, f_4}. Figure 2 shows the test vectors with their associated faults, and Figure 3 shows the first test vector that detects every fault. Conventional ROF first simulates t_3. This test vector is retained in the test set to detect f_2 and f_3. Next, t_2 is simulated. Since it detects the new fault f_4, it is retained in the test set. Finally, t_1 is simulated and retained in the test set since it detects a new fault, f_1. No test vectors are dropped by ROF in this case. Now, let us run ROF again taking into account the information given in Figure 3. ROF starts by simulating t_3, which is retained in the test set to detect f_2 and f_3. Next, t_2 is simulated. t_2 detects the new fault f_4. However, f_4 is first detected by t_1. Therefore, we conclude that t_2 is not necessary for the detection of any yet-undetected fault and drop it from the test set. Finally, when t_1 is simulated, the remaining undetected faults f_1 and f_4 become detected and the detection process completes. In this case, one test vector is dropped from the test set.
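A sketch of the forward-looking rule follows. The detection sets below are one assignment consistent with the example above (the text does not list them exhaustively), and `first` records, for each fault, the index of its first detecting vector in generation order.

```python
# Forward-looking fault simulation sketch: a vector is kept only if it is
# itself the FIRST detector of some yet-undetected fault; otherwise a
# lower-indexed vector still to be simulated is guaranteed to cover every
# new fault it detects.
def forward_looking(test_set, detects, first):
    """first: dict fault -> index (in test_set) of its first detecting vector."""
    kept, detected = [], set()
    for i in range(len(test_set) - 1, -1, -1):   # reverse order
        t = test_set[i]
        new = detects[t] - detected
        if any(first[f] == i for f in new):      # t first-detects some f
            kept.append(t)
            detected |= new
    kept.reverse()
    return kept

# The example from the text: ROF keeps all three vectors, while the
# forward-looking rule drops t2 (its new fault f4 is first detected by t1).
detects = {"t1": {"f1", "f4"}, "t2": {"f4"}, "t3": {"f2", "f3"}}
first = {"f1": 0, "f4": 0, "f2": 2, "f3": 2}
print(forward_looking(["t1", "t2", "t3"], detects, first))  # ['t1', 't3']
```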
2.3.3 TVR with Fault Distribution.
In TVR with fault distribution, test vectors are fault simulated without fault dropping. The faults detected by every test vector are recorded, along with the number of test vectors that detect every fault. After that, given any order, a test vector whose number of essential faults is zero, that is, whose faults can all be distributed among other test vectors, is considered redundant and can be dropped. After a test vector is dropped, the number of test vectors that detect each of its faults is reduced by one.
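A minimal sketch of this bookkeeping, again assuming the detection sets come from non-dropping fault simulation, is shown below.

```python
# Fault-distribution sketch: n_det[f] holds the number of test vectors
# detecting fault f. A vector with no essential fault (no f with
# n_det[f] == 1) is redundant and dropped, after which the counts of its
# faults are decremented.
def fault_distribution_compact(test_set, detects):
    n_det = {}
    for t in test_set:
        for f in detects[t]:
            n_det[f] = n_det.get(f, 0) + 1
    kept = list(test_set)
    for t in test_set:                       # any order may be used
        if all(n_det[f] > 1 for f in detects[t]):
            kept.remove(t)                   # t has no essential faults
            for f in detects[t]:
                n_det[f] -= 1                # redistribute its faults
    return kept
```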
In Hamzaoglu and Patel [1998], compaction based on fault distribution was used as part of a dynamic compaction algorithm. The motivation is that ROF can only identify a redundant test vector if all the faults it detects are also detected by test vectors generated later; it cannot identify a redundant test vector if some of the faults it detects are detected only by test vectors generated earlier.
2.3.4 TVR with Double Detection Fault Simulation. Double Detection (DD) was first proposed in Kajihara et al. [1995] as a dynamic compaction algorithm. Basically, when generating a new test vector, a yet-undetected fault, called a primary target fault, is selected and a test vector t is generated to detect it. Next, other faults, called secondary target faults, are selected one at a time, and t is further specified to detect them. Static compaction based on DD was used in Lin et al. [2001]. However, since most test generators do not attempt to target faults for a second detection and do not use non-fault-dropping simulation, they do not collect all the information necessary for static compaction based on DD. Therefore, the necessary information must be collected in a preprocessing step.
2.4 Merging
Static compaction algorithms in this class can be divided into two groups. In the first group, two compatible test cubes t_i and t_j are merged into a single test vector t_i . t_j (see Table I). The new test vector t_i . t_j has all the binary values of both t_i and t_j. Hence, by repetitive application of this compaction operation, many test vectors (two or more) can be combined into fewer test vectors. As a result, the total number of test vectors that need to be applied is reduced while the same fault detection capabilities are maintained. Examples of this approach can be found in El-Maleh and Al-Suwaiyan [2002], Ayari and Kaminska [1994], and Miyase et al. [2002].
In the second group, algorithms employ a raising operation in addition to the relaxation operation. For a test vector t, the raising operation raise(t, i) tries to set the i-th bit of t to x while preserving the coverage of the essential faults of t. The raising operation was proposed in Chang and Lin [1995]. Raising is used to enhance compatibility among relaxed test vectors. For example, if two relaxed test vectors t_i and t_j conflict at one or more bit positions, they can be made compatible by raising one of them at the conflicting bit positions. Typically, raising is used to resolve conflicts when a test set contains no compatible test vectors.
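The following helper functions sketch the compatibility test and the merge operation on test cubes over the alphabet {0, 1, x}; they are illustrative and not the cited implementations.

```python
# Merging of compatible test cubes. Two cubes are compatible if they do
# not conflict in any fully specified bit; their merge keeps every binary
# value of both cubes.
def compatible(a, b):
    return all(p == q or p == "x" or q == "x" for p, q in zip(a, b))

def merge(a, b):
    assert compatible(a, b)
    return "".join(q if p == "x" else p for p, q in zip(a, b))

t1, t2 = "01x1xx", "0xx10x"
if compatible(t1, t2):
    print(merge(t1, t2))   # 01x10x
```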
2.5 Test Vector Decomposition
Test Vector Decomposition (TVD) is the process of decomposing a test vector into its atomic components. An atomic component is a child test vector that is generated by relaxing its parent test vector for a single fault f. That is, the child test vector contains the assignments necessary for the detection of f. Besides, the child test vector may detect other faults in addition to f. For example, consider the test vector t_p = 010110 that detects the set of faults {f_1, f_2, f_3}.
Using the relaxation algorithm in El-Maleh and Al-Suwaiyan [2002], t_p can be decomposed into three atomic components, which are (f_1, 01xxxx), (f_2, 0x01xx), and (f_3, x1xx10). Every atomic component detects the fault associated with it and may accidentally detect other faults. An atomic component cannot be decomposed any further because it contains only the assignments necessary for detecting its fault.
Static compaction based on merging is a simple and efficient technique. However, it has the following problems. First, for a highly incompatible test set, merging achieves little reduction. Second, raising is a costly operation. Third, a test vector must be processed as a whole. Therefore, we propose that a test vector be decomposed into its atomic components before it is processed. In this way, a test vector that is originally incompatible with all other test vectors in a given test set can be eliminated if its components can be merged with other test vectors.
By decomposing a test vector into its atomic components, a merging-based compaction algorithm gains more degrees of freedom. This is because the number of unspecified bits in an atomic component is much larger than that in its parent test vector. Thus, the probability of merging a component is higher than that of merging its parent test vector.
The problem of static compaction based on TVD can be modeled as a graph coloring problem. Basically, given a test set T with single stuck-at fault coverage FC_T, the set of atomic components C_T is first obtained. Then, a graph G is built. In G, every node corresponds to a component, and an edge exists between two nodes if their corresponding components are incompatible. Our objective is to partition C_T into k subsets such that k is as small as possible and no adjacent nodes belong to the same subset. The fault coverage of the new test set T*, whose size is k, should be greater than or equal to FC_T.
It is well known that graph coloring is an NP-hard problem [Garey and Johnson 1979]. Thus, research efforts have been devoted to heuristic, rather than exact, methods. Heuristic methods are simple schemes in which nodes are colored sequentially according to some criteria.
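As an illustrative sketch (using simple first-fit sequential coloring rather than the color-degree heuristic discussed later), the code below partitions components by coloring the incompatibility graph implicitly: a component joins the first color class whose partial test vector it is compatible with, and each color class becomes one merged test vector.

```python
# Static compaction via graph coloring: nodes are atomic components, edges
# join incompatible components, and each color class is one merged vector.
def compatible(a, b):
    return all(p == q or p == "x" or q == "x" for p, q in zip(a, b))

def color_components(components):
    classes = []                       # one partial test vector per color
    for c in components:
        for i, tv in enumerate(classes):
            if compatible(c, tv):      # no edge to this color class
                classes[i] = "".join(q if p == "x" else p
                                     for p, q in zip(tv, c))
                break
        else:
            classes.append(c)          # open a new color (new test vector)
    return classes

# The three atomic components of t_p = 010110 from the example above merge
# back into a single vector, while an extra conflicting component (the
# fourth cube is hypothetical) opens a second one.
print(color_components(["01xxxx", "0x01xx", "x1xx10", "1x0xxx"]))
# ['010110', '1x0xxx']
```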
2.6 Essential Fault Pruning
Generally speaking, pruning a fault of a test vector decreases the number of its faults by one. A test vector becomes redundant if all of its faults are pruned. Fault Pruning (FP) is implemented as follows. Given a test vector t, an attempt is made to detect each of its faults by modifying the other test vectors in the test set. A fault of t is said to be pruned if it becomes detected by another test vector after the modification. If all the faults of t are pruned, then t can be removed from the test set.
The above operation of modifying a test vector, say t', to detect an additional fault f of another test vector t is basically achieved by generating a new test vector t'' such that DET(t'') = DET(t') U {f}, where DET(t) is the set of faults detected by t. Multiple Target Faults Test Generation (MTFTG) is used for this purpose. In MTFTG, a test vector is to be found for a set of target faults. MTFTG fails if there exist at least two independent faults in the set of target faults. Two faults are independent if they cannot be detected by a single test vector.
The run time of an FP-based static compaction procedure can be greatly improved by considering only essential faults. A fault is defined to be an essential fault of a test vector t if it is detected only by t. The set of essential faults of t is denoted by ESS(t). It should be pointed out that whenever a test vector t is eliminated, for every fault belonging to the set DET(t) - ESS(t), the number of test vectors detecting it is reduced by one.
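The bookkeeping behind these definitions is straightforward; the sketch below derives ESS(t) from the DET sets produced by fault simulation without dropping (the data layout is illustrative).

```python
# Deriving DET and ESS sets from non-dropping fault simulation results.
def essential_sets(detects):
    """detects: DET(t) for every vector t (fault sets from simulation).
    Returns ESS(t): the faults detected only by t."""
    n_det = {}
    for faults in detects.values():
        for f in faults:
            n_det[f] = n_det.get(f, 0) + 1
    return {t: {f for f in faults if n_det[f] == 1}
            for t, faults in detects.items()}

# Example: f1 is essential to t1; f2 is detected twice, hence not essential.
print(essential_sets({"t1": {"f1", "f2"}, "t2": {"f2"}}))
```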
Few FP-based static compaction algorithms have been reported in the literature. Generally, they fall into two categories. In the first category, a test vector is modified so that it detects the new additional faults. The test vector already detects its essential faults; therefore, the test generation time for the essential faults is eliminated. Examples of such static compaction algorithms can be found in Hamzaoglu and Patel [1998], Chang and Lin [1995], Reddy et al. [1992], and Hamzaoglu and Patel [2000]. On the other hand, in the second category, a set of N test vectors is replaced by a set of M < N new test vectors. The basic idea is to determine the faults that are detected only by one or more test vectors among the N test vectors to be replaced and find M < N test vectors that detect all these faults. Examples of such static compaction algorithms can be found in Kajihara et al. [1994, 1995].
3. TEST VECTOR DECOMPOSITION-BASED STATIC COMPACTION ALGORITHMS
3.1 Independent Fault Clustering

3.1.1 Preliminaries. Independent faults were defined in Akers and Krishnamurthy [1989]. Basically, given a combinational circuit, let T_i be the set of all possible test vectors that detect f_i and T_j be the set of all possible test vectors that detect f_j. Then, two faults f_i and f_j are independent if and only if T_i and T_j are disjoint. Independence among faults can also be defined with respect to a test set T. Let T_i be the set of test vectors in T that detect f_i and T_j be the set of test vectors in T that detect f_j. Then, two faults f_i and f_j are independent with respect to T if and only if T_i and T_j are disjoint. In this article, we use the term independent faults to mean independent faults with respect to a test set.
A fault set is called an Independent Fault Set (IFS) if all the faults in the set are pairwise independent. The problem of computing a maximum-size IFS is NP-hard [Krishnamurthy and Akers 1984]. Therefore, only maximal IFSs can be computed in practice. Heuristic methods for computing IFSs were described in Akers and Joseph [1987], Akers and Krishnamurthy [1989], Tromp [1991], and Pomeranz and Reddy [1992].
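A minimal greedy sketch of maximal IFS construction with respect to a test set follows; the cited heuristics are more elaborate, and the fault ordering here is whatever order the caller supplies.

```python
# Greedy construction of maximal independent fault sets (IFSs) with
# respect to a test set. T[f] is the set of vectors detecting fault f;
# two faults are independent when these sets are disjoint.
def build_ifss(faults, T):
    ifss = []
    for f in faults:
        for s in ifss:
            # f may join s only if it is independent of every fault in s.
            if all(T[f].isdisjoint(T[g]) for g in s):
                s.append(f)
                break
        else:
            ifss.append([f])          # start a new independent fault set
    return ifss
```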
IFSs were used in Akers and Joseph [1987], Tromp [1991], Pomeranz et al. [1993], Kajihara et al. [1994, 1995], Chang and Lin [1995], Wang and Stabler [1995], and Hamzaoglu and Patel [1998, 2000]. The importance of independent faults is threefold. First, they provide a lower bound on the size of the minimum test set, thus making it possible to estimate the success of test pattern generators in generating small test sets. Second, independent faults provide a method for ordering target faults for test generation. Ordering has been shown to be important for obtaining small test sets and reducing test generation time [Pomeranz et al. 1993]. Third, the use of independent faults improves the efficiency of static compaction algorithms based on essential fault pruning.
3.1.2 Algorithm Description. In Independent Fault Clustering (IFC) algorithms, IFSs are first derived. Then, a fault matching procedure is used to find sets of compatible faults, that is, faults that can be detected by a single test vector. In the IFS derivation phase, independent faults are identified with respect to a test set. On the other hand, in the fault matching phase, compatible components, corresponding to compatible faults, are mapped to the same compatibility set. Whenever a component is mapped to a compatibility set, it is merged with the partial test vector of that compatibility set. At the end, every compatibility set represents a single test vector.
Our IFC algorithm is shown in Figure 4 and proceeds as follows. First, the given test set T is fault simulated without fault dropping. This step is performed to find the number and set of test vectors that detect every fault. Second, essential faults are matched. In this step, for every essential fault f detected by a test vector t, the atomic component c_f corresponding to f is extracted from t. Then, for every compatibility set CS_i, if c_f is compatible with the partial test vector in CS_i, c_f is mapped to CS_i. On the other hand, if the number of compatibility sets is zero or c_f is incompatible with all partial test vectors in the existing compatibility sets, a new compatibility set is created and c_f is mapped to it.
It should be observed that an essential fault has a single component while nonessential faults have more than one. Therefore, if a component of a nonessential fault f is incompatible with all the partial test vectors in the existing compatibility sets, the other components of f will be tried before creating a new compatibility set. On the other hand, if the component of an essential fault is incompatible with all the partial test vectors in the existing compatibility sets, a new compatibility set must be created. Hence, essential faults should be matched first. Another advantage of first matching essential faults is that the number of faults that will be considered when deriving IFSs is reduced.
After essential faults are matched, IFSs are derived. Faults in an IFS are pairwise independent. Therefore, a fault f_i can be added to an IFS S if and only if for every fault f_j in S, the intersection of the sets of test vectors that detect f_i and f_j is empty. Next, IFSs are sorted in decreasing order of their sizes, and for every fault in an IFS, the set of test vectors that detect the fault is sorted in decreasing order of the number of faults they detect. This is because a component extracted from a test vector that detects a large number of faults has high compatibility, since it is compatible with all the components of the faults detected by that test vector.
Next, for every fault f in an IFS, its atomic component is extracted and then mapped to an appropriate compatibility set. For every component of a fault f , if it is incompatible with all partial test vectors in the existing compatibility sets, a new component will be tried. A new compatibility set is created if the number of compatibility sets is zero or all components of a fault f are incompatible with all partial test vectors in the existing compatibility sets. At the end, the algorithm returns the number of compatibility sets as the size of the new test set.
3.1.3 Illustrative Example. Table II shows an example of six test vectors and the faults they detect, along with the components required for detecting the faults. The superscript e attached to some faults indicates that the faults are essential. As can be seen from the table, the six vectors cannot be merged together, as there is at least one conflicting bit between each vector pair. Thus, test vector merging cannot compact these test vectors. Table III illustrates applying the IFC algorithm to the test vectors in Table II. The first three columns show the clusters created after mapping the components of essential faults. After essential faults are mapped, IFSs are created and then their faults are mapped. There are two IFSs, namely IFS_1 = {f_1, f_9} and IFS_2 = {f_2}. Columns four and five show the clusters after mapping the components of faults in the IFSs. Finally, the last column shows the compacted test vectors after merging the components in each cluster. Since the number of clusters obtained is four, the compacted test set is of size four. Hence, two test vectors were eliminated.
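The sketch below captures the matching phase just illustrated. It assumes a relaxation callback `component(f, t)` (standing in for the cited relaxation algorithm) and takes the essential faults, the sorted IFSs, and the sorted detector lists as inputs; the bookkeeping for already-detected faults is omitted.

```python
# Sketch of the IFC matching phase. Each compatibility set is represented
# by its partial test vector.
def compatible(a, b):
    return all(p == q or p == "x" or q == "x" for p, q in zip(a, b))

def merge(a, b):
    return "".join(q if p == "x" else p for p, q in zip(a, b))

def ifc_match(essentials, ifss, detectors, component):
    """essentials: (fault, detecting vector) pairs, matched first.
    ifss: independent fault sets, sorted by decreasing size.
    detectors: fault -> detecting vectors, sorted by fault count.
    component(f, t): extracts from t the atomic component detecting f."""
    sets = []  # one partial test vector per compatibility set

    def place(c):
        # Map component c to the first compatible compatibility set.
        for i, tv in enumerate(sets):
            if compatible(c, tv):
                sets[i] = merge(tv, c)
                return True
        return False

    for f, t in essentials:           # essential faults have one component
        c = component(f, t)
        if not place(c):
            sets.append(c)            # open a new compatibility set
    for s in ifss:                    # then the faults of each IFS
        for f in s:
            # On failure, try a new component from another detecting vector.
            if not any(place(component(f, t)) for t in detectors[f]):
                sets.append(component(f, detectors[f][0]))
    return sets                       # every set is one compacted vector
```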
3.1.4 Iterative IFC. The level of compaction achievable by our IFC algorithm can be improved in two ways. First, after a component is generated for a fault, the component can be fault simulated and the faults detected by it marked as detected. In this way, a large portion of the faults will not be considered subsequently, since they are already detected. Based on our experimental investigations, we noticed that this extra step increases the runtime while improving the results very little. Second, the IFC algorithm can be applied to a test set iteratively. Basically, the newly generated test set is treated as the test set to be compacted. Therefore, IFC is carried out iteratively until the length of the test set cannot be reduced any more. This process is called iterative IFC and is shown in Figure 5. Unspecified bits in the test set T are assigned random values before every call to the IFC algorithm. It should be pointed out that any static compaction algorithm can be used after our IFC algorithm. In fact, given a test set T, the IFC algorithm generates a new test set T* whose characteristics are different from those of T. Thus, a static compaction algorithm that cannot compact T may manage to compact T*.

3.2 Class-Based Clustering

The CBC algorithm is shown in Figure 6 and proceeds as follows. First, the given test set is fault simulated without fault dropping. This step is performed to find the number and set of test vectors that detect every fault. Second, test vectors are sorted in increasing order of their number of faults. Then, atomic components of test vectors are generated. Component generation is performed such that components are extracted only from essential test vectors. An essential test vector is a test vector that detects at least one essential fault. The component generation algorithm is shown in Figure 7 and proceeds as follows. For every fault f detected by a test vector t, if the number of test vectors that detect f is one, that is, f is an essential fault, the component of f is extracted from t; otherwise, the number of test vectors that detect f is reduced by one. Therefore, a test vector that detects no essential faults is eliminated. The sorting step preceding component generation increases the number of eliminated test vectors. Note that a component of a fault is extracted from a test vector that detects a large number of faults.
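A sketch of this component generation step, under the same assumed `component(f, t)` relaxation callback, is given below; vectors that yield no components are the eliminated nonessential test vectors.

```python
# Sketch of CBC component generation: vectors are visited in increasing
# order of their fault counts, a component is extracted only when the
# current vector is the last remaining detector of the fault, and vectors
# that yield no components are eliminated.
def generate_components(test_set, detects, component):
    n_det = {}
    for t in test_set:
        for f in detects[t]:
            n_det[f] = n_det.get(f, 0) + 1
    order = sorted(test_set, key=lambda t: len(detects[t]))
    comps = {}
    for t in order:
        comps[t] = []
        for f in detects[t]:
            if n_det[f] == 1:               # f is (by now) essential to t
                comps[t].append(component(f, t))
            else:
                n_det[f] -= 1               # defer f to a later vector
        if not comps[t]:
            del comps[t]                    # t detects no essential fault
    return comps
```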
After obtaining the set of components of every test vector, test vectors are sorted in decreasing order of their number of components. This helps maximize the number of redundant components. Redundant components are dropped using fault simulation with dropping. After that, every test vector is reconstructed by merging its components together. Then, test vectors are classified and processed. For reference, the remaining steps of the CBC algorithm in Figure 6 are: (4) sort test vectors in decreasing order of their number of components; (5) remove redundant components using fault dropping simulation; (6) for every test vector, merge its components together; (7) classify test vectors; (8) process class zero test vectors (see Figure 8); (9) for every test vector, merge its components together; (10) reclassify test vectors; (11) process class one test vectors (see Figure 9); (12) for every test vector, merge its components together; (13) reclassify test vectors; (14) process class i test vectors, where i > 1 (see Figure 11).

Class zero test vectors are processed as shown in Figure 8. First, test vectors are sorted in increasing order of their number of components. This way, a test vector with a small number of components has a higher chance of being eliminated. After that, for every test vector, its blockage value is computed. The blockage value of a test vector t, denoted by TVB(t), is defined as the sum of the blockage values of the individual components making up t:

TVB(t) = sum over i = 1 to NumComp of CB(c_i),

where CB(c_i) is the blockage value of component c_i belonging to the set of components of t and NumComp is the number of components making up t. CB(c_i) is defined such that components whose blockage value is zero can be moved without blocking any class zero test vector. Therefore, for any class zero test vector whose blockage value is zero, its components are moved to appropriate test vectors and then it is eliminated. A component c_i is moved to a test vector t_j in S_comp(c_i) such that CB(c_i, t_j) = 0. If there is more than one such test vector, the test vector with the smallest number of components is selected. This choice is based on the assumption that a test vector with a small number of components has a smaller probability of conflicts with other components. The blockage values of the other class zero test vectors must be updated after merging the components of a class zero test vector. Note that the blockage value of a class zero test vector t needs to be updated if t has at least one component c_i whose S_comp has been modified or if t receives new components. Besides, the blockage value needs to be updated if t has at least one component c_i in conflict with another component c_j such that S_comp(c_j) has been modified and |S_comp(c_j)| = 1.
Next, the remaining class zero test vectors, having nonzero blockage values, are sorted in increasing order of their number of components. A remaining test vector t can be eliminated if for every component c_i in t, S_comp(c_i) is nonempty. A component is heuristically moved to a test vector with the smallest number of components. The S_comp of every component must be updated after every test vector elimination.
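The elimination loop for zero-blockage class zero test vectors can be sketched as follows. The blockage function CB(c, t) is passed in as an assumption (taken to be nonnegative, and zero when moving c into t blocks no class zero test vector), since its exact definition depends on the conflict analysis described above.

```python
# Elimination of zero-blockage class zero test vectors (Figure 8 sketch).
# comps: vector -> list of its component ids; s_comp: component -> list of
# candidate vectors that can absorb it; CB(c, t): assumed blockage value.
def eliminate_zero_blockage(class_zero, comps, s_comp, CB):
    class_zero.sort(key=lambda t: len(comps[t]))   # fewest components first
    for t in list(class_zero):
        if any(not s_comp[c] for c in comps[t]):
            continue                  # some component has nowhere to go
        # TVB(t): sum of per-component blockage values, taking for each
        # component its best (minimum) value over the candidate targets.
        tvb = sum(min(CB(c, u) for u in s_comp[c]) for c in comps[t])
        if tvb != 0:
            continue                  # moving t's components blocks others
        for c in comps[t]:
            targets = [u for u in s_comp[c] if CB(c, u) == 0]
            best = min(targets, key=lambda u: len(comps[u]))
            comps[best].append(c)     # move c to the least-loaded target
        del comps[t]                  # t is now redundant
        class_zero.remove(t)
    return comps
```

Because CB is evaluated as a live callback, the blockage values seen by later iterations reflect the moves already made, mirroring the update step described in the text.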
It is worth mentioning that the technique we use for computing the blockage value of a class zero test vector is not exact; the two class zero test vectors shown in Table IV illustrate this (see Figure 10). Before moving a remaining component, the test vectors in its S_comp are sorted in decreasing order of their degree of hardness. This is to avoid increasing the number of components of test vectors having lower degrees of hardness, since they have better chances of being eliminated. Returning to the example, after eliminating v_3, we next eliminate v_2 by moving the components of f_1 and f_5 to v_5 and the component of f_4 to v_4. Table VIII shows the test vectors and their components after eliminating v_2. At this stage, none of the test vectors can be eliminated. So, the resulting compacted test set is obtained by merging the components in each test vector. The final test set is of size three: {111xxxx000, 1001xx11x1, 00x01x0x10}.
3.3 Worst-Case Analysis
We analyze here the worst-case storage and runtime requirements of our algorithms. In the analysis, we assume that the test set, fault list, and circuit structure are given as inputs; therefore, their memory and time requirements are not considered. Throughout the analysis, the number of test vectors in a test set is denoted by N_T, the size of a test vector by N_PI, and the numbers of faults and gates in a circuit by N_F and N_G, respectively. Storing the sets of test vectors that detect every fault requires O(N_F N_T) memory, and storing the atomic components requires O(N_F N_PI). Hence, the CBC algorithm has space complexity O(N_F (N_T + N_PI)). The time complexity of the IFC algorithm is analyzed as follows. In Step 1, the cost of fault simulation without fault dropping is O(N_F N_T N_G). In Step 2, the cost of finding essential faults is O(N_F). Besides, the costs of extracting a single component and mapping it are O(N_G) and O(N_T N_PI), respectively. Therefore, the overall complexity of Step 2 is O(N_F (N_T N_PI + N_G)). In
Step 3, the cost of computing IFSs is O(N_F^2 N_T^2). In Steps 4 and 5, the costs of sorting the IFSs and the test vectors detecting every fault are O(N_F log_2 N_F) and O(N_F N_T log_2 N_T), respectively. In Step 6, the complexity of component extraction is O(N_F N_T N_G). This is because a component may be extracted O(N_T) times for a fault if it is incompatible with the existing compatibility sets. However, our experimental results show that the average number of times a component is extracted for a fault is one. The complexity of mapping the components of the remaining faults to the existing compatibility sets is O(N_F N_T^2 N_PI). Therefore, the overall complexity of Step 6 is O(N_F N_T (N_G + N_T N_PI)).
Based on our experimental analysis of the different phases of IFC (see Table XI), we noticed that most of the runtime of IFC is spent in computing the IFSs and matching the remaining faults. Hence, Steps 3 and 6 are the dominating sources of time consumption.
The time complexity of the CBC algorithm is computed as follows. In Step 1, the cost of fault simulation without fault dropping is O(N_F N_T N_G). The cost of sorting test vectors in Step 2 is O(N_T log_2 N_T). In Step 3, the cost of component generation is O(N_F N_G). The cost of sorting test vectors in Step 4 is O(N_T log_2 N_T). In Step 5, redundant components are dropped using fault simulation with dropping. Based on our experimental analysis of the class zero algorithm (see Table XV), we noticed that most of the runtime of the algorithm is spent in computing class zero test vector blockage values and updating S_comp and the blockage values of components. Hence, Steps 2, 3.3, and 3.4 are the dominating sources of time consumption.
After processing class zero test vectors, the components of the remaining test vectors are merged and the test vectors are reclassified. The costs of merging components and reclassifying test vectors are O(N_F) and O(N_F N_T N_PI), respectively. After that, class one test vectors are processed as shown in Figure 9. In Step 1, the complexity of finding class one test vectors is O(N_T).
4. EXPERIMENTAL RESULTS
In order to demonstrate the effectiveness of the IFC and CBC algorithms, we have performed experiments on a number of the ISCAS85 circuits and full-scanned versions of the ISCAS89 benchmark circuits. The experiments were run on a SUN Ultra60 (UltraSparc II, 450 MHz) with 512 MB of RAM. We have used test sets generated by HITEC [Niermann and Patel 1991]. In addition, we have used the fault simulator HOPE [Lee and Ha 1996] for fault simulation and the test relaxation algorithm in El-Maleh and Al-Suwaiyan [2002] for component generation. Table IX summarizes the features of the benchmark circuits used in our experiments. The first column gives the circuit name. Columns two through eight give the number of primary inputs, number of primary outputs, number of gates, number of Test Vectors (TVs), number of Collapsed Faults (CFs), number of Detected Faults (DFs), and Fault Coverage (FC), respectively.
In Table X, we report the results of applying the Random Merging (RM), Graph Coloring (GC), and IFC algorithms to the test sets after they are compacted by ROF. The first column gives the circuit name. The second and third columns give test set sizes after applying ROF and RM, respectively. Columns four through six give the results of the GC algorithm: the number of components obtained after dropping redundant ones is given under the column headed #Comp, test set sizes are given under the column headed #TVs, and the total time required by the GC algorithm is given under the column headed Total. Columns seven and eight give the results of the IFC algorithm: test set sizes under the column headed #TVs and the total time under the column headed Total. The GC algorithm is the Brelaz color-degree algorithm explained in McHugh [1990]. It proceeds as follows. First, an incompatibility graph is built. In this graph, nodes correspond to components and an edge exists between two nodes if their corresponding components are incompatible. Second, as long as the number of uncolored nodes is not zero, a node n* with the maximum number of adjacent nodes is selected and colored with the current color c_k. Then, every node n_i that is compatible with n* and can be colored with c_k is colored with c_k. After that, the incompatibility graph is updated. As can be seen from Table X, for most of the circuits, the GC algorithm computes test sets whose sizes are smaller than those obtained by RM. This observation reveals the potential of the TVD technique. Test sets computed by the GC algorithm are as much as 11.9% smaller than those computed by RM, for example, 1% smaller for c2670, 9.5% smaller for s38584f, and 11.9% smaller for s4863f.
It can be seen that the results obtained by the IFC algorithm are better than those obtained by the RM and GC algorithms. The percentage improvement over the RM algorithm varies between 3% for s208.1f and 37.5% for s38584f. On the other hand, the percentage improvement over the GC algorithm varies between 1.4% for s3384f and 31% for s38584f. The runtime of the IFC algorithm is better than that of the GC algorithm.
In Table XI , we provide a detailed analysis of the IFC algorithm. The first column gives the circuit name. The second and third columns give the number of essential faults in the test set and the number of compatibility sets created after matching essential faults, respectively. The fourth and fifth columns give the number of independent fault sets and the maximum size of an independent fault set, respectively. The sixth column gives the average number of test vectors that detect a fault. The seventh and eighth columns give the average and maximum number of components generated per fault during the process of fault matching. Columns nine to twelve indicate the time taken by the different phases of the IFC algorithm. Column nine gives the time taken by fault simulation without dropping. Column ten gives the time taken for matching essential faults. Column eleven gives the time taken for building independent fault sets. Finally, column twelve gives the time taken for matching remaining faults.
The following observations can be made from the information in Table XI. First, an average of five essential faults are mapped to a compatibility set. Second, the average number of components generated per fault is one. This indicates that, on average, a component is successfully mapped to a compatibility set on the first trial. Third, the most time consuming phases in the IFC algorithm are building the independent fault sets and matching the nonessential faults.
Our implementation of building the independent fault sets has a complexity of O(N_F^2 N_T^2). However, a more efficient implementation can be achieved by finding pairwise independent faults and then solving a clique partitioning problem. Finding pairwise independent faults can be implemented efficiently using appropriate data structures. This will be investigated in future work.
The step of matching nonessential faults is time consuming mainly due to the generation of components. This step can be speeded up by reducing the number of components that need to be generated. This can be achieved by fault simulating the test vectors resulting from matching essential faults and dropping the detected non-essential faults. This will also be investigated in future work.
For large circuits with a large number of faults, fault simulation without dropping can also be time consuming. Its speed can be improved by employing the X-algorithm [Akers et al. 1990]. The X-algorithm, based on logic simulation and value justification, can significantly reduce the number of faults that need to be injected. Furthermore, double detection fault simulation can be used to speed up fault simulation without dropping. The impact of double detection on the quality of the compacted test sets will be investigated in future work.
Critical Path Tracing (CPT) [Abramovici et al. 1984, 1990] can also be used to speed up fault simulation without dropping. CPT deals with faults implicitly. Therefore, fault simulation, fault collapsing, fault partitioning, fault insertion, and fault dropping are not needed. Furthermore, although CPT is an approximate method, it was experimentally shown in Abramovici et al. [1984] that the impact of the approximation is negligible. CPT can be implemented to be as fast as concurrent fault simulation [Abramovici et al. 1990].
In Table XII , we give the results of applying the iterative IFC algorithm on test sets first compacted by ROF. The first column gives the circuit name. The second column gives the test set sizes after running IFC for one iteration. The third column gives the test set sizes after applying IFC iteratively until no improvement is noticed. The fourth column gives the number of iterations that were run. Finally, the fifth column gives the time taken by the iterative IFC algorithm.
It can be seen that Iter IFC improves over both RM and IFC. The percentage improvement over RM varies from 3% to 46.6%, for example, 3% for s208.1f, 35.8% for s38417f, and 46.6% for s38584f. On the other hand, the percentage improvement over IFC varies from 1.6% to 17.2%, for example, 1.6% for s3271f, 14.5% for s38584f, and 17.2% for s38417f.
In Table XIII, we give the results of applying the CBC algorithm on test sets first compacted by ROF. The following observations can be made from the results in Table XIII and the information in Table XIV. For circuits c3540 and s208.1f, the size of class zero is zero. This indicates that every test vector has at least one CC. However, for the circuit c3540, although it has no class zero test vectors, some improvement is noticed after processing class zero test vectors. This is because some test vectors are eliminated in the component generation phase, since they do not detect essential faults. Another interesting observation is that not all class zero test vectors can be eliminated. This is because, while processing class zero test vectors, the S_comp of some components becomes empty, which makes their parent test vectors become nonclass zero test vectors. In addition, S_comp may contain only class zero test vectors.
It is also observed that although the size of class one is large, the number of class one potential test vectors is very small. In fact, the number of class one potential test vectors is zero for most of the circuits. In general, if the size of class i, where i > 0, is greater than zero and the number of class i potential test vectors is zero, this indicates that every class i test vector has at least one CC whose S_cand is empty. It should also be observed that not all potential test vectors can be eliminated. This is because potential test vectors can be damaged. A potential test vector is said to be damaged if the S_cand of one or more of its CCs becomes empty. In addition, a potential test vector is damaged if one or more of its components become CCs and/or if it receives one or more CCs from other potential test vectors.
As can be seen from the results in Table XIII , the CBC algorithm reduces the test sets by as much as 34%, for example, 2.5% for c3540, 23.5% for s38417f, and 34% for s38584f. It should be observed that the improvements achieved after processing class one are very small. This is due to the reasons explained above.
In Table XV , we give a detailed analysis of some phases of the CBC algorithm. The first column gives the circuit name. Column two gives the time taken by fault simulation without dropping. Columns three and four give the time taken for generating components and dropping redundant ones, respectively. Columns five and six give the time taken for reconstructing and reclassifying test vectors, respectively. Column seven gives the time taken for computing the initial blockage values for all class zero test vectors (see Step 2 in Figure 8 ). It should be pointed out that the computation of test vector blockage is part of the phase of processing class zero test vectors. Finally, columns eight and nine give the time taken for processing class zero and class one test vectors, respectively.
As can be seen from the table, most of the runtime of the CBC algorithm is spent in the component generation, component elimination, and blockage value computation phases. Component generation can be speeded up by first generating the components for essential faults and fault simulating them to drop all detected nonessential faults. Hence, the number of components that need to be generated for the remaining faults will be reduced. Furthermore, the time requirement of the component elimination phase is reduced, since fewer components are generated. Other techniques for speeding up the component elimination phase will be investigated in future work.
Another interesting observation that can be seen from the table is that our current implementation of the blockage value computation phase is time consuming. More efficient techniques for computing test vector blockage and other heuristics will be investigated in future work. Table XVI shows the results of applying the CBC algorithm on test sets compacted by ROF+IFC and ROF+Iter IFC.

5. CONCLUSION

In this article, we have proposed test vector decomposition as a new approach to static compaction for combinational circuits and presented two TVD-based static compaction algorithms, IFC and CBC. In IFC, independent fault sets are first derived and then their faults are matched together. Two independent faults can be mapped to the same compatibility set if their components are compatible. On the other hand, in CBC, classes of test vectors are formed and then test vectors are processed in increasing order of their degree of hardness. At the end, every test vector represents a cluster whose components originally belong to test vectors in different classes. Experimental results are reported to demonstrate the effectiveness of the two algorithms. In general, the IFC algorithm has achieved an improvement of as much as 37.5% over random merging. Besides, the iterative version of IFC has achieved an improvement of as much as 46.6% over random merging and 17.2% over IFC. Furthermore, the CBC algorithm has achieved a test set reduction of as much as 34%.
In the future, we will investigate the impact of the critical path tracing and double detection algorithms on the quality of compacted test sets. Besides, we will consider reducing the complexity of non-essential fault matching in the IFC algorithm by fault simulating test vectors resulting from essential fault matching and then dropping the detected non-essential faults. Furthermore, we will consider improving the current implementation of building the independent fault sets. Finally, we will consider improving the time consuming phases in CBC.
