Abstract-Various designs of scan paths based on tree-like structures have recently been suggested for reducing test application time or test data volume in today's high density VLSI circuits. However, these techniques strongly rely on the existence of a large number of compatible sets of flip-flops under the given test set, and therefore, are unsuitable for highly compact test sets generated by efficient ATPG tools. In this paper, to circumvent this problem, a new two-pass hybrid method is proposed to design an efficient scan tree architecture. Given a compact test set, compatibility relationships among the flip-flops are first explored, and a graph-based heuristic algorithm is employed to construct a scan tree with minimal incompatibility. Next, the same combinational ATPG tool is rerun to generate a new test set satisfying the logical constraints on the secondary inputs imposed by the structure of the scan tree. To cover the remaining hard-to-detect faults, if any, a few test vectors are chosen from the original test set for application in the serial mode. Experimental results on various benchmark circuits demonstrate that the proposed algorithm outperforms the earlier methods in reducing the total test application time significantly without any degradation of fault coverage.
I. Introduction
The controllability and observability of a digital circuit can be increased by various well-known design-for-testability (DfT) techniques. Among them, the most popular is the serial full-scan methodology, which transforms a sequential circuit to its combinational parts in test mode. Although the full-scan method reduces the cost of test generation and provides high fault coverage, the inherent serial nature of the scan path significantly increases the test application time and energy consumption in test mode. The number of clock cycles needed to scan in/out the test data is equal to the product of the number of test patterns and the length of the scan chain. Hence, the test application time (and also the test data volume) can be reduced either by reducing the number of test patterns or by reducing the scan chain length, as is done in tree-based scan-path architectures. A smaller number of test patterns may, however, reduce the fault coverage. On the other hand, a compact test set with high fault coverage has a relatively small number of don't cares in the test patterns, which increases the incompatibility relationships among the flip-flops. This, in turn, makes the scan-tree design inefficient, i.e., results in a larger depth of the tree. Thus, the above two key factors, being conflicting in nature, make the scan tree design problem hard to solve.
There are many existing methods for reducing the test application time. One approach is to configure the scan elements into multiple scan chains [1], [2]. However, this increases the number of scan-in and scan-out pins needed during test. Hybrid test generation techniques are introduced in [3], [4], but these techniques are computationally expensive and are not suitable for large sequential circuits. A scan-based BIST is introduced in [5], where a scan chain is partitioned into multiple segments by inserting XOR gates between two adjacent scan segments. Test responses of each scan segment can be observed through an XOR tree. However, this technique does not address the problem of testing embedded cores used in SoC design [6]. The Illinois Scan Architecture (ILS) was recently proposed to circumvent this problem [7], [8], [9]. The ILS is shown to be useful both for standalone chips and for embedded cores. It does not require any additional test pin other than the ones used in full scan.
In the ILS scan, several branches of the scan paths emanate from the scan-in pin of the circuit, which may cause overloading. An alternative approach, called the scan tree, has been proposed [10], [11], [12], [13] to reduce the test application time or test data volume. In such a scan architecture, the structure resembles a tree, where one scan cell drives other scan cells. Like ILS, a scan tree does not require any additional test pin other than the ones used in full scan. The same test data bit must arrive at the scan cells that are equidistant from a common ancestor cell. Thus, in a scan tree architecture, all the bits in each of the test vectors that are to be loaded into the scan cells (flip-flops) lying at the same depth of the tree must be compatible (i.e., non-conflicting) among themselves. If the above condition is satisfied, the corresponding scan cells are also called compatible. The effectiveness of the scan tree depends on the correlation among the test data of the different scan cells. The test application time in the case of a scan tree depends on the number of test patterns and the depth (length of the longest path in the tree) of the scan tree. The latter strongly depends on the compatibility of the flip-flops under the given test set. An increase in the compatibility among the flip-flops is likely to result in a decrease in the scan tree depth. Most of the previous works on generating the scan tree use non-compact test patterns (containing many don't cares) in order to obtain a large number of compatible relationships. In [10], the don't care values of the test patterns are changed appropriately to generate the scan tree. In these cases, the number of test patterns considered is usually large (compared to those generated by an optimized ATPG tool achieving the same fault coverage). Hence, these methods fail to yield a scan tree of substantially reduced depth for a highly compact test set, and cannot exploit the advantage of using such a test set of small volume.
In this paper, a new graph-based two-pass algorithm is proposed to design the scan tree architecture. The present algorithm is particularly applicable for a compact test set, where the compatibility among the flip-flops is usually low. First, the compatibility relationships among the flip-flops are explored for a given test set, and a scan tree architecture is designed with minimal incompatibility. Next, the same ATPG tool is run again to generate a new test set satisfying the logical constraints on the secondary inputs imposed by the scan tree designed above. These vectors will be applied in the tree mode. To cover the remaining hard-to-detect faults, if any, a few test vectors are chosen from the original test set for application in the serial mode. Experiments on various benchmarks reveal very encouraging results.
The rest of the paper is organized as follows. In Section II, prior works on scan tree designs are reviewed. In Section III, the proposed algorithm is explained, and in Section IV, a technique to cover hard-to-detect faults is described. Section V presents some experimental results on benchmark circuits. Section VI concludes the paper.
II. Scan Tree Architecture
Several tree-based scan architectures have been proposed earlier [11], [12] to reduce the test application time. Fig. 1 presents an example of a single scan chain consisting of 6 scan cells. The test vectors are also shown in Fig. 1. It is evident that in the three test vectors the scan cells FF1 and FF6 are compatible. Similarly, FF3, FF4, and FF5 are mutually compatible. Based on this information, all the flip-flops are grouped to form the scan tree. Thus, the groups of compatible scan cells are: {FF2}, {FF1, FF6}, {FF3, FF4, FF5}. As FF2 is incompatible with all other scan cells, it cannot be grouped with other scan cells. The scan cells in the same group will receive the same test data bits.
To determine the compatible scan cells from the test set, an incompatibility graph [11] is constructed from the test set. In the incompatibility graph, a vertex corresponds to a scan cell, and an undirected edge exists between two vertices if and only if the two scan cells are incompatible, i.e., if the corresponding column vectors (see Fig. 1) have conflicting bits (0 and 1) for at least one test pattern.
In Fig. 1 there are 6 vertices in the incompatibility graph. The grouping of scan cells to construct the tree architecture is determined following the chromatic partitioning of the incompatibility graph.
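The construction can be sketched in a few lines of Python. This is only an illustration: the test set below is hypothetical (the vectors of Fig. 1 are not reproduced here), chosen so that it yields the same three groups, and the greedy coloring is a simple heuristic that is not guaranteed to reach the chromatic number.

```python
from itertools import combinations

# Hypothetical test set (NOT the one in Fig. 1): 3 vectors over 6 scan cells,
# 'X' marks a don't-care bit. Character j of a vector is the bit for FF(j+1).
TEST_SET = ["100X01", "0111XX", "X01110"]


def column(test_set, j):
    """Return the bits that scan cell j receives across all test vectors."""
    return [vec[j] for vec in test_set]


def incompatible(col_a, col_b):
    """Two cells conflict if some pattern needs 0 in one and 1 in the other."""
    return any({a, b} == {"0", "1"} for a, b in zip(col_a, col_b))


def incompatibility_graph(test_set):
    """Adjacency sets: an edge joins every pair of incompatible scan cells."""
    n = len(test_set[0])
    cols = [column(test_set, j) for j in range(n)]
    adj = {j: set() for j in range(n)}
    for a, b in combinations(range(n), 2):
        if incompatible(cols[a], cols[b]):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def greedy_coloring(adj):
    """Give each vertex the smallest color unused by its colored neighbors."""
    color = {}
    for v in sorted(adj):
        used = {color[u] for u in adj[v] if u in color}
        color[v] = next(c for c in range(len(adj)) if c not in used)
    return color


def groups_from_coloring(color):
    """Collect the scan cells of each color class, named FF1, FF2, ..."""
    classes = {}
    for v, c in color.items():
        classes.setdefault(c, []).append(f"FF{v + 1}")
    return sorted(classes.values())


groups = groups_from_coloring(greedy_coloring(incompatibility_graph(TEST_SET)))
print(groups)
```

Cells that share a color have no edge between them, so no test pattern asks for opposite values in any two of them and they can be driven by the same test data bit.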
As mentioned earlier, a scan-tree architecture is meaningful only when a large number of flip-flops are compatible. However, for a compact test set, the probability of having compatible flip-flops is reduced, and the situation becomes worse when the number of flip-flops is large. One solution to the problem is to choose an arbitrary scan tree structure a priori, and then use a constrained ATPG to generate the test patterns for the circuit. The disadvantage of this method is that the number of test patterns will be large and the fault coverage will be low. Another solution (known as the hybrid method) is to select a subset of the test set so that a scan tree of reasonable depth is constructed, and to apply the remaining vectors in serial mode [11]. However, this does not fare well for a highly compact test set. The present work solves this problem by adopting a two-pass hybrid method.
III. Proposed Algorithm for Scan Tree Design
In this section, theoretical formulation of the method is first presented followed by the proposed algorithm.
A. Scan tree organization
We start with a given compact test set of the original circuit, and analyze the compatibility relationships among the flip-flops.
For example, consider the test set shown in Fig. 2, which consists of four test vectors. The scan chain has five flip-flops; hence the scan test vectors are five bits long. According to the above description, flip-flop 1, denoted as FF1, corresponds to bit 1 in each scan test vector, flip-flop 2, denoted as FF2, corresponds to bit 2 in each scan test vector, and so on (Fig. 2a). In the present example, the incompatibility graph is a complete graph, and hence the scan tree will reduce to a serial chain.
In the proposed work, the incompatibility distance (denoted as d) of two flip-flops, which is defined as the number of conflicting bits in the two corresponding column vectors, is calculated. For the example in Fig. 2, we have the following distance values:
A complete undirected graph G(V, E), called the weighted incompatibility graph (WIG), is then constructed with the flip-flops as vertices and these distance values as weights on the corresponding edges. A zero weight denotes a compatible pair. It may be noted that in Fig. 2, there is no pair of compatible flip-flops.
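As a minimal sketch of the distance computation and the WIG, the following Python fragment may help. The test set is hypothetical (it is not the one of Fig. 2), although it shares its shape: four vectors, five flip-flops, and no compatible pair. The function names are introduced here for illustration only.

```python
from itertools import combinations

# Hypothetical fully specified test set (NOT the one in Fig. 2):
# four vectors, five flip-flops; character j of a vector is the bit of FF(j+1).
TEST_SET = ["00100", "00101", "00011", "01011"]


def incompatibility_distance(test_set, a, b):
    """Count the patterns in which cells a and b need opposite values (0 vs 1)."""
    return sum({vec[a], vec[b]} == {"0", "1"} for vec in test_set)


def build_wig(test_set):
    """Return the WIG as a dict mapping each pair (a, b), a < b, to its weight."""
    n = len(test_set[0])
    return {
        (a, b): incompatibility_distance(test_set, a, b)
        for a, b in combinations(range(n), 2)
    }


wig = build_wig(TEST_SET)
for (a, b), d in sorted(wig.items()):
    print(f"d(FF{a + 1}, FF{b + 1}) = {d}")
```

Every weight is at least 1 for this test set, so the ordinary incompatibility graph would be complete, which is the situation the WIG is meant to handle.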
A w-distance WIG is a maximal subgraph of the WIG G(V, E) such that all the edges present in the subgraph have weight w. We start with the subgraph with the smallest w value, and continue to process subgraphs with higher w values during the construction of the scan tree. The following steps illustrate the process.
1. Draw the maximal subgraph $G'(V', E')$ of $G(V, E)$ by extracting all the edges $E'$ of weight $w$ from $G$, and let $V'$ represent the corresponding vertices connected by $E'$.
2. Obtain the complement of the graph $G'$. Given an undirected graph $G' = (V', E')$, the complement of $G'$ is defined as $\bar{G}' = (V', \bar{E}')$, where $\bar{E}' = \{(u, v) : u, v \in V',\ u \neq v,\ \text{and } (u, v) \notin E'\}$.
3. In $\bar{G}'$, determine the minimum number of colors needed to color the graph. Based on this chromatic partition, group the vertices of $G'$, i.e., the corresponding flip-flops, into nearly-compatible groups.
Thus, the flip-flops grouped in this fashion will have incompatibility distance $w$ among themselves. Based on these groups a scan tree is constructed. For example, the graph $G(V, E)$ shown in Fig. 2 can be used to find the groups of flip-flops when the incompatibility distance is 1. For this purpose, a subgraph $G'(V', E')$ is extracted (Fig. 3a). The complement of the graph $G'$ is constructed as shown in Fig. 3b. In $\bar{G}'(V', \bar{E}')$ (the complement of $G'$), $|V'| = 4$ and $|\bar{E}'| = 3$. The vertices (flip-flops) are grouped by coloring this graph. The graph shown in Fig. 3b has no edge between FF1 and FF2, so they are grouped. Similarly, {FF4, FF5} are grouped. The scan tree thus obtained is shown in Fig. 4; it is formed by allowing the bit difference between the flip-flops in each group to be 1. The process is repeated for the higher values of $w$ until the WIG is exhausted. In this way, a scan tree is built where the flip-flops lying at the same depth of the tree have the same incompatibility distance values among themselves.
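The three steps can be sketched as follows for a single value of w. The WIG below is hypothetical (the weights of Fig. 2 are not reproduced here); it was chosen so that the weight-1 subgraph has four vertices and its complement has three edges, as in the example. The coloring is a greedy heuristic, so it stands in for, but does not guarantee, a minimum coloring.

```python
from itertools import combinations

# Hypothetical WIG (NOT the one in Fig. 2): weights on the complete graph over
# five flip-flops, keyed by 0-based index pairs (a, b) with a < b.
WIG = {
    (0, 1): 1, (0, 2): 2, (0, 3): 2, (0, 4): 3, (1, 2): 3,
    (1, 3): 1, (1, 4): 2, (2, 3): 4, (2, 4): 3, (3, 4): 1,
}


def w_distance_subgraph(wig, w):
    """Step 1: keep the edges of weight w and the vertices they touch."""
    edges = {e for e, d in wig.items() if d == w}
    vertices = sorted({v for e in edges for v in e})
    return vertices, edges


def complement(vertices, edges):
    """Step 2: pairs of distinct vertices of V' that are not joined in E'."""
    return {e for e in combinations(vertices, 2) if e not in edges}


def color_groups(vertices, edges):
    """Step 3 (greedy heuristic): color the graph, return the color classes."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    color = {}
    for v in vertices:
        used = {color[u] for u in adj[v] if u in color}
        color[v] = next(c for c in range(len(vertices)) if c not in used)
    classes = {}
    for v, c in color.items():
        classes.setdefault(c, []).append(f"FF{v + 1}")
    return sorted(classes.values())


v_sub, e_sub = w_distance_subgraph(WIG, 1)
e_comp = complement(v_sub, e_sub)
groups = color_groups(v_sub, e_comp)
print(len(v_sub), len(e_comp), groups)
```

A color class of the complement contains no complement edge, so every pair inside it is joined by a weight-w edge in the original subgraph. This is why coloring the complement, rather than the subgraph itself, produces groups whose members are at distance w from one another.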
B. Algorithm to generate the scan tree
In this section, an algorithm is developed to design the scan tree architecture when the compatibility among the flip-flops is low. The algorithm is divided into two procedures: Procedure 1 is used to generate the scan structure and Procedure 2 is used to determine the fault coverage. Procedure 1 first generates the WIG G(V, E) from the given test set. Next, it finds the groups of flip-flops for constructing the scan tree. An outline of Procedure 1 is shown in Fig. 5.
Once we obtain a scan tree structure, we rerun the same ATPG tool to generate a new set of test vectors with the logical constraints imposed on the secondary inputs of the circuit by the grouping of scan cells. The rationale behind this is as follows: since the same ATPG tool is run on the same circuit-under-test (CUT) with input constraints determined by minimal incompatibility, a fault coverage close to the earlier one can also be achieved in the second run. However, in the presence of the constraints, some faults that are detectable in the original circuit may become untestable or hard-to-test in the scan-tree mode. Thus, to achieve the same fault coverage as the original test vectors, some of them, appropriately chosen from the original set, can be applied to the CUT in serial scan mode as in [11], using a simple scheme that allows the tree architecture to be reconfigured into the serial mode if needed. Procedure 2, described below, is used to determine the undetectable faults in the CUT with the scan tree generated by Procedure 1. It uses test generation and fault simulation to find the undetectable faults. Procedure 2 first finds the complete fault list for the circuit by fault collapsing. Let the complete fault list be denoted by $f_c$. The constraints of the scan tree are then imposed on the secondary inputs to the CUT. Next, the ATPG is run to generate a set of patterns ($T_p$) for the circuit. Incremental fault simulation is performed while generating the test patterns in $T_p$ with the target fault list $f_c$. Let $f_d$ represent the set of currently detectable faults. For each pattern, the faults that it detects are found and stored in $f_d$. The process continues until all the test vectors in $T_p$ are simulated, and the set $f_u$ of undetectable faults and/or time-aborted hard-to-detect (HTD) faults is determined. The outline of Procedure 2 is presented in Fig. 6.
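The bookkeeping of Procedure 2 can be illustrated with a small Python sketch. It rests on several assumptions: a lookup table (`TABLE`) stands in for the ATPG tool and the fault simulator, the fault names, patterns and groups are made up, and candidate patterns are filtered against the tree constraints instead of being generated under them, as a real constrained ATPG run would do.

```python
def satisfies_tree_constraints(vector, groups):
    """True if every group of tied scan cells needs a single, non-conflicting bit."""
    for group in groups:
        bits = {vector[i] for i in group} - {"X"}
        if len(bits) > 1:
            return False
    return True


def undetected_faults(fault_list, patterns, detects):
    """Incremental fault dropping: return (f_d, f_u) after simulating patterns.

    `detects(pattern)` stands in for a real fault simulator and returns the
    set of faults that the pattern detects.
    """
    f_d = set()
    for p in patterns:
        f_d |= detects(p) & set(fault_list)
    f_u = set(fault_list) - f_d
    return f_d, f_u


# Hypothetical data: five collapsed faults and a table replacing the simulator.
F_C = ["f1", "f2", "f3", "f4", "f5"]
GROUPS = [[0, 1], [3, 4]]  # 0-based indices of the scan cells tied together
TABLE = {"00100": {"f1", "f2"}, "11011": {"f3"}, "01000": {"f4"}}

# Keep only the patterns that can be applied in the scan tree mode.
t_p = [p for p in TABLE if satisfies_tree_constraints(p, GROUPS)]
f_d, f_u = undetected_faults(F_C, t_p, lambda p: TABLE[p])
print(t_p, sorted(f_d), sorted(f_u))
```

In this toy run the pattern 01000 asks for different values in two cells of the same group, so it cannot be applied in tree mode and the fault it would detect ends up in $f_u$ together with the fault that no pattern detects.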
IV. Hybridization Using the Serial Scan Mode
The faults that were detectable by the original test set but have become untestable or HTD ($f_u$) under the scan tree mode can now be detected again by dynamically reconfiguring the tree into the serial mode and applying a few requisite test patterns. Thus, the first mode is the normal scan tree mode (ST mode) and the second is the serial scan mode (SS mode). The idea is to apply the major part of the test vectors in ST mode and the remaining part in SS mode. Fig. 7 illustrates the technique. The switching from ST mode to SS mode is done by a controller, which consists of some logic and a counter that counts the number of test patterns to be applied in the ST mode. Fig. 8 shows the design of the controller when the number of patterns in ST mode is 7.
The set of undetectable or HTD faults ($f_u$) obtained from Procedure 2 can be made detectable by using the serial scan mode. Procedure 3 is used to generate such test patterns for the circuit. The inputs to Procedure 3 are the set of undetectable faults ($f_u$) and the original test patterns ($T_n$) from which the scan tree structure was built. The minimal number of test patterns for the serial scan mode can be obtained by constructing a bipartite graph whose left set of vertices represents the vectors in the test set $T_n$ and whose right set of vertices represents the faults in the list $f_u$. As shown in Fig. 9a, $T_n = \{t_1, t_2, t_3, t_4, t_5, t_6\}$ denotes the test pattern set and $f_u = \{f_1, f_2, f_3, f_4\}$ denotes the set of target faults. The testability relation among the tests and detected faults can be described by a bipartite graph $G(T_n, f_u)$ [14], [15], where $T_n$ and $f_u$ represent the two disjoint sets of nodes. An edge $(t, f)$ is present if and only if a test $t$ detects a fault $f$. A fault $f$ is said to be detected by a test set $T_n$ if $G$ contains at least one edge between the set $T_n$ and $f$.
Outline of Procedure 1 (Fig. 5), steps 4 to 9; the earlier steps and the beginning of step 4 are not recovered:
4. ... ($G_i(V_i, E_i)$) of $G$ which contains the edges $E_i$ and all the vertices $V_i$ connected by $E_i$.
5. Draw the complement $\bar{G}_i$ of the subgraph $G_i$.
6. Color $\bar{G}_i$ to find the groups of flip-flops. Consider only those groups which have more than one flip-flop.
7. From $G$, delete all the vertices that are already grouped, the edges incident on them, and the edges having weight $i$.
8. Set $i = i + 1$.
9. If more than one edge of the same weight is still present in $G$, go to step 3.
In order to draw the edges, a complete fault simulation is done with each pattern of $T_n$ and the remaining fault set $f_u$, whose faults are undetectable or HTD in the scan tree mode. An illustrative example is shown in Fig. 9a.
After the graph $G(T_n, f_u)$ is drawn, the problem is to find a subset $T_s$ of $T_n$ such that
1. $T_s$ covers all nodes in $f_u$;
2. the number of nodes in $T_s$ is minimum.
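Written compactly, with $E$ denoting the edge set of $G(T_n, f_u)$ (a symbol introduced here for convenience), the two conditions amount to a minimum set cover problem:

```latex
\min_{T_s \subseteq T_n} |T_s|
\quad \text{subject to} \quad
\forall f \in f_u \;\; \exists\, t \in T_s : (t, f) \in E
```

Minimum set cover is NP-hard in general, which is why a heuristic is a natural choice for this step.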
In the present work, a greedy approach is considered to solve the above optimization problem. The proposed algorithm depends on the degree of each node in the sets $f_u$ and $T_n$. Based on the degree of the nodes on the fault side, the faults can be categorized into two classes: hard-to-detect (HTD) faults, which are the nodes of smaller degree, and easy-to-detect (ETD) faults, i.e., nodes having a larger degree. In Fig. 9a, the fault $f_4$ is an HTD fault compared to the others. In order to find the minimum number of test patterns, the HTD faults are considered first, and correspondingly the test vectors that detect them and have the highest fault coverage are selected. The detected faults (nodes) are removed from the graph and this process is repeated. The algorithm is described below.
... Step-1;
Step-4: Consider the node of highest degree in $f_u$. Mark the patterns connected to the node. Include the test pattern of highest degree in $T_s$;
Step-5: Delete the test pattern included in $T_s$ in Step-4, the faults detected by that test pattern, and the edges between them;
Step-6: if ($f_u$ == NULL) { $T_s$ is the minimum test pattern set; } else go to Step-3;
The above algorithm can be explained with the help of Fig. 9a. In Fig. 9a, fault $f_4$ is considered first as it is an HTD fault. The corresponding test pattern $t_4$ can detect $f_2$ and $f_4$. The pattern $t_4$ is included in the set $T_s$. The nodes $f_4$, $f_2$, $t_4$ and the edges $(t_4, f_4)$, $(t_4, f_2)$, and $(t_6, f_2)$ are deleted from $G$. The graph thus obtained is shown in Fig. 9b. In Fig. 9b, there is no HTD fault. Between the nodes $f_1$ (of degree 2) and $f_3$ (of degree 3), node $f_3$ is considered as its degree is greater than that of $f_1$. Now, $f_3$ can be detected by three test patterns, $t_3$, $t_5$, and $t_6$. The degrees of $t_3$, $t_5$, and $t_6$ are 2, 1, and 1, respectively. Among them, $t_3$ is selected because it is the node with the highest degree. The test $t_3$ is added to $T_s$. The nodes $t_3$, $f_1$, $f_3$ and the edges $(t_3, f_1)$, $(t_3, f_3)$ are deleted from $G$. After that, no nodes are left in the fault set $f_u$. Thus, the set $T_s = \{t_3, t_4\}$ gives the minimal number of test patterns that can detect all the undetectable faults $f_u$.
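The walk-through can be reproduced with a short Python sketch of the greedy selection. Two assumptions are made. First, a fault is treated as HTD when its degree is at most 1 (the parameter `htd_degree`); the text only says that HTD faults have a smaller degree. Second, the detection table is reconstructed from the description of Fig. 9a, and the edge (t1, f1) is a guess, since the text fixes only the degree of f1.

```python
# Hypothetical detection table consistent with the walk-through of Fig. 9a.
# The edge (t1, f1) is an assumption: the text only fixes the degree of f1.
DETECTS = {
    "t1": {"f1"},
    "t2": set(),
    "t3": {"f1", "f3"},
    "t4": {"f2", "f4"},
    "t5": {"f3"},
    "t6": {"f2", "f3"},
}


def select_serial_patterns(detects, faults, htd_degree=1):
    """Greedy cover: serve HTD faults first, otherwise the highest-degree fault."""
    remaining = set(faults)
    tests = {t: set(fs) & remaining for t, fs in detects.items()}
    selected = []
    while remaining:
        # Degree of a fault = number of unselected tests that still detect it.
        degree = {f: sum(f in fs for fs in tests.values()) for f in remaining}
        if any(d == 0 for d in degree.values()):
            raise ValueError("some fault is not detected by any test")
        htd = sorted(f for f in remaining if degree[f] <= htd_degree)
        target = htd[0] if htd else max(sorted(remaining), key=degree.get)
        # Among the tests detecting the target, take the one covering most faults.
        candidates = sorted(t for t, fs in tests.items() if target in fs)
        best = max(candidates, key=lambda t: len(tests[t]))
        selected.append(best)
        # Drop the chosen test and every fault it detects, then update degrees.
        remaining -= tests.pop(best)
        tests = {t: fs & remaining for t, fs in tests.items()}
    return selected


t_s = select_serial_patterns(DETECTS, {"f1", "f2", "f3", "f4"})
print(t_s)
```

Being greedy, the procedure returns a small cover but not necessarily a minimum one for every graph; on this example it selects t4 first and t3 second, in agreement with the walk-through.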
The serial patterns thus obtained ($T_s$) cover all the faults that are untestable in the scan tree structure. Next, a fault simulation is done with the test patterns $T_s$ and the complete fault list $f_c$. This is done in order to drop additional faults that are covered by these serial patterns. The detected faults are stored in a set $f_{ds}$ (i.e., faults detected in serial mode). The fault set $f_{ds}$ is then removed from $f_c$. The remaining set, denoted by $f_{dst}$, is the target fault list that has to be detected in the scan tree mode. Test generation is done with the fault list $f_{dst}$ to generate the test patterns ($T_{st}$) in the scan tree mode. The combined set of test patterns ($T$) is finally obtained as the union of $T_s$ and $T_{st}$. The outline of Procedure 3 is presented in Fig. 10.
V. Experimental Results
The proposed algorithms are implemented in C on a SUN SPARC ULTRA-60 workstation in a SOLARIS 5.8 environment and applied to various ISCAS'89 benchmark circuits. The test patterns used are provided by the TetraMax tool from Synopsys.
The experimental results for the full scan circuits are presented in TABLE I. The last two columns in this table present the number of test patterns and the number of cycles necessary to test the core, respectively. The test application time for a full scan in serial mode is calculated as $n_f + (n_f + 1) \times T_n$, where $n_f$ is the number of flip-flops and $T_n$ is the number of test patterns.
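As a quick check of the formula, the following Python fragment evaluates it for made-up values (5 flip-flops and 4 patterns, which are not taken from TABLE I). One common reading of the expression, offered here as an interpretation, is that each pattern costs $n_f$ shift cycles plus one capture cycle, and a final $n_f$ cycles shift out the last response.

```python
def serial_scan_cycles(n_f, t_n):
    """Clock cycles for serial full scan: n_f + (n_f + 1) * T_n."""
    return n_f + (n_f + 1) * t_n


# Illustrative values only: 5 flip-flops, 4 test patterns.
cycles = serial_scan_cycles(5, 4)
print(cycles)  # 5 + 6 * 4 = 29
```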
The experimental results for the ISCAS'89 circuits using the proposed algorithm are shown in TABLE II. The columns in the table present the circuit name,
