As opposed to scan schemes, a non-scan DFT allows at-speed 
Introduction
A good design-for-testability (DFT) technique should meet the following goals: 1) low test generation time, 2) low test application time, 3) high fault efficiency*, 4) at-speed testing, and 5) low hardware overhead. One approach to DFT is the scan technique. In the full scan technique [1, 2], the test generation problem of a sequential circuit is reduced to that of a combinational one, and the use of combinational ATPG guarantees complete fault efficiency. Partial scan [3, 4] offers lower hardware overhead than full scan, but because it relies on sequential test generation methods, high fault efficiency cannot be achieved. Moreover, scan techniques fail to provide at-speed testing. To avoid the problems of scan techniques, non-scan approaches are proposed in [6] [7] [8] [9]. In [6], some flip-flops are controlled by multiplexers. In [7], DFTs are designed using locally available lines. In [8], non-scan design was targeted only at removing equivalent and isomorph redundancy. In all these approaches, though testing time may be improved, complete fault efficiency cannot be achieved. A non-scan DFT approach with complete fault efficiency using combinational ATPG was first proposed in [9].

* Fault efficiency is the ratio of the number of faults detected or proved redundant by a test algorithm to the total number of faults in a circuit.

This paper suggests three new non-scan DFT techniques for sequential circuits. In the proposed techniques, test sequences for the faults of a sequential machine are found by generating test patterns with a combinational ATPG tool applied to the combinational part of the machine; the use of such an ATPG tool guarantees complete fault efficiency. Each test pattern generated by this tool consists of values on the primary inputs as well as on the state registers, so some test patterns may contain state values that can never be reached by state transitions from the reset state.
To reach such values on the state registers (invalid states), we propose appending to the original machine an extra logic block called the circuit to reach invalid states (CRIS). Among the three techniques, the first one requires k additional observable points (k is the number of flip-flops in the circuit). The use of one more additional circuit, called the differentiating logic (DL), greatly reduces the number of additional observable points in the second and third techniques. The DL part of the proposed additional hardware is universal (i.e., independent of the original machine). To increase the testability of PLA-based machines, the work in [11] also appends additional hardware. However, the hardware overhead in [11] is higher and it depends on the original machine. The three proposed techniques, the method in [9] and the full scan technique are compared on benchmarks. The first two techniques have low hardware overhead. The first and third techniques have low test application time. The test application time of the second technique is larger than those of the first and third, but it is less than that of full scan. Test length and hardware overhead are found to compare favorably with those of scan and previous non-scan approaches.
Preliminaries
The general model of a synchronous sequential machine is shown in Fig. 1 
New DFT designs
From the STG of the machine, we first find the sets of valid and invalid states. Then we use a combinational ATPG algorithm to find the set of test vectors of the CTGM. If the state part of such a test vector is a valid test state, it can be reached from the reset state. But if it is an invalid test state (i.e., it is unreachable from the reset state), the values of the PPIs cannot be set in the SR using state transitions of the machine. Initialization to an unreachable state is a major problem in test generation for sequential circuits. In our designs, we adopt a new technique to reach these unreachable states. Notice that to test a circuit we need not reach all invalid states; reaching only the invalid test states is sufficient.
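The classification of test states described above can be sketched as a reachability search over the STG. This is a minimal illustration, not the authors' tool: the dictionary encoding of the STG, the function names, and the 2-flip-flop example machine are all hypothetical.

```python
from collections import deque

def reachable_states(stg, reset_state):
    """Return the set of valid states: those reachable from the reset state.

    `stg` is an assumed encoding: {state: {input_pattern: next_state}}.
    """
    seen = {reset_state}
    queue = deque([reset_state])
    while queue:
        state = queue.popleft()
        for next_state in stg.get(state, {}).values():
            if next_state not in seen:
                seen.add(next_state)
                queue.append(next_state)
    return seen

def split_test_states(stg, reset_state, test_states):
    """Split the state parts of ATPG test vectors into valid and invalid test states."""
    valid = reachable_states(stg, reset_state)
    valid_tests = [s for s in test_states if s in valid]
    invalid_tests = [s for s in test_states if s not in valid]
    return valid_tests, invalid_tests

# Hypothetical machine with two flip-flops; state "11" has no incoming path from reset.
stg = {
    "00": {"0": "00", "1": "01"},
    "01": {"0": "10", "1": "00"},
    "10": {"0": "00", "1": "01"},
}
valid_tests, invalid_tests = split_test_states(stg, "00", ["01", "11", "10"])
```

Only the states returned in `invalid_tests` would need to be produced by the extra logic; invalid states that never appear in a test vector can be ignored.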
The first technique
In our design, to set the invalid test states in the SR, we append to the original machine an extra logic block called CRIS (circuit to reach invalid states), which generates all invalid test states of the machine. The DFT scheme is shown in Fig. 3. The inputs of CRIS are the next state lines of the original machine. In a similar recent approach [9], an additional circuit was also used to reach these invalid test states, but there the primary inputs are used as inputs to the extra logic.
The second technique
The drawback of the first technique is that it requires k additional observable points. To reduce the number of observable points, we use an additional circuit known as the differentiating logic (DL). The complete scheme is shown in Fig. 5.
(a) Design of DL: Two cases need to be considered.

Case 1: k < n. In this case, DL has one output, given by

$$F = x_1 y_1 + \bar{x}_1 \bar{y}_1 + x_2 y_2 + \bar{x}_2 \bar{y}_2 + \cdots + x_k y_k + \bar{x}_k \bar{y}_k .$$

The circuit to realize F is shown in Fig. 6. The function F has a unique property. For every combination of $(y_1, y_2, \ldots, y_k)$, $y_i \in \{0, 1\}$, the subfunction contains a unique pattern in the $x_i$s, such that for a pattern $(y_1, y_2, \ldots, y_k)$ at the PPIs, if we apply a pattern X at the PIs with $(x_1, x_2, \ldots, x_k) = (\bar{y}_1, \bar{y}_2, \ldots, \bar{y}_k)$, the output of DL is 0, and for any other pattern at the PIs the output is at logic 1. It implies that if the machine reaches a state $S_i = (y_1, y_2, \ldots, y_k)$, then this state can be uniquely identified by applying a single input pattern, obtained by complementing each bit of $(y_1, y_2, \ldots, y_k)$. That is, the differentiating sequence of any two states is of unit length.

Case 2: k > n. DL has $r = \lceil k/n \rceil$ outputs, and each output line realizes $F_i$ ($1 \le i \le r$) such that

$$F_{j+1} = x_1 y_{jn+1} + \bar{x}_1 \bar{y}_{jn+1} + x_2 y_{jn+2} + \bar{x}_2 \bar{y}_{jn+2} + \cdots + x_a y_{jn+a} + \bar{x}_a \bar{y}_{jn+a},$$

where $a = n$ for $0 \le j < r-1$, and $a = k - (r-1)n$ for $j = r-1$. If a is found to be 1, then we replace $F_{j+1}$ by $y_{jn+1}$.
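The behaviour of DL can be checked with a small simulation. The sketch below assumes that each term pair of F evaluates to 1 exactly when x_i equals y_i (so an output is 0 only when every x_i is the complement of the corresponding y_i) and that r is the ceiling of k/n; the function names `dl_outputs` and `identify_state` are hypothetical and serve only as illustration.

```python
from math import ceil

def dl_outputs(x, y):
    """Evaluate the DL outputs for PI bits x (length n) and state bits y (length k).

    Assumption: a term pair for bit i is 1 exactly when x[i] == y[i].
    """
    n, k = len(x), len(y)
    r = ceil(k / n)                      # number of DL outputs (1 when k < n)
    outputs = []
    for j in range(r):
        group = y[j * n:(j + 1) * n]     # state bits y_{jn+1} .. y_{jn+a}
        if k > n and len(group) == 1:
            outputs.append(group[0])     # a = 1: the output is the state bit itself
        else:
            outputs.append(int(any(xi == yi for xi, yi in zip(x, group))))
    return outputs

def identify_state(actual, expected, n):
    """Apply one pattern per group and check that the matching DL output confirms `expected`."""
    k = len(expected)
    for j in range(ceil(k / n)):
        group = expected[j * n:(j + 1) * n]
        x = [1 - b for b in group] + [0] * (n - len(group))  # complement, pad unused PIs
        out = dl_outputs(x, actual)[j]
        if k > n and len(group) == 1:
            if out != group[0]:
                return False
        elif out != 0:
            return False
    return True

# Case 1 (k = 2 < n = 3): only the complemented state bits drive the output to 0.
print(dl_outputs([0, 1, 0], [1, 0]))   # [0]
print(dl_outputs([1, 1, 0], [1, 0]))   # [1]
```

For k > n the check takes r input patterns, one per group of state bits, which matches the differentiating sequence length discussed for the second technique.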
(b) DL is universal:
The design of DL depends only on the number of PIs and flip-flops in the circuit, i.e., it is universal and does not depend on the circuit structure. Thus, any fault in DL does not interfere with the original circuit behavior.
(c) Use of Hold mode:
Hold mode is used to identify a state. If a state $(y_1, y_2, \ldots, y_k)$ is expected at the present state lines, we activate hold mode and apply an input (an input sequence in case 2) at the PIs such that each $x_i$ is the complement of the corresponding $y_i$, i.e., $x_i = \bar{y}_i$ for all $i$ ($1 \le i \le n$). If the output of DL is 0, then the state of the machine is identified as the expected state.
(d) Techniques to achieve low test application time:
Because the differentiating sequence is of length $r = \lceil k/n \rceil$, the test application time is greatly reduced compared with full scan, where the corresponding length is n.
(f) Hardware overhead: It equals (2k + r) gates, which is less than that of full scan for r < n-1.
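The two quantities above are easy to tabulate per benchmark. The following sketch assumes r is the ceiling of k/n and counts only the (2k + r) gates stated for the overhead; the function names are hypothetical, and the (n, k) pairs are the #PIs and #FFs values listed for s298 and planet in Table 2.

```python
from math import ceil

def differentiating_length(k, n):
    """Length r of the differentiating sequence for k flip-flops and n primary inputs."""
    return ceil(k / n)

def dl_gate_count(k, n):
    """Gate count (2k + r) quoted in the text for the differentiating logic."""
    return 2 * k + differentiating_length(k, n)

# (n, k) taken from Table 2: s298 has 3 PIs and 8 flip-flops, planet has 7 and 6.
for name, n, k in [("s298", 3, 8), ("planet", 7, 6)]:
    print(name, differentiating_length(k, n), dl_gate_count(k, n))
```

These counts cover the (2k + r) term only; any gates contributed by CRIS or by the register of the third technique would be added on top.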
The third technique
The drawback of the second technique is that, because the observable points use the present state lines, we cannot use the same justification sequence for different faults that have the same initialization state. To avoid this, the third technique is proposed, where a register R is used to load the values of the next state lines and the outputs of R are fed into DL. The complete scheme is shown in Fig. 8. The use of hold mode is similar to that of the first technique.
Experimental Results
The general performance of the DFT designs is summarized in Table 1. Rows "scan", "ATS-98", "case1", "case2" and "case3" represent full scan, the method in [9], technique 1, technique 2 and technique 3 respectively. O(ISG) and O(CRIS) indicate the overhead of the invalid state generator (ISG) of [9] and that of the CRIS of this paper respectively. It is found experimentally that O(CRIS) < O(ISG). On the MCNC benchmarks, O(CRIS) was found to be at most two two-input gates. The value c denotes the number of control inputs needed for CRIS, and r equals $\lceil k/n \rceil$, where n and k are the numbers of PIs and flip-flops in the machine respectively. For most benchmarks, r is found to be 1 and c is 1, except in two cases, where it is found to be 2.

Experimental results on benchmarks are also shown. Benchmark specifications are shown in Table 2. The circuits are synthesized from the MCNC benchmarks [10] with the AutoLogic II (Mentor Graphics) tool. Columns "name", "#PIs", "#POs", "#states" and "#FFs" denote the name and the numbers of PIs, POs, states and flip-flops of the original sequential machines respectively. In the benchmark results, we show only those cases where the number of inputs n > 1. For n = 1, we apply only the first technique of our DFT designs.

Table 2: Benchmark specifications.

| name | #PIs | #POs | #states | #FFs |
|---|---|---|---|---|
| dk15 | 3 | 5 | 4 | 2 |
| dk16 | 2 | 3 | 27 | 5 |
| dk17 | 2 | 3 | 8 | 3 |
| ex1 | 9 | 19 | 20 | 5 |
| ex2 | 2 | 2 | 19 | 5 |
| ex3 | 2 | 2 | 10 | 4 |
| ex4 | 6 | 9 | 14 | 4 |
| ex5 | 2 | 2 | 9 | 4 |
| ex6 | 5 | 8 | 8 | 3 |
| ex7 | 2 | 2 | 10 | 4 |
| keyb | 7 | 2 | 19 | 5 |
| kirkman | 12 | 6 | 16 | 4 |
| lion | 2 | 1 | 4 | 2 |
| lion9 | 2 | 1 | 9 | 4 |
| mc | 3 | 5 | 4 | 2 |
| opus | 5 | 6 | 10 | 4 |
| planet | 7 | 19 | 48 | 6 |
| planet1 | 7 | 19 | 48 | 6 |
| pma | 8 | 8 | 24 | 5 |
| s1 | 8 | 6 | 20 | 5 |
| s1488 | 8 | 19 | 48 | 6 |
| s1494 | 8 | 19 | 48 | 6 |
| s208 | 11 | 2 | 18 | 5 |
| s27 | 4 | 1 | 6 | 3 |
| s298 | 3 | 6 | 218 | 8 |
| s386 | 7 | 7 | 13 | 4 |
| s420 | 19 | 2 | 18 | 5 |
| s510 | 19 | 7 | 47 | 6 |
| s820 | 18 | 19 | 25 | 5 |
| s832 | 18 | 19 | 25 | 5 |
| sand | 11 | 9 | 32 | 5 |
| sse | 7 | 7 | 16 | 4 |
| styr | 9 | 10 | 30 | 5 |
| tav | 4 | 4 | 4 | 2 |
| tbk | 6 | 3 | 32 | 5 |
| tma | 7 | 6 | 20 | 5 |
| train11 | 2 | 1 | 11 | 4 |
| train4 | 2 | 1 | 4 | 2 |
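Table 3: Hardware overhead (gates) and pin overhead.

| name | gates: scan | gates: ATS-98 | gates: case1 | gates: case2 | gates: case3 | pins: scan | pins: ATS-98 | pins: case1 | pins: case2 | pins: case3 |
|---|---|---|---|---|---|---|---|---|---|---|
| … | … | … | 0 | 5 | 19 | 3 | 3 | 3 | 2 | 2 |
| opus | 12 | 12 | 1 | 10 | 38 | 3 | 6 | 6 | 3 | 3 |
| planet | 18 | 18 | 1 | 14 | 56 | 3 | 8 | 8 | 3 | 3 |
| planet1 | 18 | 18 | 1 | 14 | 56 | 3 | 8 | 8 | 3 | 3 |
| pma | 15 | 15 | 1 | 12 | 47 | 3 | 7 | 7 | 3 | 3 |
| s1 | 15 | 15 | 1 | 12 | 47 | 3 | 7 | 7 | 3 | 3 |
| s1488 | 18 | 18 | 1 | 14 | 56 | 3 | 8 | 8 | 3 | 3 |
| s1494 | 18 | 18 | 1 | 14 | 56 | 3 | 8 | 8 | 3 | 3 |
| s208 | 15 | 15 | 1 | 12 | 47 | 3 | 7 | 7 | 3 | 3 |
| s27 | 9 | 9 | 1 | 8 | 29 | 3 | 5 | 5 | 3 | 3 |
| s298 | 24 | 165 | 1 | 20 | 76 | 3 | 10 | 10 | 5 | 5 |
| s386 | 12 | 12 | 1 | 10 | 38 | 3 | 6 | 6 | 3 | 3 |
| s420 | 15 | 15 | 1 | 12 | 47 | 3 | 7 | 7 | 3 | 3 |
| s510 | 18 | 18 | 1 | 14 | 56 | 3 | 8 | 8 | 3 | 3 |
| s820 | 15 | 15 | 1 | 12 | 47 | 3 | 7 | 7 | 3 | 3 |
| s832 | 15 | 15 | 1 | 12 | 47 | 3 | 7 | 7 | 3 | 3 |
| sand | 15 | 0 | 0 | 11 | 46 | 3 | 6 | 6 | 2 | 2 |
| sse | 12 | 12 | 1 | 10 | 38 | 3 | 6 | 6 | 3 | 3 |
| styr | 15 | 15 | 1 | 12 | 47 | 3 | 7 | 7 | 3 | 3 |
| tav | 6 | 0 | 0 | 5 | 19 | 3 | 3 | 3 | 2 | 2 |
| tbk | 15 | 0 | 0 | 11 | 46 | 3 | 6 | 6 | 2 | 2 |
| tma | 15 | 15 | 1 | 12 | 47 | 3 | 7 | 7 | 3 | 3 |
| train11 | 12 | 16 | 1 | 11 | 39 | 3 | 6 | 6 | 4 | 4 |
| train4 | 6 | 0 | 0 | 5 | 19 | 3 | 3 | 3 | 2 | 2 |

Table 3 shows hardware and pin overhead. The hardware overhead of the first technique is the lowest and is very small. The hardware overhead of both the first and second techniques is smaller than that of full scan. The third technique needs more hardware because an additional register of k flip-flops (k = number of flip-flops) is used. We have counted 7 gates per flip-flop in the third technique. In techniques 2 and 3, the number of gates is 3 less than that given by the formula of Table 1 if the remainder on dividing k by n is 1. The pin overhead of the proposed second and third techniques is the same, and in most cases it equals that of the full scan technique, which is always 3. The first technique requires more pins, the same number as the method of [9]. Test generation and application time for different methods are shown in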
Conclusions
This paper suggests three new non-scan DFT techniques. State initialization is a major problem in testing sequential circuits, and the proposed techniques solve it with additional hardware called CRIS (circuit to reach invalid states). It is found experimentally that the hardware overhead of CRIS is low.
