This paper proposes a scan design for delay fault testability of dual circuits. In normal operation mode, each proposed scan flip flop operates as a master-slave flip flop. In test mode, the proposed scan design performs scan operation using two scan paths, namely the master scan path and the slave scan path. The master scan path consists of master latches and the slave scan path consists of slave latches. In the proposed scan design, arbitrary two-patterns can be set to the flip flops of dual circuits. Therefore, it achieves complete fault coverage for robust and non-robust testable delay fault testing. Unlike the enhanced scan design, it requires no extra latch, and thus its area overhead is low. The evaluation shows that the test application time of the proposed scan design is 58.0% of that of the enhanced scan design, and that its area overhead is 13.0 percentage points lower than that of the enhanced scan design. In addition, in testing of single circuits, it achieves complete fault coverage for robust and non-robust testable delay fault testing, and it requires a smaller test data volume than the enhanced scan design.
key words: dual circuits, master and slave scan paths, delay fault testing, concurrent error detection, DFT
Introduction
As technology advances into the deep-submicron regime, designs are becoming increasingly sensitive to various noise sources [1], [2]. Excessive noise can cause performance degradation and signal integrity problems. It can corrupt system-level data integrity and can also significantly affect timing performance.
Concurrent error detection techniques with dual circuits are expected to ensure the data integrity of today's deep-submicron devices. Fault detection in such designs can be done by comparing the output responses of the dual circuits when identical input sequences are applied to them. (In this paper, single circuits mean conventional circuits without redundant circuits for fault detection.) Many related works have been presented. Mitra et al. quantitatively analyzed the design diversity metric and the reliability of concurrent error detection [3]. They also presented combinational logic synthesis for diversity in concurrent error detection [4]. Recently, [5] presented a soft error masking technique using a duplicated circuit. Pomeranz et al. analyzed the nature of fault recovery in concurrent online testing [6], and online transition fault testing using identical circuits [7]. The author's group proposed concurrent online testing approaches using an embedded reconfigurable core [8], [9]. The number of transistors in dual circuits is at least twice that of a single circuit, which increases test costs. Therefore, test cost reduction is more important for dual circuits than for single circuits.
It is now widely accepted that stuck-at fault testing can no longer satisfactorily test the functionality of fabricated integrated circuits in nanometer technologies. Unfortunately, traditional functional at-speed testing suffers from huge test development costs and limited effectiveness. Furthermore, limited test access to internal registers makes the application of at-speed functional tests impractical. Therefore, scan-based structural delay fault testing, which can significantly improve controllability and observability, appears to be the practical approach to delay fault testing. Previous works related to scan-based delay fault testing are as follows.
Broad-side testing, skewed-load testing, and enhanced scan testing are well-known scan-based delay fault testing techniques [10], [11]. Broad-side and skewed-load testing use the standard scan design, and thus their area overheads are not high. However, their fault coverage is low because they permit only strongly restricted test patterns to be applied to the circuit under test (CUT). The enhanced scan design achieves complete fault coverage [10]. However, its scan flip flops require additional redundant latches, which increase the area overhead. A scan design whose flip flops support delay fault testing without extra latches was proposed in [12]. The important difference between its scan flip flop and the standard scan flip flop is the manner of connection to the adjacent scan flip flops. This scan design is almost equivalent to the enhanced scan design from the viewpoint of fault coverage, test application time, and required ATE memory size, in spite of having the same area overhead as the standard scan design. However, none of these scan designs considers the testability of dual circuits.
Dual circuits consume at least twice the area of the original circuit. Therefore, chip designers generally decide which cores or modules should be duplicated by considering the critical parts and the limited area. Accordingly, the scan design applied to the chip should be able to test both dual circuits and single circuits. This paper proposes a scan design for delay fault testability of dual circuits. In test mode, the proposed scan design performs scan operation using scan chains with two scan paths, namely the master scan path and the slave scan path. This scan architecture increases the delay fault testability of dual circuits. In testing of single circuits, it achieves complete fault coverage for robust and non-robust testable delay fault testing. In addition, it requires a smaller test data volume than the enhanced scan design in testing of single circuits.
Copyright © 2009 The Institute of Electronics, Information and Communication Engineers
The rest of this paper is organized as follows. Section 2 presents the detail of the proposed scan design. Section 3 evaluates the effectiveness of the proposed scan design. Section 4 shows how to apply the proposed scan design to single circuits. Finally, Sect. 5 concludes this paper.
Scan Design for Delay Fault Testability of Dual Circuits
This section explains the proposed scan design. First, Sect. 2.1 describes an overview of the proposed scan design. Section 2.2 explains the architecture of the proposed flip flop and the scan design. Section 2.3 explains the operation modes of the proposed scan design. Section 2.4 shows the delay fault testing using the proposed scan design.
Overview
Figure 1 shows dual circuits for concurrent error detection using the proposed scan design. It consists of two identical combinational circuits C_i (i = 0, 1), two sets of scan flip flops, FF_00, FF_10, ···, FF_0i, FF_1i, ···, FF_0(n−1), FF_1(n−1), and an error checker E. The combinational circuit C_i includes l primary inputs, n state inputs, y_i0, ···, y_i(n−1), and n state outputs, z_i0, ···, z_i(n−1). The error checker circuit E has inputs from z_ij and f_ij, and an output line Err. In this figure, the input lines from f_ij are omitted to save space. The output Err is activated if a fault occurs.
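The comparison performed by the error checker E can be sketched as follows. This is only a minimal illustration under stated assumptions, not the checker implementation of the paper: the function name `err` and the list-based signal representation are hypothetical, and the two arguments stand for the lines that E observes from the two circuit copies.

```python
from typing import Sequence


def err(lines_c0: Sequence[int], lines_c1: Sequence[int]) -> bool:
    """Return True (Err activated) if the two copies disagree on any line.

    lines_c0 / lines_c1 are hypothetical containers for the values that
    the checker observes from C_0 and C_1 under identical inputs.
    """
    if len(lines_c0) != len(lines_c1):
        raise ValueError("both copies must expose the same number of lines")
    # Duplication with comparison: any bitwise mismatch signals a fault.
    return any(a != b for a, b in zip(lines_c0, lines_c1))
```

With identical responses the function returns False, and a single differing bit is enough to activate the error signal.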
Architecture of Proposed Flip Flop
The proposed scan flip flop FF_ij consists of a master latch L_Mij and a slave latch L_Sij. Both are positive latches, which capture their input values when the clock signal is 1 and keep the captured values when the clock signal is 0. Figure 2 illustrates the detail of the proposed scan design. In this figure, the 2n proposed flip flops, FF_00, FF_10, ···, FF_0(n−1), FF_1(n−1), are arranged to coincide with those in Fig. 1. As shown by the two dotted lines in the figure, two scan paths, called the master scan path and the slave scan path, are constructed from only master latches and only slave latches, respectively. The master (slave) scan path has a scan input MSCI (SSCI), a scan output MSCO (SSCO), and 2-to-1 selectors. The inputs, D_00, D_10, ···, D_0(n−1), D_1(n−1), are connected from y_00, y_10, ···, y_0i, y_1i, ···, y_0(n−1), y_1(n−1) in Fig. 1, and the outputs, Q_00, Q_10, ···, Q_0i, Q_1i, ···, Q_0(n−1), Q_1(n−1), are connected to z_00, z_10, ···, z_0i, z_1i, ···, z_0(n−1), z_1(n−1). A one-bit control signal SE controls the two 2-to-1 selectors. The four clocks, CLK_M0, CLK_S0, CLK_M1, and CLK_S1, control L_M0i, L_S0i, L_M1i, and L_S1i, respectively. Note that these clocks are independent of each other.
Operations in Proposed Scan Design
The proposed scan design has two operation modes, normal operation mode and test mode.
In normal operation mode, SE = 0, and each flip flop FF_ij operates as a master-slave flip flop. The clocks for the master latches, CLK_M0 and CLK_M1, are driven by the same signal, while the clocks for the slave latches, CLK_S0 and CLK_S1, are driven by the inverse of the master-latch clock.
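The normal-mode behavior can be illustrated with a small behavioral model of two positive latches driven by inverted clocks. This is a sketch under simplifying assumptions (zero-delay latches, one call per clock phase); the class and method names are hypothetical and do not come from the paper.

```python
class PositiveLatch:
    """Level-sensitive latch: transparent when clk == 1, holds when clk == 0."""

    def __init__(self) -> None:
        self.q = 0

    def update(self, d: int, clk: int) -> int:
        if clk == 1:
            self.q = d
        return self.q


class MasterSlaveFF:
    """Master-slave flip flop built from two positive latches.

    The slave latch is clocked with the inverse of the master clock,
    as described for normal operation mode.
    """

    def __init__(self) -> None:
        self.master = PositiveLatch()
        self.slave = PositiveLatch()

    def step(self, d: int, clk_m: int) -> int:
        # Master sees clk_m, slave sees the inverted clock.
        self.master.update(d, clk_m)
        return self.slave.update(self.master.q, 1 - clk_m)
```

In this model the output changes only in the phase where the master clock is 0, that is, when the slave latch becomes transparent and the master latch holds its value.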
In test mode, SE = 1, and the proposed scan design performs scan operation using two scan paths, namely the master scan path and the slave scan path. Each pair {L_M0i, L_M1i} and {L_S0i, L_S1i} (0 ≤ i ≤ n − 1) plays the role of a master-slave flip flop for scan operation during test mode. Thus, CLK_M0 and CLK_M1, which control L_M0i and L_M1i, respectively, should be complementary. For the same reason, CLK_S0 and CLK_S1 should be complementary. Because of these clock controls, the value stored in a closed latch is equal to the one stored in the next opened latch. In testing with the proposed scan design, the scan operations of the master and slave scan paths are performed simultaneously, which halves the scan-in and scan-out time.
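The shift behavior of one scan path in test mode can be sketched as a two-phase latch chain. The sketch assumes the latch order L_0,0, L_1,0, L_0,1, L_1,1, ··· along the path and assumes that the latches of circuit 0 open in the first half cycle and those of circuit 1 in the second; the function name and this phase order are illustrative assumptions, not details taken from the paper.

```python
from typing import List


def shift_scan_path(bits: List[int], n: int) -> List[int]:
    """Shift `bits` into a scan path of 2n latches with complementary clocks.

    Even positions model latches of circuit 0, odd positions latches of
    circuit 1. One loop iteration corresponds to one full clock cycle.
    """
    chain = [0] * (2 * n)
    for b in bits:
        # First half cycle: even latches are transparent, odd latches hold.
        for k in range(0, 2 * n, 2):
            chain[k] = b if k == 0 else chain[k - 1]
        # Second half cycle: odd latches are transparent, even latches hold.
        for k in range(1, 2 * n, 2):
            chain[k] = chain[k - 1]
    return chain
```

After n clocks, each pair of adjacent latches holds the same bit, so both copies of the dual circuit receive the same n-bit vector. For example, shifting [1, 0] into a path with n = 2 leaves the chain in the state [0, 0, 1, 1].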
Delay Fault Testing in Proposed Scan Design
In this subsection, delay fault testing with the proposed scan design is explained. First, we explain the sequence from the scan-in operation of a test vector to the scan-out operation of the test response. The test sequence is divided into the following four steps.
Step 1 After SE is set to test mode, the initial vector and the transition vector of a two-pattern test are scanned into the slave scan path from SSCI and into the master scan path from MSCI, respectively.
Step 2 After SE is set to normal operation mode, transitions are launched into the circuit under test. One clock later, the test responses are captured into the corresponding master latches.
Step 3 The test responses captured into L_M1i are transferred to the corresponding slave latches, L_S1i.
Step 4 Finally, SE is set to test mode again. After that, the test responses are retrieved from MSCO and SSCO.
Figure 3 shows the timing chart when n = 2 in Fig. 2. In Step 4, the scan-out operations of the master scan path and the slave scan path are performed simultaneously. For this parallel scan-out operation, only the test responses captured into L_M1i are transferred to the corresponding slave latches, L_S1i, in Step 3. Therefore, as shown in the timing chart, CLK_S1 is open while CLK_S0 is closed during the first 0.5 clock cycle of Step 3. To make the phase of the master scan path the same as that of the slave scan path for the scan-out operation in Step 4, CLK_M1 is opened and the other clocks are closed during the last 0.5 clock cycle of Step 3.
When the scan length is 2n, Step 1 and Step 4 each require n clocks, while Step 2 requires a constant 2 clocks and Step 3 requires a constant 1.5 clocks. If these steps are executed sequentially, the test application time is T_c = N_tp (2n + 3.5), where N_tp is the number of test patterns. However, Step 4 and Step 1 for the next test pattern can be executed simultaneously. Therefore, the test application time T_c is formulated as follows.
$$T_c = N_{tp}\,(n + 3.5) + n \tag{1}$$
In Step 1 and Step 4, the scan operations of the master and slave scan paths are performed simultaneously. However, because the clock controls of these two scan paths are independent of each other, high accuracy of the clock synchronization between the master scan path and the slave scan path is not required. On the other hand, high accuracy of the clock synchronization of CLK_M0 with CLK_M1 for the scan operations of the master scan path, and of CLK_S0 with CLK_S1 for the scan operations of the slave scan path, should be guaranteed. In addition, in Step 2, Step 3, and normal operation mode, high accuracy of the clock synchronization of CLK_M0 with CLK_S0, and of CLK_M1 with CLK_S1, should be guaranteed.
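The effect of overlapping scan-out with the next scan-in on the test application time can be checked numerically. In the sketch below, the sequential count N_tp (2n + 3.5) is taken from the text, while the overlapped count is derived under the assumption that every scan-out except the last one is hidden behind the scan-in of the next pattern. The function names and the example values of N_tp and n are hypothetical.

```python
def tat_sequential(n_tp: int, n: int) -> float:
    """Clocks needed when Steps 1 to 4 run back to back for every pattern."""
    return n_tp * (2 * n + 3.5)


def tat_overlapped(n_tp: int, n: int) -> float:
    """Clocks needed when Step 4 overlaps Step 1 of the next pattern.

    Each pattern costs n (scan-in) + 2 (Step 2) + 1.5 (Step 3) clocks,
    and only the final scan-out adds another n clocks.
    """
    return n_tp * (n + 3.5) + n


if __name__ == "__main__":
    # Hypothetical example: 100 patterns, n = 10 flip flops per circuit.
    print(tat_sequential(100, 10))
    print(tat_overlapped(100, 10))
```

For long scan paths the constant 3.5 clocks per pattern become negligible, so the overlapped schedule approaches half of the sequential one.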
Evaluation
The proposed scan design is evaluated in the following subsections. In the evaluation, the proposed scan design is compared with the three conventional scan designs: the standard scan design, the enhanced scan design, and the Chiba scan design [12] . The Chiba scan flip flop has similar architecture to the proposed scan flip flop. The difference is the number of scan paths. The Chiba scan design has only the master scan path in test mode, while the proposed scan design has the master scan path and the slave scan path. Section 3.1 evaluates the test application time of the proposed scan design. Section 3.2 evaluates the area of the proposed flip flop and the circuits with the proposed scan design. Section 3.3 evaluates the clock accuracy for the scan operation.
Test Application Time
In the evaluation, the increase ratio of test application time, TATO, is introduced. It is calculated by the following formula:
$$\mathrm{TATO} = \frac{\text{TAT with the evaluated scan design}}{\text{TAT with ES}}, \tag{2}$$
where TAT is the test application time, and ES is the enhanced scan design. Delay fault test data sets for robust testable delay faults are generated by an ATPG implemented in the C language. In this evaluation, the checker circuit E is not included in the circuit under test. All scan designs are evaluated with the same number of scan channels, which is two. The routing of the scan chains of the proposed scan design is shown in Fig. 2. The evaluation results of the standard scan design and the Chiba scan design depend on the routing of the scan chains. The circuits evaluated with the Chiba scan design have one scan path routed via FF_00, FF_10, ···, FF_0(n/2−1), FF_1(n/2−1) and the other scan path routed via FF_0(n/2), FF_1(n/2), ···, FF_0(n−1), FF_1(n−1) sequentially. This routing permits the application of arbitrary two-patterns, as in the proposed scan design. The circuits evaluated with the standard scan design have one scan chain routed via FF_00, ···, FF_0(n−1) and the other scan chain routed via FF_10, ···, FF_1(n−1) sequentially. This routing permits each of the dual circuits to be tested as a single circuit using both the broad-side and the skewed-load testing methods. On the other hand, the evaluation results of the enhanced scan design do not depend on the routing of the scan chains.
Table 1 shows the evaluation results on ISCAS89 benchmarks. The "# of RTPDF" column shows the number of robust testable paths. To obtain this number, each path in the path list is checked for robust sensitizability using the ATPG algorithm with an infinite backtrack limit before test generation.
The TAT and TATO columns show the test application time and the increase ratio of the test application time, respectively. The FC column shows the fault coverage. The subcolumns SS, ES, CS, and PS show the results of the standard scan design, the enhanced scan design, the Chiba scan design, and the proposed scan design, respectively. The test application time of the proposed scan design is shorter than that of the enhanced scan design, because both the first pattern and the second pattern can be scanned in simultaneously in the proposed scan design. According to the evaluation results, it is 58.0% of that of the enhanced scan design on average. On the other hand, because the test responses stored in either the even-numbered flip flops or the odd-numbered flip flops are retrieved in the Chiba scan design, its test application time is longer than that of the enhanced scan design on average. According to the evaluation results, it is 112.4% of that of the enhanced scan design on average. The test application time of the standard scan design is 43.0% of that of the enhanced scan design, which is 15.0 percentage points lower than that of the proposed scan design.
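The ratio used in this comparison can be reproduced with a few lines of arithmetic. The helper below is a sketch: the function name `tato_percent` and the cycle counts in the example are hypothetical, while the averages 58.0% and 43.0% are the values reported in the text.

```python
def tato_percent(tat_evaluated: float, tat_enhanced: float) -> float:
    """Test application time of a scan design relative to enhanced scan, in %."""
    if tat_enhanced <= 0:
        raise ValueError("reference test application time must be positive")
    return 100.0 * tat_evaluated / tat_enhanced


# Hypothetical cycle counts chosen only to illustrate the ratio.
example = tato_percent(580.0, 1000.0)

# Reported averages: proposed scan 58.0%, standard scan 43.0%.
# Their gap is a difference in percentage points, not a relative change.
gap_points = round(58.0 - 43.0, 1)
```

Expressing the gap in percentage points avoids confusing it with a relative reduction of the test application time.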
Arbitrary two-patterns can be applied to each of the dual circuits using the enhanced scan design, the Chiba scan design, and the proposed scan design. Thus, test generation for these scan designs is performed under the same condition, namely that arbitrary two-patterns can be applied to all flip flops. Therefore, under the assumption that a common ATPG algorithm is used, the test data sets for the three scan designs are the same. On the other hand, because the second pattern of the standard scan-based delay fault testing methods depends on the first pattern, arbitrary two-patterns cannot be applied to each of the dual circuits using the standard scan design.
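The dependence of the second pattern on the first in standard scan can be made concrete with a small sketch. It uses the usual textbook view of the two methods: in skewed-load testing the second pattern is a one-bit shift of the first, and in broad-side testing it is the circuit's response to the first. The function names, the shift direction, and the toy next-state function are assumptions made for illustration, and primary inputs are ignored.

```python
from typing import Callable, List

Vector = List[int]


def skewed_load_second(v1: Vector, scan_in_bit: int) -> Vector:
    """Second pattern in skewed-load testing: v1 shifted by one scan position."""
    return [scan_in_bit] + v1[:-1]


def broadside_second(v1: Vector, next_state: Callable[[Vector], Vector]) -> Vector:
    """Second pattern in broad-side testing: the next state computed from v1."""
    return next_state(v1)


def toy_next_state(state: Vector) -> Vector:
    """Hypothetical next-state function that inverts every state bit."""
    return [1 - b for b in state]
```

In both cases only one bit (skewed-load) or no bit at all (broad-side) of the second pattern can be chosen freely once the first pattern is fixed, which is why some two-patterns cannot be applied.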
Therefore, the enhanced scan design, the Chiba scan design, and the proposed scan design achieve complete fault coverage, while the standard scan design does not. In this evaluation, the fault coverage of the enhanced scan design, the Chiba scan design, and the proposed scan design is 100.0%, while the average fault coverage of the standard scan design is 78.7%.
The area overhead of the scan design is calculated by the following formula:
$$\frac{\text{Area of dual circuits with the evaluated scan design}}{\text{Area of dual circuits with no scan design}} - 1.$$
Note that in this formula, the area of the circuits does not include the error checker circuit E. Each flip flop is implemented with the standard cells of the Rohm 0.35 µm design rule. Table 2 shows the evaluation results of the area overhead of the flip flops. The AR and AO columns show the area of each flip flop and the area overhead, respectively. The area overheads of the standard scan flip flop, the Chiba scan flip flop, the proposed scan flip flop, and the enhanced scan flip flop are 25.0%, 18.0%, 42.9%, and 64.4%, respectively. Therefore, the area overhead of the proposed scan flip flop is 17.9 percentage points larger than that of the standard scan flip flop, 24.9 percentage points larger than that of the Chiba scan flip flop, and 21.5 percentage points smaller than that of the enhanced scan flip flop.
Table 3 shows the evaluation results of the area overhead of the scan designs on ISCAS89 benchmarks. The ES, CS, and PS columns show the evaluation results of the enhanced scan design, the Chiba scan design, and the proposed scan design, respectively. Each evaluation result column is divided into two subcolumns. The AR subcolumn shows the area with the evaluated flip flops. The AO subcolumn shows the area overhead. The area overheads of the Chiba scan design, the proposed scan design, and the enhanced scan design are 11.7%, 27.6%, and 40.6% on average, respectively. Therefore, the area overhead of the proposed scan design is 15.9 percentage points larger than that of the Chiba scan design, and 13.0 percentage points smaller than that of the enhanced scan design.
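The area overhead figures can be cross-checked with simple arithmetic. In this sketch the function name `area_overhead` and the areas passed to it are hypothetical, whereas the overhead percentages are the ones reported in the text; the differences are computed in percentage points.

```python
def area_overhead(area_with_scan: float, area_without_scan: float) -> float:
    """Area overhead as defined in the text: ratio of the two areas minus 1."""
    if area_without_scan <= 0:
        raise ValueError("reference area must be positive")
    return area_with_scan / area_without_scan - 1.0


# Reported flip flop overheads in percent.
ff_overhead = {"standard": 25.0, "chiba": 18.0, "proposed": 42.9, "enhanced": 64.4}
# Reported average scan design overheads in percent.
design_overhead = {"chiba": 11.7, "proposed": 27.6, "enhanced": 40.6}

# Differences in percentage points, rounded to one decimal place.
ff_vs_standard = round(ff_overhead["proposed"] - ff_overhead["standard"], 1)
ff_vs_chiba = round(ff_overhead["proposed"] - ff_overhead["chiba"], 1)
ff_vs_enhanced = round(ff_overhead["enhanced"] - ff_overhead["proposed"], 1)
design_vs_chiba = round(design_overhead["proposed"] - design_overhead["chiba"], 1)
design_vs_enhanced = round(design_overhead["enhanced"] - design_overhead["proposed"], 1)
```

The computed differences match the values quoted for Tables 2 and 3, which confirms that they are differences between overhead percentages rather than relative area changes.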
Clock Accuracy
As described in Sect. 2.4, the proposed scan design requires high accuracy of the clock synchronization for the scan operation of each scan path. Here, to estimate the required clock accuracy quantitatively, we calculate the clock margins of the master scan path and the slave scan path. In the proposed scan design, two clock lines control the scan operation of a scan path. One controls the latches that act as master latches of the scan path, and the other controls those that act as slave latches. For example, in the case of the master scan path of Fig. 2, CLK_M0 controls the master latches, and CLK_M1 controls the slave latches. Ideally, there is no relative delay between CLK_M0 and CLK_M1, but in practice a relative delay does occur. In this evaluation, we analyze how much relative delay is permitted for correct scan operation. We define the relative delay ΔT by the following formula:
where T_CLK_M is the propagation delay of the clock for the master latches, and T_CLK_S is that for the slave latches. We assume that the propagation delay of the clock for the master latches is fixed in this evaluation. For the evaluation, we define the clock margin mrg by the following formula:
$$\mathit{mrg} = \Delta T_{max} - \Delta T_{min}, \tag{3}$$
where ΔT_max is the maximum relative delay of the slave clock for which the scan path operates correctly, and ΔT_min is the minimum such relative delay. If the relative delay is between ΔT_min and ΔT_max, the scan path works; otherwise, it does not.
For the calculation, we measure ΔT_max and ΔT_min with SPICE simulation of scan paths comprising six proposed flip flops. The technology used in this evaluation is Rohm 0.35 µm. For the measurement, a repetitive 4-bit scan-in sequence "1000" is shifted in from the scan input continuously. We search for ΔT_max and ΔT_min by sweeping the relative delay of the slave clock in steps of 0.02 ns and monitoring the output value of each latch of the measured scan path. Table 4 shows the results. The column freq. shows the clock frequency. The columns "master scan path" and "slave scan path" show the evaluation results of each scan path. They have subcolumns ΔT_max, ΔT_min, and margin. The column margin shows the clock margin calculated by Eq. (3). The results show that, at all the evaluated clock frequencies, ΔT_max of the master scan path and of the slave scan path is 0.16 ns and 0.14 ns, respectively, and ΔT_min is constant at −0.16 ns and −0.14 ns, respectively. Therefore, at every evaluated clock frequency, the clock margins of the master scan path and the slave scan path are 0.32 ns and 0.28 ns, respectively. The maximum delay of an inverter is 0.041 ns, and thus the permitted relative delays of the master scan path and the slave scan path correspond to the delays of about seven inverters and six inverters, respectively. The scan paths must be routed according to these clock margins.
Figure 4 shows the waveforms of a SPICE simulation of the master scan operation when the relative delay is around ΔT_max or ΔT_min. The clock frequency is 100 MHz. The relative delays of waveforms (a) and (b) are ΔT_min = −0.16 ns and −0.18 ns, respectively. On the other hand, the relative delays of waveforms (c) and (d) are ΔT_max = 0.16 ns and 0.18 ns, respectively. Each waveform represents a data signal.
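The margin figures and their inverter equivalents follow directly from the measured values. The sketch below recomputes them; the function names are hypothetical, and the margin is taken as the difference between the maximum and minimum permitted relative delays, which is consistent with the reported 0.32 ns and 0.28 ns.

```python
import math


def clock_margin(dt_max_ns: float, dt_min_ns: float) -> float:
    """Clock margin in ns: width of the permitted relative-delay window."""
    return dt_max_ns - dt_min_ns


def inverter_equivalents(margin_ns: float, inverter_delay_ns: float = 0.041) -> int:
    """Number of whole inverter delays that fit into the margin."""
    return math.floor(margin_ns / inverter_delay_ns)


master_margin = clock_margin(0.16, -0.16)  # master scan path
slave_margin = clock_margin(0.14, -0.14)   # slave scan path
```

Rounding down is used here because a partial inverter delay beyond the margin would already violate the permitted window.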
The lines L_M10, L_M11, and L_M12 are the output waveforms of the master latches. These latches work as the slave latches during the scan operation. The line MSCI is the waveform of the master scan input.
Both waveforms (a) and (c) are normal. The scan-in data 1 is shifted from L_M10 to L_M12 correctly. The three horizontal arrows show the pulse widths of the waveforms of L_M10, L_M11, and L_M12. The pulse widths of these waveforms are equal to the clock width, and they
Delay Fault Test Sequence for Single Circuits
This section explains the delay fault test sequence and its evaluation for the case in which the proposed scan design with two scan paths is applied to single circuits, that is, conventional circuits rather than dual circuits. The delay fault test sequence from the scan-in operation of a test vector to the scan-out operation of the test response is as follows.
Step 1 The odd-numbered bits of an initial vector are scanned-in into slave scan path from SSCI, and the even-numbered bits of the initial vector are scanned-in into master scan path from MSCI simultaneously, after SE is set to test mode.
Step 2 The even-numbered bits of the initial vector stored in master latches are transferred into the corresponding slave latches, after SE is set to normal operation mode.
Step 3 After SE is set to test mode again, a transition vector is scanned in from MSCI.
When the proposed scan design is applied to dual circuits, each flip flop can launch an arbitrary transition. However, when the proposed scan design is applied to single circuits, only either the even-numbered flip flops FF_2i or the odd-numbered flip flops FF_2i+1 can launch transitions into the circuit under test at a time. Nevertheless, under the proposed scan design, at least an arbitrary one-bit transition can be launched. Thus, complete fault coverage of robust and non-robust testable faults is guaranteed [13], [14]. Figure 6 shows the timing chart from Step 1 to Step 6, when n = 4.
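The launch restriction for single circuits can be expressed as a simple check on a two-pattern. The sketch below interprets the restriction as follows: all flip flops whose value changes between the initial vector and the transition vector must have the same index parity. This interpretation, the function name `launch_parity`, and its return values are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Optional


def launch_parity(v1: List[int], v2: List[int]) -> Optional[str]:
    """Classify a two-pattern (v1, v2) by the parity of its toggling flip flops.

    Returns "even" or "odd" if all toggling bits share that index parity,
    "none" if no bit toggles, and None if both parities toggle, in which
    case the pattern is assumed not to be applicable in a single launch.
    """
    if len(v1) != len(v2):
        raise ValueError("both vectors must have the same length")
    toggling = [k for k, (a, b) in enumerate(zip(v1, v2)) if a != b]
    if not toggling:
        return "none"
    parities = {k % 2 for k in toggling}
    if parities == {0}:
        return "even"
    if parities == {1}:
        return "odd"
    return None
```

Any two-pattern with a single toggling bit passes this check, which matches the statement that an arbitrary one-bit transition can always be launched.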
We evaluate the number of test patterns, the test data volume, the test application time, and the fault coverage when the proposed scan design is applied to single circuits. In this evaluation, the number of test patterns and the test data volume are evaluated using their increase ratios. The increase ratio of the number of test patterns, TPO, is calculated by the following formula:
$$\mathrm{TPO} = \frac{\text{\# of TP with the evaluated scan design}}{\text{\# of TP with ES}},$$
where TP is test patterns, and ES is the enhanced scan design.
The increase ratio of the test data volume, TDVO, is calculated by the following formula:
$$\mathrm{TDVO} = \frac{\text{TDV with the evaluated scan design}}{\text{TDV with ES}},$$
where TDV is the test data volume, and ES is the enhanced scan design. As in the previous evaluation, the proposed scan design is compared with the Chiba scan design and the enhanced scan design. Table 5 shows the evaluation results of the number of test patterns and the test data volume on ISCAS89 benchmarks. The first column shows the name of each benchmark circuit. The second column shows the number of test patterns. The third column, TPO, shows the increase ratio of the number of test patterns. The column TDV shows the test data volume. The column TDVO shows the increase ratio of the test data volume. The rows Max., Ave., and Min. show the maximum, the average, and the minimum TDVO among the evaluated benchmark circuits, respectively.
The columns, # of test pattern, TPO, TDV, TDVO are divided into subcolumns, ES, CS, PS, respectively. The subcolumns ES, CS, PS show the results of the enhanced scan design, the Chiba scan design, and the proposed scan design, respectively.
According to the average value of TDVO, the test data volume of the proposed scan design is smaller than that of the enhanced scan design. This result shows that the proposed scan design has a delay fault test data compression effect. One reason for this effect is that the test data volume required per test is smaller for the proposed scan design than for the enhanced scan design. Therefore, the average value of TDVO of the proposed scan design tends to be smaller than 100.0%. The average value of TDVO of the proposed scan design is 93.4%, while that of the Chiba scan design is 108.1%.
Table 6 shows the evaluation results of the test application time and the fault coverage. The TAT and TATO columns show the test application time and the increase ratio of the test application time, respectively. The FC column shows the fault coverage. The rows Max., Ave., and Min. show the maximum, the average, and the minimum TATO among the evaluated benchmark circuits, respectively. The increase ratio of the test application time is calculated by Eq. (2). The fault coverage is 100.0% for every scan design. The average value of TATO shows that the test application time of the proposed scan design is longer than that of the enhanced scan design and the Chiba scan design. However, the average increase is within 20%.
The results for s35932 shown in Tables 5 and 6 appear peculiar compared with those for the other circuits. For this circuit, the parameters TPO, TDVO, and TATO of the proposed scan design are each more than 1.5 times as large as those of the enhanced scan design. As described above, when the proposed scan design is applied to single circuits, arbitrary two-patterns cannot be applied. Only either the even-numbered flip flops FF_2i or the odd-numbered flip flops FF_2i+1 can launch transitions into the circuit under test at a time. The evaluation results indicate that this restriction increases the test cost of s35932 more than that of the other evaluated circuits.
Conclusions
This paper proposes a scan design for delay fault testability of dual circuits. The most important difference between the conventional scan designs and the proposed scan design is the scan operation using two scan paths, namely the master scan path and the slave scan path. In the proposed scan design, arbitrary two-patterns can be set to the flip flops of the dual circuits. Therefore, it achieves complete fault coverage for robust and non- is applied, the circuits with the proposed scan design are expected to require a smaller compressed data volume than those with the enhanced scan design. This evaluation is left for future work.
