We present a new test resource partitioning (TRP) 
Introduction
A system-on-a-chip (SOC) integrates several intellectual property (IP) cores. These cores must be tested using precomputed test sets provided by the core vendor. However, increased design complexity leads to higher test data volume for SOCs, which in turn leads to an increase in testing time [1]. One approach to reducing test time, as well as overcoming the memory and I/O limitations of automatic test equipment (ATE), is based on test data compression and on-chip decompression [4, 5, 6, 7, 8]. Test data compression is especially appealing for SOCs with IP cores, for which BIST techniques based on gate-level structural knowledge are not feasible.
Test data compression is an example of a test resource partitioning (TRP) scheme for handling test complexity; see Figure 1. Test data volume and testing time are decreased by using a combination of coding techniques and faster on-chip decompression of encoded test data. The compressed data can be transferred at a slower rate from the ATE to the SOC. This allows the use of low-end ATEs with less memory and slower clock rates.
Reduced pin-count test (RPCT) is a well-known technique for reducing the number of integrated circuit (IC) pins that have to be contacted by the tester [2, 3]. RPCT requires full-scan design and boundary scan access to the chip I/Os. It enables testing of ICs using low-cost ATEs that have fewer functional pin channels than the number of IC I/O pins. The basic idea of RPCT is that only the clock pins, test control pins, scan-in and scan-out pins, and the TDI and TDO pins of the boundary scan chain are driven directly by the tester. The remaining functional pins are accessed through the boundary scan chain. RPCT helps in testing multiple sites simultaneously, i.e., an ATE can test multiple chips in parallel. This can reduce test cost substantially, especially during wafer sort, when not all functional I/O pins are contacted. However, known RPCT schemes do not directly address the problem of test data volume.
* This research was supported in part by the National Science Foundation under grant number CCR-9875324.
We present a new TRP-based RPCT scheme that offers a number of important advantages. It leads to reduced testing time and test data volume, and it requires a smaller number of channels connecting the ATE to the SOC. The proposed scheme, which is based on frequency-directed run-length (FDR) codes for test data compression [8], allows us to readily overcome ATE memory and I/O bandwidth limitations. The reduction of test data volume was demonstrated in earlier work [8]. Here we concentrate on the test application time. We present a detailed testing time analysis to demonstrate that a slower ATE can be used without impacting testing time. Such a rigorous testing time analysis has not been presented earlier for test data compression schemes.
Yet another advantage of the proposed TRP-based RPCT scheme lies in the fact that it can potentially detect more defects than compacted test sets. Although compacted test sets for single stuck-at faults are widely used for testing, it is now recognized that they are not always effective for non-modeled faults and various defect types. In particular, since every modeled fault is detected by fewer patterns in a compacted test set, this approach can lead to reduced coverage of non-modeled faults [9, 10, 11]. It was shown in [10] that $n$-detection test sets, in which every fault is detected by at least $n$ patterns ($n > 1$) and the average number of tests per fault is high, are more effective for detecting non-modeled faults. Since uncompacted tests are applied to the SOC after on-chip decompression, more detections per fault and therefore higher defect coverage are likely with the proposed TRP scheme.
The remainder of the paper is organized as follows. The test data compression and decompression architecture based on FDR codes is reviewed in Section 2. We also present the testing time analysis and the RPCT test architecture in Section 2. In Section 3, we show that test sets used for test data compression provide more $n$-detections and are therefore more likely to detect non-modeled faults than compacted test sets. The experimental results are presented in Section 4, followed by conclusions in Section 5.
TRP using test data compression
We first review FDR coding and its application to test data compression [8]. The FDR code is a data compression code that maps variable-length runs of 0s to variable-length codewords. The FDR code is constructed as follows: The runs of 0s in the data stream are divided into groups $A_1, A_2, \ldots, A_k$, where $k$ is determined by the length $l_{max}$ of the longest run ($2^k - 2 \le l_{max} \le 2^{k+1} - 3$). Note also that a run of length $l$ is mapped to group $A_j$, where $j = \lceil \log_2(l+3) \rceil - 1$. The size of the $i$th group is equal to $2^i$, i.e., $A_i$ contains $2^i$ members. Each codeword consists of two parts: a group prefix and a tail. The group prefix is used to identify the group to which the run belongs and the tail is used to identify the members within the group. The encoding procedure is illustrated in Figure 2. As an example, consider a run of five 0s ($l = 5$), which belongs to group $A_2$ since $j = \lceil \log_2 8 \rceil - 1 = 2$ [...] required.
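The construction reviewed above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the codeword layout of the published FDR code, in which group $A_i$ has a prefix of $i-1$ ones followed by a zero and an $i$-bit tail. The function names `fdr_codeword`, `fdr_encode`, and `fdr_decode_runs` are hypothetical, and the input is assumed to be a fully specified bit string in which every run of 0s is terminated by a 1.

```python
def fdr_codeword(run_length: int) -> str:
    """Return the FDR codeword for a run of `run_length` 0s followed by a 1."""
    if run_length < 0:
        raise ValueError("run length must be non-negative")
    # Group A_i covers run lengths 2**i - 2 .. 2**(i+1) - 3, so
    # i = floor(log2(run_length + 2)) = ceil(log2(run_length + 3)) - 1.
    i = (run_length + 2).bit_length() - 1
    prefix = "1" * (i - 1) + "0"          # assumed prefix layout, i bits long
    offset = run_length - (2 ** i - 2)    # position of the run inside A_i
    tail = format(offset, f"0{i}b")       # i-bit tail
    return prefix + tail


def fdr_encode(bits: str) -> str:
    """Encode a bit string as a concatenation of FDR codewords."""
    out, run = [], 0
    for b in bits:
        if b == "0":
            run += 1
        elif b == "1":
            out.append(fdr_codeword(run))
            run = 0
        else:
            raise ValueError("input must contain only '0' and '1'")
    if run:
        # Simplifying assumption of this sketch: no unterminated trailing run.
        raise ValueError("input must end with a 1")
    return "".join(out)


def fdr_decode_runs(code: str) -> list[int]:
    """Recover the sequence of run lengths from an FDR-encoded string."""
    runs, pos = [], 0
    while pos < len(code):
        i = 1
        while code[pos] == "1":           # count the 1s of the prefix
            i += 1
            pos += 1
        pos += 1                          # skip the 0 that ends the prefix
        tail = code[pos:pos + i]
        pos += i
        runs.append(2 ** i - 2 + int(tail, 2))
    return runs
```

Under these assumptions a run of five 0s is the fourth member of $A_2$ and receives a four-bit codeword, while runs of length 0 and 1 receive two-bit codewords. Shorter runs in lower groups therefore cost fewer bits.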
Since the decoder for FDR coding needs to communicate with the tester, and both the codewords and the decompressed data can be of variable length, proper synchronization must be ensured through careful design. In particular, the decoder must communicate with the tester to signal the end of a block of variable-length decompressed data. These ATE requirements and other related decompression issues are discussed in detail in [6, 4] .
Single scan chain
We first analyze the testing time when a single scan chain is fed by the FDR decoder. Test data compression decreases testing time and allows a low-cost ATE running at a lower frequency to test the core without any penalty on the total testing time. Let the ATE frequency and the on-chip scan frequency be $f_{ATE}$ and $f_{scan}$, respectively, where $f_{scan} = \alpha f_{ATE}$. Since the ATE and the scan chain operate at two different frequencies, the decoder also consists of two parts, one operating at $f_{ATE}$ and the other at $f_{scan}$, with $\alpha \ge 1$. The parameter $\alpha$ should ideally be a power of 2 since it is easier to synchronize the ATE clock with the scan clock for such values of $\alpha$ [12]. If the scan chain has multiple segments operating at different clock frequencies, each segment has a dedicated decoder for test data decompression. Figure 3 outlines the decoder partitioned into two frequency domains. The proposed TRP scheme therefore decouples the internal scan chain(s) from the ATE via the use of a decoder interface. This decoupling implies that the scan clock frequency is no longer constrained by the ATE clock frequency limitation. Thus [...]

Consider the time required to transfer a codeword from the ATE to the chip and the time required to decode it. An upper bound on the total time per codeword can be obtained by assuming that decoding begins only after the complete codeword is transferred from the ATE. This implies that the total time is at most the sum of the transfer time and the decoding time.

For FDR codes, the prefix length and the tail length of a codeword belonging to the $i$th group are each equal to $i$; see Figure 2. Since data is transferred from the ATE to the chip at the tester frequency, the time required to transfer any codeword of the $i$th group is given by $2i / f_{ATE}$.
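The upper-bound reasoning can be made concrete with a simplified timing model in Python. This sketch does not reproduce the paper's equations. It assumes that decoding a run of length $l$ takes $l + 1$ scan clock cycles (one per decompressed bit, including the terminating 1), that the scan clock runs at `alpha` times the ATE clock, and that external scan testing shifts one uncompressed bit per ATE cycle. The function names and the parameter `alpha` are hypothetical.

```python
def group_index(run_length: int) -> int:
    """Index i of the FDR group A_i that contains a run of this length."""
    return (run_length + 2).bit_length() - 1


def upper_bound_time(run_lengths, f_ate_hz: float, alpha: float) -> float:
    """Upper bound (seconds): each codeword is fully transferred, then decoded."""
    f_scan_hz = alpha * f_ate_hz
    total = 0.0
    for l in run_lengths:
        i = group_index(l)
        t_transfer = 2 * i / f_ate_hz      # 2i codeword bits at the ATE rate
        t_decode = (l + 1) / f_scan_hz     # assumed: one output bit per scan cycle
        total += t_transfer + t_decode
    return total


def external_scan_time(run_lengths, f_ate_hz: float) -> float:
    """Time (seconds) to shift the same uncompressed bits directly from the ATE."""
    return sum(l + 1 for l in run_lengths) / f_ate_hz
```

For long runs the codeword length $2i$ grows only logarithmically with the run length. In this model the upper bound therefore falls below the external scan time once the runs are long enough and `alpha` is greater than 1.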
For example, run-length 5 is the fourth member of group $A_2$.

The total time needed to decompress a codeword combines these two terms, and the resulting bound on the testing time is expressed in terms of the size of the encoded test set. Next, to derive a lower bound on the testing time, suppose the tail bits are shifted in while the prefix is being decoded. Since the tail bits are now shifted in parallel with the decoding of the prefix bits, a lower bound on the decoding time follows from (1).

To conclude the analysis, we note that the above bounds allow us to evaluate the testing time without a detailed analysis of the asynchronous handshaking protocol between the tester and the decoder. The exact testing time, which lies between the two bounds, can be determined through a bit-by-bit analysis of the encoded test data. The formulation based on upper and lower bounds allows us to demonstrate the effectiveness of the proposed TRP scheme without resorting to such detailed analysis.
Multiple scan chains
[...] is incremented. When the scan chain is completely loaded, the functional clock is enabled to apply the pattern to the core.
The test set has to be reorganized for the above test architecture before applying test data compression. Let the test pattern [...]
RPCT based on test data compression
RPCT is effective for designs with a small number of scan pins [2]. However, IC designs often incorporate multiple scan chains to reduce testing time. For such designs, RPCT needs to contact all the scan pins and therefore does not provide any advantage. The enhanced reduced pin-count test (E-RPCT) technique solves this problem by loading multiple scan chains in parallel through the boundary scan chain [3]. The boundary scan chain is divided into multiple segments, and these segments are used to carry out the serial-to-parallel conversion while loading the scan chains. Although E-RPCT helps in reducing testing time for cores with multiple scan chains, it does not address the problem of test data volume, as it requires large tester memory for the pins to be contacted. A promising solution is to combine E-RPCT with the proposed TRP scheme.
RPCT based on test data compression is shown in Figure 6. The external tester feeds the encoded test data $T_E$ through the TDI and the scan-in pins. $T_E$ is decompressed on-chip using the decoder and the test patterns are loaded into the boundary-scan chain. The boundary-scan chain is then used to load the internal scan chains in parallel. The boundary-scan chain is divided into multiple segments to shift in the test data serially from the decoder. When the boundary-scan chain is completely loaded, data is transferred to the internal scan chains in parallel. Similarly, the boundary-scan chain is used to load the data to the primary inputs of the core. The test responses are captured in the internal scan chains and the boundary-scan chain and shifted out through the scan-out and TDO pins. The proposed RPCT scheme helps in reducing the test data volume since the ATE stores only the small encoded test set. The testing time is reduced because multiple chips can be tested in parallel using RPCT and because the decoding is done on-chip at a higher clock frequency. The proposed scheme enables the use of low-cost ATEs, thus bringing down the total test cost. Since the scheme uses the boundary-scan chain, which is IEEE 1149.1 compliant, it can be used for carrying out various other tests.
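The serial-to-parallel loading described above can be sketched in Python as follows. This is an illustrative model only. The bit ordering within each group of `m` serial bits (one bit per internal scan chain per parallel transfer) is an assumption, and the function names `serialize_for_parallel_load` and `parallel_load` are hypothetical. The actual ordering depends on how the boundary-scan segments are wired.

```python
def serialize_for_parallel_load(chains: list[str]) -> str:
    """Reorganize per-chain scan data into one serial stream.

    `chains` holds one equal-length bit string per internal scan chain.
    Every consecutive group of m = len(chains) serial bits carries one bit
    for each chain (assumed ordering).
    """
    m = len(chains)
    length = len(chains[0])
    if any(len(c) != length for c in chains):
        raise ValueError("all scan chains must have the same length")
    return "".join(chains[c][k] for k in range(length) for c in range(m))


def parallel_load(stream: str, m: int) -> list[str]:
    """Model the serial-to-parallel conversion done by the boundary-scan chain."""
    if len(stream) % m:
        raise ValueError("stream length must be a multiple of the chain count")
    chains = [[] for _ in range(m)]
    for k in range(0, len(stream), m):
        # One fully loaded group is transferred to all m chains at once.
        for c in range(m):
            chains[c].append(stream[k + c])
    return ["".join(bits) for bits in chains]
```

A reorganized stream of this kind is what would be compressed before being stored on the ATE. On chip, the decoder output is shifted serially and then handed to the internal chains `m` bits at a time.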
Enhanced defect coverage
It has recently been shown that $n$-detection test sets, in which every fault is detected by $n$ ($n > 1$) tests, are more effective in detecting defects that are not modeled by stuck-at faults [9, 10, 11]. In this section, we show that the uncompacted test sets that are applied to the core under test after decompression provide a higher degree of $n$-detection than ATPG-compacted test sets. This in itself is not surprising since a larger number of patterns is now being applied to the core under test. However, when viewed in the context of reduced data volume and testing time, greater $n$-detection (and potentially higher defect coverage) emerges as an important benefit of the proposed TRP scheme. Note that there is no loss in the coverage of modeled faults due to test data compression.
In order to determine the number of times a fault is detected by an uncompacted test set, we used the FSIM fault simulator [14] and test cubes generated using the Mintest ATPG program [13]. The test cubes were encoded using FDR coding, and the resulting decompressed patterns were applied to the larger ISCAS-89 benchmark circuits. Since random-testable faults are usually detected by a large number of patterns, we only considered random-pattern resistant faults, which were left undetected after the application [...], the number of detections is significantly higher for the proposed TRP scheme. Thus increased defect coverage appears to be an added benefit of the test data compression and on-chip decompression scheme.
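The counting step behind this comparison can be sketched in Python. This is a hypothetical post-processing routine, not the interface of FSIM or Mintest. It assumes that a fault simulator has already reported, for each applied pattern, the set of faults that pattern detects. The function names and fault labels are illustrative only.

```python
from collections import Counter


def detections_per_fault(detected_by_pattern) -> Counter:
    """Count how many patterns detect each fault.

    `detected_by_pattern` is an iterable with one set of fault identifiers
    per applied test pattern.
    """
    counts = Counter()
    for faults in detected_by_pattern:
        counts.update(faults)              # each detected fault gets +1
    return counts


def fraction_detected_at_least(counts: Counter, all_faults, n: int) -> float:
    """Fraction of the considered faults detected by at least n patterns."""
    all_faults = list(all_faults)
    hits = sum(1 for f in all_faults if counts.get(f, 0) >= n)
    return hits / len(all_faults)
```

Running both functions on the detection lists of a compacted and an uncompacted test set over the same fault list gives the per-$n$ percentages that such a comparison relies on.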
Experimental results
In this section, we present experimental results on the testing time for the TRP scheme based on FDR coding. The effectiveness of FDR coding for test data volume reduction was shown in [8]; these results are therefore not presented here. We also determine the percentage of single stuck-at faults that are detected multiple times, both for uncompacted and compacted test sets, for the large ISCAS-89 benchmark circuits. The test sets were obtained using the Mintest ATPG program, which is known to yield the most compact test sets for the benchmark circuits.

Table 1 presents the test application time for the proposed method and for traditional scan-based testing performed at the ATE frequency $f_{ATE}$. We note that in all cases the upper bound on the test application time using the proposed scheme is lower than that for scan-based external testing. The actual test application time for the proposed TRP scheme lies between the lower and upper bounds. For example, the test application time for s38584 with [...] and $f_{ATE}$ = [...] MHz lies between 3.002 ms and 6.005 ms, which is lower than the time of 8.052 ms required for external testing.

[...] compacted test patterns to the core under test. We assume a single scan chain for each of the benchmark circuits and use the analytical results of Section 2.1. The bounds on [...]

[...] value is more likely to detect more non-modeled faults and provide high defect coverage. Therefore, TRP using test data compression/decompression not only reduces test data volume and testing time but is also likely to provide increased defect coverage.
Conclusions
We have shown that test data compression can be used for effective test resource partitioning and reduced pin-count testing. The on-chip decompression of test patterns decouples the internal scan chain(s) from the ATE, thereby allowing a higher scan clock frequency. We have presented a rigorous testing time analysis for compression/decompression based on FDR codes. Experimental results for the ISCAS-89 benchmark circuits show that a slower ATE can often be used with no adverse impact on testing time. Therefore, the proposed approach not only decreases test data volume and the amount of data that must be transferred from the ATE, but also reduces testing time and facilitates the use of less expensive ATEs. Finally, an added benefit of the proposed TRP technique is that it is likely to increase defect coverage since it increases the degree of multiple detections for modeled faults.
