We present a novel technique to reduce both test data volume and scan power dissipation using test data compression for system-on-a-chip testing. Power dissipation during test mode using ATPG-compacted test patterns is much higher than during functional mode. We show that Golomb coding of precomputed test sets leads to significant savings in peak and average power, without requiring either a slower scan clock or blocking logic in the scan cells. We also improve upon prior work on Golomb coding by showing that a separate cyclical scan register is not necessary for pattern decompression. Experimental results for the larger ISCAS 89 benchmarks show that reduced test data volume and low-power scan testing can indeed be achieved in all cases.
INTRODUCTION
Pre-designed intellectual property (IP) cores are now commonly used in large system-on-a-chip (SOC) designs. However, IP cores pose several difficult test challenges. Two problems that are becoming increasingly important are power consumption during manufacturing test and test data volume. The precomputed test patterns provided by the core vendor must be applied to each core within the power constraints of the SOC. In addition, test data compression is necessary to overcome the limitations of the automatic test equipment (ATE), e.g. tester data memory and I/O channel capacity.
Power consumption during testing is important since excessive heat dissipation can damage the circuit under test. Since power consumption in test mode is higher than during normal operation, special care must be taken to ensure that the power rating of the SOC is not exceeded during test application [1]. A number of techniques to control power consumption in test mode have been presented in the literature. These include test scheduling algorithms under power constraints [2], low-power built-in self-test (BIST) [3, 4], and techniques for minimizing power during scan testing [5, 6, 7]. Power consumption is especially important for SOCs since test scheduling techniques for system integration attempt to reduce testing time by applying scan/BIST vectors to several cores simultaneously [8, 9]. Therefore, it is extremely important to decrease power consumption while testing the IP cores in an SOC.

This research was supported in part by the National Science Foundation under grant number CCR-9875324.
Test data volume is another problem faced in SOC test integration. One way to alleviate this problem is to use BIST. However, BIST can only be applied to SOCs if the IP cores in them are BIST-ready. Since most currently available IP cores are not BIST-ready, the incorporation of BIST in them requires considerable redesign. Hence test data compression techniques that facilitate low-power scan testing are desirable for SOC testing.
The conflicting goals of low-power scan testing and reduced test data volume appear to be irreconcilable. Test generation for low-power scan testing usually leads to an increase in the number of test vectors [5]. On the other hand, static compaction of scan vectors causes a significant increase in power consumption during testing [7]. The compacted vectors are rendered useless if they exceed power constraints. Clearly, uncompacted vectors cannot be used since they require excessive tester memory. This problem is addressed in a recent paper on power-constrained static compaction of scan vectors [7]. However, while [7] provides a 2-3 times reduction in power consumption for several ISCAS benchmark circuits, it does not lead to any appreciable reduction in test data volume; in fact, it does not provide any improvement over standard static vector compaction techniques.
Recently, a number of data compression techniques have been proposed for reducing SOC test data volume [10, 11]. In this approach, the precomputed test set $T_D$ provided by the core vendor is compressed (encoded) to a much smaller test set $T_E$ and stored in ATE memory. An on-chip decoder is used for pattern decompression to generate $T_D$ from $T_E$ during pattern application. It was shown in [10, 11] that compressing a "difference vector" sequence $T_{diff}$ determined from $T_D$ results in smaller test sets and reduces testing time. An obvious drawback of this approach is that it requires a separate cyclical scan register (CSR).
In this paper, we dispel the notion that scan vector compaction always leads to higher power consumption. Since static compaction invariably leads to higher power, we explore test data compression for overcoming this problem. We show that we can decrease both peak and average power by using Golomb codes for compressing the scan vectors of IP cores. In addition, we show that it is not necessary to use a separate CSR; we can directly encode $T_D$.

Figure 1: An example to illustrate the procedure of deriving fully-specified ordered $T_D$.
COMPRESSION METHOD AND TEST ARCHITECTURE
We first review Golomb coding and its application to test data compression in [11]. If the difference vector $T_{diff}$ is used for compression, the first step is to derive it from $T_D$, where $T_D = \{t_1, t_2, t_3, \ldots, t_n\}$ is the (ordered) precomputed test set. The ordering is determined using a heuristic procedure described in [11]. $T_{diff}$ is defined as follows:

$$T_{diff} = \{d_1, d_2, \ldots, d_n\} = \{t_1,\; t_1 \oplus t_2,\; t_2 \oplus t_3,\; \ldots,\; t_{n-1} \oplus t_n\}$$

where $\oplus$ denotes a bit-wise exclusive-or operation carried out between consecutive patterns $t_i$ and $t_{i+1}$.
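As a small illustration of the difference-vector definition, the following Python sketch computes the sequence from an ordered list of fully specified patterns. It assumes the patterns are equal-length strings of `0` and `1`; the function name `difference_vectors` is hypothetical and not taken from [11].

```python
def difference_vectors(patterns):
    """Return [t1, t1^t2, t2^t3, ..., t(n-1)^tn] for equal-length bit strings."""
    if not patterns:
        return []
    diffs = [patterns[0]]  # d1 is simply t1
    for prev, cur in zip(patterns, patterns[1:]):
        # Bit-wise exclusive-or of two consecutive patterns.
        diffs.append("".join("1" if a != b else "0" for a, b in zip(prev, cur)))
    return diffs


if __name__ == "__main__":
    print(difference_vectors(["0101", "0111", "0011"]))
```

Patterns that differ in few positions produce difference vectors with long runs of 0s, which is the property that run-length oriented codes exploit.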
In this work, however, we encode $T_D$ directly, hence there is no need to generate $T_{diff}$. All the don't-care bits in $T_D$ are mapped to 0s to obtain a fully-specified test sequence.
The next step in the encoding procedure is to select the Golomb code parameter $m$, referred to as the group size. Once $m$ is determined, the runs of 0s in the test data stream are mapped to groups of size $m$ (each group corresponding to a run length). The mapping procedure for obtaining the codewords is described in [11].
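To make the run-length mapping concrete, the following Python sketch encodes a fully specified bit string with group size m. The codeword layout is an assumption based on the usual Golomb construction, not a transcription of the procedure in [11]: m is taken to be a power of two, each run of 0s is terminated by a 1, the prefix consists of (run length div m) 1s followed by a 0, and the tail is (run length mod m) written in log2(m) bits. The names `golomb_encode` and `_codeword` are hypothetical, and the handling of a trailing run of 0s is likewise an assumption.

```python
def _codeword(run, m, tail_bits):
    """Codeword for one run of `run` zeros terminated by a 1 (assumed layout)."""
    prefix = "1" * (run // m) + "0"               # group prefix, ended by a 0
    tail = format(run % m, "0{}b".format(tail_bits))  # position inside the group
    return prefix + tail


def golomb_encode(bits, m=4):
    """Encode a 0/1 string as a concatenation of codewords for its runs of 0s."""
    if m < 2 or m & (m - 1):
        raise ValueError("this sketch assumes m is a power of two >= 2")
    tail_bits = m.bit_length() - 1                # log2(m)
    out = []
    run = 0
    for b in bits:
        if b == "0":
            run += 1
        else:                                     # a 1 closes the current run
            out.append(_codeword(run, m, tail_bits))
            run = 0
    if run:
        # Assumption: a trailing run with no closing 1 is encoded as if a 1
        # followed; the decoder must then truncate to the known length.
        out.append(_codeword(run, m, tail_bits))
    return "".join(out)


if __name__ == "__main__":
    print(golomb_encode("0001000000000011", m=4))
```

With m = 4, a run of three 0s maps to a 3-bit codeword, while a run of nine 0s maps to a 5-bit codeword, so long runs are where the savings come from.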
The problem of determining the best ordering is equivalent to the NP-Complete Traveling Salesman problem. Therefore, a greedy algorithm is used to generate an ordering and the corresponding $T_D$. Suppose a partial ordering $t_1 t_2 \ldots t_i$ has already been determined for the patterns in $T_D$. To determine $t_{i+1}$, we calculate the Hamming distance $HD(t_i, t_j)$ between $t_i$ and all patterns $t_j$ that have not been placed in the ordered list. We define $HD(t_i, t_j)$ as the number of 0s in the pattern $t_j$. We select the pattern $t_j$ for which $HD(t_i, t_j)$ is maximum and add it to the ordered list, denoting it by $t_{i+1}$. In this way, a fully-specified test pattern is obtained and the smallest number of 1s is added to the ordered vector sequence. We continue this process until all test patterns in $T_D$ are placed in the ordered list. Figure 1 illustrates the procedure for obtaining the fully-specified ordered $T_D$.
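The greedy selection described above can be sketched in Python as follows. The sketch takes the stated score literally (the number of 0s in the candidate pattern after don't-cares are mapped to 0). Test cubes are assumed to be strings over `0`, `1` and `X`; the function name `order_test_set` is hypothetical, and since the text does not say how the first pattern is chosen, the same rule is assumed for it.

```python
def order_test_set(cubes):
    """Greedy ordering sketch: map X to 0, then repeatedly append the
    not-yet-placed pattern that contains the most 0s."""
    remaining = [c.upper().replace("X", "0") for c in cubes]
    ordered = []
    while remaining:
        # Score of each candidate = number of 0s it contains; ties go to the
        # candidate that appeared first in the input (assumption).
        best = max(range(len(remaining)), key=lambda k: remaining[k].count("0"))
        ordered.append(remaining.pop(best))
    return ordered


if __name__ == "__main__":
    print(order_test_set(["1X11", "0XX1", "X0X0"]))
```

Because this score does not depend on the previously placed pattern, the loop produces the same result as sorting the mapped patterns by their number of 0s in decreasing order.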
An on-chip decoder decompresses the encoded test set $T_E$ and produces $T_D$. Even though $T_D$ contains more patterns than test sets obtained after static compaction of ATPG vectors, the testing time is reduced since pattern decompression can be carried out on-chip at higher clock frequencies. As discussed in [11], the decoder can be efficiently implemented by a $\log_2 m$-bit counter and a finite-state machine (FSM). The synthesized decode FSM circuit contains only 4 flip-flops and 34 combinational gates. For any circuit whose test set is compressed using $m = 4$, the given logic is the only additional hardware required other than the 2-bit counter. This is especially the case if, unlike in [11], $T_D$ is directly used for encoding and a CSR is not required for decompression.
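The decoding behaviour can be illustrated with a short software model. This is only a behavioural sketch in Python, not the synthesized FSM from [11]. It assumes a codeword layout in which a prefix of 1s is ended by a 0 and followed by a log2(m)-bit tail, with m a power of two and a well-formed input stream. The function name `golomb_decode` and its optional `length` parameter are hypothetical.

```python
def golomb_decode(code, m=4, length=None):
    """Behavioural model of a Golomb decoder (assumed codeword layout)."""
    tail_bits = m.bit_length() - 1     # log2(m), the width of the counter
    out = []
    pos = 0
    while pos < len(code):
        # Prefix: every 1 stands for a full group, i.e. m zeros
        # (in hardware, the counter would count m scan clock cycles).
        while code[pos] == "1":
            out.append("0" * m)
            pos += 1
        pos += 1                       # consume the 0 that ends the prefix
        # Tail: number of remaining zeros before the terminating 1.
        tail = int(code[pos:pos + tail_bits], 2)
        pos += tail_bits
        out.append("0" * tail + "1")
    bits = "".join(out)
    # Optional truncation when the original stream ended in a run of 0s.
    return bits if length is None else bits[:length]


if __name__ == "__main__":
    print(golomb_decode("11001000", m=4))
```

The output rate is variable: a single prefix bit produces m output bits, whereas a tail produces between 1 and m bits. This variability is why the decoder and the tester must be synchronized.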
Since the decoder for Golomb coding needs to communicate with the tester, and both the codewords and the decompressed data can be of variable length, proper synchronization must be ensured through careful design. In particular, the decoder must communicate with the tester to signal the end of a block of variable-length decompressed data. These and other related decompression issues are discussed in detail in [11].
POWER ESTIMATION FOR SCAN VECTORS
In this section, we examine the impact of test set encoding on power consumption during scan testing. We then show how power consumption can be minimized by appropriately assigning binary values to the don't-care bits in $T_D$ and then applying Golomb coding for test data compression.
For a CMOS circuit, power consumption can be classified as either static or dynamic. Static power consumption, which is caused by leakage current, is usually negligible and therefore ignored. Dynamic power is consumed when the outputs of circuit elements make high-to-low and low-to-high transitions. This constitutes the predominant fraction of CMOS power consumption.
For scan vectors, the dynamic power consumption during testing depends on the number of transitions that occur in the scan chain as well as on the number of circuit elements that switch during the scan-in and scan-out operations. It is difficult to estimate the scan-out power directly from the scan vector set since the test responses must be determined from the function of the core under test. Therefore, as in [7], we limit ourselves to the scan-in power only and measure it in terms of the number of transitions in the scan vectors. We also use the weighted transitions metric introduced in [7] to estimate the power consumption due to scan vectors. This models the fact that the scan-in power for a given vector depends not only on the number of transitions in it but also on their relative positions. For example, consider a scan vector $v_1 v_2 v_3 v_4 v_5 = 01000$, where $v_1$ is first loaded into the scan chain. The 0-to-1 transition between $v_1$ and $v_2$ causes more switching activity in the scan chain than the 1-to-0 transition between $v_2$ and $v_3$.
The weighted transitions count metric is also strongly correlated with the switching activity in the internal nodes of the core under test during the scan-in operation. It was shown experimentally in [7] that scan vectors that have a higher weighted transitions metric dissipate more power in the core under test.
Consider a scan chain of length $l$ and a scan vector $t_j = t_{j,1} t_{j,2} \ldots t_{j,l}$, with $t_{j,1}$ scanned in before $t_{j,2}$, and so on. The weighted transitions metric for $t_j$, denoted $WTM_j$, is given by

$$WTM_j = \sum_{i=1}^{l-1} (l - i)\,(t_{j,i} \oplus t_{j,i+1}).$$

If $T_D$ contains $n$ vectors $t_1, t_2, \ldots, t_n$, then the average scan-in power $P_{avg}$ and the peak scan-in power $P_{peak}$ are estimated as follows:

$$P_{avg} = \frac{\sum_{j=1}^{n} \sum_{i=1}^{l-1} (l - i)\,(t_{j,i} \oplus t_{j,i+1})}{n}$$

$$P_{peak} = \max_{j \in \{1, 2, \ldots, n\}} \left\{ \sum_{i=1}^{l-1} (l - i)\,(t_{j,i} \oplus t_{j,i+1}) \right\}$$

If the peak power exceeds a threshold value, it can cause structural damage to the silicon or to the package. Likewise, elevated average power can also cause structural damage to the silicon, bonding wires or the package. It also adds to the thermal load that must be transported away from the device under test.
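The three estimates above translate directly into a few lines of Python. The sketch below assumes each scan vector is a fully specified string of `0` and `1` whose first character is the first bit scanned in; the function names `wtm` and `scan_in_power` are hypothetical.

```python
def wtm(vector):
    """Weighted transitions metric: a transition between positions i and i+1
    (1-based) contributes a weight of l - i."""
    l = len(vector)
    return sum(l - i for i in range(1, l) if vector[i - 1] != vector[i])


def scan_in_power(test_set):
    """Return (average, peak) of the weighted transitions metric over a test set."""
    metrics = [wtm(v) for v in test_set]
    return sum(metrics) / len(metrics), max(metrics)


if __name__ == "__main__":
    print(wtm("01000"))
    print(scan_in_power(["01000", "00001"]))
```

For the vector 01000 with l = 5, the transition between the first two bits contributes 4 and the transition between the second and third bits contributes 3, giving a metric of 7. The same single transition placed at the end of the vector, as in 00001, contributes only 1.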
We next show how Golomb codes can be used to minimize the volume of test data and, at the same time, minimize $P_{avg}$ and $P_{peak}$. Scan-in power is influenced by the manner in which the don't-cares in $T_D$ are mapped to binary values. While $P_{avg}$ and $P_{peak}$ can be minimized by choosing an appropriate mapping, such a mapping is not guaranteed to provide high test data compression. In fact, our experiments show that the encoded test sets in such cases are often larger than the uncompacted test sets. Instead, it is far more efficient to simply map all the don't-cares in $T_D$ to 0s as shown in Figure 1. While this approach does not minimize $P_{avg}$ and $P_{peak}$, it provides significant reductions in power consumption and, at the same time, decreases the test data volume considerably. The fully-specified test set thus obtained is then compressed using Golomb codes.
For example, Table 1 shows two partially-specified scan vectors $t_i$ = 01XXX1XXXX01 and $t_j$ = 01X1010XXXX1 with scan chain length $l = 12$, where X denotes a don't-care bit. If the don't-cares are mapped to binary values to minimize the weighted transitions metric, then $d$XXXX$d'$, $d \in \{0, 1\}$, must be mapped to $ddddd\,d'$. Similarly, $d$XXXX must be mapped to $ddddd$. This ensures that the few unavoidable transitions occur "late" during scan-in. Table 1 shows the values of $WTM_i$ and $WTM_j$ and the Golomb codes for the corresponding fully-specified vectors ($m = 4$). The weighted transitions metric is clearly higher if the don't-cares are always mapped to 0. However, Golomb coding is much more effective in reducing test data volume if this strategy is used.

Next we present a theorem that characterizes the maximum WTM for a given test length $n$, scan chain length $l$, and number of 1s ($r$) in the test set. The proof is omitted for conciseness. The theorem yields the maximum value for the average power $P_{avg}$, and it can be used to predict average power using limited information about $T_D$. The maximum value for the peak power is easier to derive: it simply equals $l(l + 1)/2$ as long as $r \ge l/2$.

Theorem 1. For a given test length $n$, scan chain length $l$, and number of 1s $r$ in the test set, an upper bound on the average power is given by

$$P_{avg} \le \frac{lr}{n} - \frac{r^2}{n^2} + \frac{r}{2n}\left(\frac{r}{n} + 1\right).$$

Figure 2: Experimental results on test data compression using Golomb codes.
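The two don't-care mappings discussed in the example of this section can be compared with a small Python sketch. It assumes test cubes are strings over `0`, `1` and `X`, and it uses the weighted transitions metric as defined in this section. The function names `fill_zero` and `fill_min_wtm` are hypothetical, and the treatment of leading don't-cares (copying the first specified bit) is an assumption. The values it computes follow from that formula and are not copied from Table 1.

```python
def wtm(vector):
    """Weighted transitions metric with weight l - i for a transition
    between positions i and i+1 (1-based)."""
    l = len(vector)
    return sum(l - i for i in range(1, l) if vector[i - 1] != vector[i])


def fill_zero(cube):
    """Map every don't-care to 0."""
    return cube.replace("X", "0")


def fill_min_wtm(cube):
    """Copy the nearest specified bit on the left into each don't-care, so
    that unavoidable transitions occur as late as possible."""
    # Leading don't-cares take the first specified bit (assumption);
    # an all-X cube becomes all 0s.
    last = next((c for c in cube if c != "X"), "0")
    out = []
    for c in cube:
        if c != "X":
            last = c
        out.append(last)
    return "".join(out)


if __name__ == "__main__":
    for cube in ("01XXX1XXXX01", "01X1010XXXX1"):
        low, zero = fill_min_wtm(cube), fill_zero(cube)
        print(cube, low, wtm(low), zero, wtm(zero))
```

Under these assumptions, mapping don't-cares to 0 gives a higher metric for both vectors than the transition-minimizing fill, but it also creates the long runs of 0s that Golomb coding compresses well.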
EXPERIMENTAL RESULTS
In this section, we evaluate the effect of Golomb coding of $T_D$ on test data volume and power consumption during scan testing for the ISCAS 89 benchmark circuits. The experiments were conducted on a Sun Ultra 10 workstation with a 333 MHz processor and 256 MB of memory. We only considered the large full-scan circuits with a single scan chain each. The test vectors for these circuits were reordered to increase compression. Figure 2 presents the experimental results for test cubes $T_D$ obtained from the Mintest ATPG program with dynamic compaction. In order to compare with [11], we also present compressed results obtained using the difference vector sequences $T_{diff}$ ($T_{E1}$) for the same test sets. Figure 2 shows the sizes of $T_D$, the size of the smallest encoded test set obtained after static compaction using Mintest, the size of $T_{E1}$, and the size of the compressed test set obtained using $T_D$ with all Xs mapped to 0 ($T_{E2}$).
As is evident from Figure 2, $T_D$ yields better compression than $T_{diff}$ in four out of the six cases. For these circuits, we achieve better compression without requiring a separate CSR. Therefore, there is a significant reduction in hardware overhead as compared to [11]. The results also show that ATPG compaction may not always be necessary for saving memory and reducing testing time. In five out of the six cases, the size of the encoded test set is less than the smallest ATPG-compacted test sets known for these circuits. This comparison is essential in order to show that storing $T_E$ in ATE memory is more efficient than simply applying static compaction to test cubes and storing the resulting compact test sets. On average, the size of $T_E$ is 36.26% less than that of the compacted test sets obtained using Mintest.
We next present results on the peak and average power consumption during the scan-in operation. These results show that test data compression can also lead to significant savings in power consumption. As described in Section 3, we estimate power using the weighted transitions metric. Let $P^C_{peak}$ ($P^C_{avg}$) be the peak (average) power with compacted test sets obtained using Mintest. Similarly, let $P^G_{peak}$ ($P^G_{avg}$) be the peak (average) power when Golomb coding is used by mapping the don't-cares in $T_D$ to 0s. Figure 3 compares the average and peak power consumption for Mintest test sets with $T_D$ when Golomb coding is used. The percentage reduction in power was computed as follows:

$$\text{Peak power reduction} = \frac{P^C_{peak} - P^G_{peak}}{P^C_{peak}} \times 100, \qquad \text{Average power reduction} = \frac{P^C_{avg} - P^G_{avg}}{P^C_{avg}} \times 100.$$

Figure 3 shows that the peak power and average power are significantly less if Golomb coding is used for test data compression and the decompressed patterns are applied during testing. On average, the peak (average) power is 28.98% (75.89%) less in this case than for the Mintest test sets. Thus our results demonstrate that the substantial reduction in test data volume is also accompanied by a significant reduction in power consumption during scan testing.

Column headings: Circuit; peak power, peak power reduction, average power, and average power reduction, each for uncompacted test sets with don't-cares mapped to 0s and for uncompacted test sets with don't-cares mapped to minimize WTM.

Figure 3: Experimental results on peak and average power consumption.
Next, we justify the strategy of mapping all don't-cares in $T_D$ to 0s before Golomb coding. As discussed in Section 3, the power consumption can be minimized if the don't-cares are assigned to binary values to minimize the weighted transitions metric. Unfortunately, this strategy does not lead to any significant decrease in the test data volume; in fact, we found that in many cases, the encoded test set was larger than the original test set. We therefore carried out a set of experiments to demonstrate that if all don't-cares are mapped to 0s, the test data volume decreases substantially (Table 2) and, at the same time, power savings are significant.
Our experimental results for the larger ISCAS 89 circuits are shown in Table 2. We note that while the average power consumption is greater compared to the "optimal" mapping of don't-cares, it is still significantly less than the power for ATPG-compacted test sets. In some cases, the difference is as low as 4%, while on average, the average power consumption increases by only 8%. Likewise, the difference in peak power consumption is only 9% on average. Nevertheless, compared to Mintest, we achieve 51% test data compression on average with 76% reduction in average power consumption for scan testing. This provides a strong justification for the proposed test data compression approach.
