Introduction
System-on-chip (SoC) design strategies significantly reduce the time-to-market and design cost of new products. However, testing an SoC is a crucial and time-consuming problem because of increasing design complexity [1]. Therefore, the goal is to develop techniques for wrapper design, test access mechanism (TAM) design and test scheduling that minimize test application time under given constraints such as the number of test pins and power consumption. A number of approaches have addressed wrapper design [2, 3, 4] in compliance with IEEE 1500 [5]. Similarly, several TAM architectures have been proposed, such as TestBus [6, 7], TESTRAIL [8] and transparency-based TAM [9, 10, 11]. Moreover, many approaches to the test scheduling problem have been proposed [12, 13, 14, 15, 16, 17].
However, these approaches are applicable only to single-clock domain SoCs, whose embedded cores all work at the same clock frequency during test. Today's SoC designs in telecommunications, networking and digital signal processing applications consist of embedded cores working at different clock frequencies. The test clock frequency of some embedded cores is limited by their scan chain frequencies. On the other hand, other cores may be testable at-speed in order to increase the coverage of non-modeled and performance-related defects. The at-speed testable cores might be non-scan sequential circuits that require functional vectors or ordered test sequences at the rated clock frequency. Moreover, there is also a frequency gap between each embedded core and the ATE used to test the SoC. From these facts, we conclude that the previous approaches have the following two problems: 1) when the test clock frequency of a core is higher than that of the ATE, the ATE cannot provide test sequences at the test clock frequency of the core, and 2) when the test clock frequency of a core is lower than that of the ATE, testing the core by lowering the frequency of the ATE does not use the ATE capability effectively. Therefore, it is necessary to develop a technique that solves these problems for multi-clock domain SoCs.
Recently, a virtual TAM based on bandwidth matching [18] was proposed in [19] to increase ATE capability when the clock frequency of a core is lower than that of the ATE. Xu et al. extended the virtual TAM technique to multi-frequency TAM design to reduce the test time for single-clock domain SoCs [22]. Moreover, a wrapper design for cores with multiple clock domains was proposed in [20, 21] to achieve at-speed testing of the cores by using the virtual TAM technique. However, these works do not address the test scheduling problem for multi-clock domain SoCs.
To the best of our knowledge, this paper gives the first discussion and formulation of the test scheduling problem for multi-clock domain SoCs. We present a wrapper and TAM design for multi-clock domain SoCs and propose a test scheduling algorithm that minimizes test time under a power constraint. In the proposed method, we use a virtual TAM for each core to bridge the frequency gap between the core and a given ATE, whereas the approach in [22] uses a virtual TAM for each test bus (i.e., all the cores assigned to the same test bus must be tested at the same frequency). Therefore, the proposed method offers more flexibility for test scheduling. Moreover, we also use the virtual TAM to reduce the power consumption of the cores during test, which makes the proposed method effective for power-constrained test scheduling. Experimental results show the effectiveness of our method not only for multi-clock domain SoCs, but also for single-clock domain SoCs with power constraints.
The rest of this paper is organized as follows. We discuss multi-clock domain SoCs in Section 2. Section 3 shows a power-conscious virtual TAM technique. After formulating
We consider that an SoC consists of the maximum allowed power consumption and cores working at different test frequencies. However, we assume that each core has been designed with a single clock domain during test. For each core, a maximum test frequency and the power consumption at that maximum frequency are given. Each core also carries information about its at-speed testing requirement. $atspeed(c_i) = yes$ means that $c_i$ must operate at $freq(c_i)$ during test (i.e., we cannot change the test frequency of $c_i$ for test scheduling). $atspeed(c_i) = no$ means that $c_i$ can be tested at frequencies lower than $freq(c_i)$ (i.e., we can decrease the test frequency of $c_i$ for test scheduling). Moreover, each core has a wrapper list that consists of the possible wrapper designs for the core. Each wrapper de-
Virtual TAM for Power Minimization
The frequency gaps between the ATE and the cores can be resolved by using virtual TAM techniques based on bandwidth matching. When $freq(c_i)$ (the clock frequency of core $c_i$ during test) is higher than $f_{ATE}$ (the clock frequency of the ATE) (Fig. 2(a)), we insert a TDM (test data multiplexing) circuit between the ATE outputs and the core inputs, and multiplex $(freq(c_i)/f_{ATE}) \cdot m$ TAM wires at $f_{ATE}$ into $m$ virtual TAM wires at $freq(c_i)$. On the other hand, when $freq(c_i)$ is lower than $f_{ATE}$ (Fig. 2(b)), we insert a TDdeM (test data de-multiplexing) circuit between the ATE outputs and the core inputs, and de-multiplex $n$ TAM wires at
To observe test responses, we need to insert TDM/TDdeM circuits between the core outputs and the ATE inputs in a similar fashion.
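The bandwidth-matching rule can be illustrated with a minimal sketch. The function names are hypothetical, and rounding to whole wires when the frequency ratio is not an integer is an assumption made here, not something stated in the text.

```python
def ate_wires_needed(core_freq_mhz, ate_freq_mhz, virtual_wires):
    """ATE-side TAM wires needed to feed `virtual_wires` at the core frequency.

    Bandwidth matching: ate_wires * f_ATE >= virtual_wires * freq(c_i).
    Rounding up to a whole wire is an assumption of this sketch.
    """
    demand = core_freq_mhz * virtual_wires
    return -(-demand // ate_freq_mhz)  # integer ceiling division


def virtual_wires_available(core_freq_mhz, ate_freq_mhz, ate_wires):
    """Virtual TAM wires at the core frequency offered by `ate_wires` ATE wires."""
    return (ate_wires * ate_freq_mhz) // core_freq_mhz


# TDM case: a 100 MHz core fed by a 50 MHz ATE needs twice as many ATE wires.
tdm_wires = ate_wires_needed(100, 50, 8)
# TDdeM case: 10 ATE wires at 50 MHz offer 20 virtual wires to a 25 MHz core.
tddem_wires = virtual_wires_available(25, 50, 10)
```

The first call corresponds to the TDM case (core faster than ATE) and the second to the TDdeM case (core slower than ATE).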
In this paper, we also utilize the virtual TAM technique to reduce the power consumption of a core while maintaining its test time. The dynamic power $P(k)$ (the dominant source of power consumption in CMOS circuits) consumed in the circuit on application of two consecutive test vectors $(V_{k-1}, V_k)$ is as follows [23].
Here, $f$ is the clock frequency, $V_{DD}$ is the power supply voltage, $C_i$ is the output capacitance at node $i$ and $S_i(k)$ is the number of switchings provoked by $V_k$ at node $i$. From equation (1), we observe that the power consumption of a core during test can be reduced by lowering its test frequency. However, this increases the test time of the core in proportion to the power reduction ratio. To compensate, we insert a TDdeM circuit between the ATE outputs and the core inputs. More virtual TAM wires then become available for the core, and the test time can be reduced. In the best case, we can achieve the same test time with a 50% reduction of power consumption for a core by using this power-conscious virtual TAM technique. For example, consider the wrapper design for core 7 in d695 from the ITC'02 SoC benchmarks [24]. Table 1 shows that we can achieve a 50% power reduction with a 1.4% test time overhead by decreasing the frequency from 50 MHz to 25 MHz and increasing the number of virtual TAM wires from 10 to 20.
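The frequency dependence can be checked numerically with a small sketch. It assumes that equation (1) has the usual CMOS switching form, $P(k) = \frac{1}{2} f V_{DD}^2 \sum_i C_i S_i(k)$; the constant factor is an assumption and does not affect the ratio. The capacitance and switching values below are hypothetical.

```python
def dynamic_power(freq_hz, vdd, capacitances, switchings):
    """Assumed form of equation (1): P = 1/2 * f * VDD^2 * sum_i(C_i * S_i)."""
    switched_capacitance = sum(c * s for c, s in zip(capacitances, switchings))
    return 0.5 * freq_hz * vdd ** 2 * switched_capacitance


# Hypothetical node data; only the frequency changes between the two runs.
node_caps = [1.0e-15, 2.0e-15, 1.5e-15]  # farads
node_switchings = [1, 0, 2]              # transitions caused by vector V_k

power_50mhz = dynamic_power(50e6, 1.2, node_caps, node_switchings)
power_25mhz = dynamic_power(25e6, 1.2, node_caps, node_switchings)
power_ratio = power_25mhz / power_50mhz  # halving f halves the dynamic power
```

Because power is linear in $f$, halving the test frequency halves the power, and doubling the number of virtual TAM wires is what keeps the test time close to the original.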
Problem Formulation
We formulate the power-constrained test scheduling problem for multi-clock domain SoCs, $P_{mcds}$, that we address in this paper as follows.
Scheduling Algorithm
This section presents a heuristic algorithm for $P_{mcds}$ that consists of the following three stages: 1) testability analysis, 2) test scheduling at time 0 for cores with a large amount of test data, and 3) test scheduling based on the Best Fit Decreasing (BFD) heuristic [25] for the remaining cores. An example of a generated test schedule is shown in Figure 3. The shaded cores and the unshaded cores in Figure 3 are scheduled in stage 2 and stage 3, respectively. The following subsections describe the details of each stage.
Testability Analysis (Stage 1)
If MCDS cannot satisfy the following two conditions for the given parameters $f_{ATE}$ and $W_{max}$, then there is no solution for $P_{mcds}$. For each $c_i \in C$ such that $atspeed(c_i) = yes$,
For a core $c_i$ such that $atspeed(c_i) = yes$, we cannot change the test frequency $freq(c_i)$ or the power consumption $power(c_i)$ during test. Therefore, a core that cannot satisfy equation (2) exceeds the given power limit even if it is tested alone. Moreover, as explained before, the TDM/TDdeM circuits are uniquely determined when $f_{ATE}$, a wrapper design $r_{ij}$ and a test frequency for $c_i$ are given. Therefore, a core that does not satisfy equation (3) cannot be assigned enough wrapper pins to achieve at-speed test at $freq(c_i)$.
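The two checks can be sketched as follows. The exact forms of equations (2) and (3) are assumed from the explanation above: an at-speed core must fit under $P_{max}$ on its own, and at least one of its wrapper designs must fit within $W_{max}$ ATE pins after bandwidth matching. The `Core` record, its field names and the power values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Core:
    name: str
    freq_mhz: int
    power: int
    atspeed: bool
    wrapper_pins: list  # pin counts of the available wrapper designs


def ate_pins_for(core_freq_mhz, f_ate_mhz, wrapper_pins):
    """ATE pins needed to drive a wrapper at the core frequency (rounded up)."""
    return -(-wrapper_pins * core_freq_mhz // f_ate_mhz)


def is_testable(cores, f_ate_mhz, w_max, p_max):
    """Assumed reading of conditions (2) and (3) for at-speed cores."""
    for core in cores:
        if not core.atspeed:
            continue  # frequency and wrapper can still be adapted later
        if core.power > p_max:  # assumed condition (2)
            return False
        fewest_pins = min(ate_pins_for(core.freq_mhz, f_ate_mhz, w)
                          for w in core.wrapper_pins)
        if fewest_pins > w_max:  # assumed condition (3)
            return False
    return True


# Hypothetical at-speed core: 64 wrapper pins at 100 MHz, power 1000 units.
example_cores = [Core("core_a", 100, 1000, True, [64])]
```

For this hypothetical core, a 200 MHz ATE with 32 pins passes the pin check, while a 100 MHz ATE with 32 pins does not.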
Test scheduling at time 0 with minimum test frequency (Stage 2)
This stage consists of the following three steps.
Step exceed $W_{max}$, and 3) $T^{c_i}_{LB}$ is less than $T_{LB}/|C|$. Here, $|C|$ denotes the number of cores in the SoC. The third condition prevents us from scheduling cores with a small amount of test data at time 0. Instead of scheduling such small cores at time 0, Step 3 re-designs wrappers and re-calculates test frequencies for the cores scheduled in this step to reduce the overall test time of the SoC. Figure 4 shows the current test schedule generated after Step 2. In Figure 4(a), the horizontal axis denotes the test time, and the vertical axis denotes the power consumption at each time. In Figure 4(b), the horizontal axis denotes the test time, and the vertical axis denotes the number of test pins used at each time.
Step 3: re-calculate test frequencies and re-design wrappers for cores scheduled at time 0
There exists a case where $P_0$ (the power consumption at time 0) does not reach $P_{max}$ after Step 2 (Fig. 4(a)), since Step 2 stops under the above three conditions. In this case, we find a core $c_i$ that satisfies all the following conditions. according to equation (7). The fourth condition (equation (11)) prevents one core from dominating the power consumption, and helps us to increase the test concurrency at time 0. This process repeats as long as 1) $P_0$ does not exceed $P_{max}$ and 2) there exists a core that satisfies the above conditions. Figure 5(a) shows the result of applying this process to the schedule generated after Step 2 (Figure 4). In this figure, the frequencies for cores 2, 3, 4 and 6 are increased. Consequently, the test times for these cores are reduced.
Figure 6. An example of test scheduling for core 5.
Similarly, there exists a case where $W_0$ (the pin usage at time 0) does not reach $W_{max}$ after Step 2. In this case, we find the core $c_i$ with the maximum test time and assign one more test pin to $c_i$. This process repeats as long as $W_0$ does not exceed $W_{max}$. Figure 5(b) shows the result of applying this process to the schedule in Figure 5(a).
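The pin re-assignment loop can be sketched as below. The test time model (data volume divided by pin count) is a hypothetical simplification, since the test time of a real wrapper is a staircase function of its pin count; the function and argument names are also hypothetical.

```python
def distribute_spare_pins(pins_at_time0, data_volume, w_max):
    """Give spare pins, one at a time, to the core with the longest test time.

    pins_at_time0: dict core -> pins assigned after Step 2
    data_volume:   dict core -> test data volume (hypothetical time model)
    """
    pins = dict(pins_at_time0)  # work on a copy

    def test_time(core):
        # Hypothetical model: time shrinks linearly with the pin count.
        return data_volume[core] / pins[core]

    while sum(pins.values()) < w_max:
        longest = max(pins, key=test_time)
        pins[longest] += 1
    return pins


widened = distribute_spare_pins({"a": 2, "b": 2}, {"a": 100, "b": 40}, w_max=6)
```

In this small instance both spare pins go to core `a`, because it remains the longest test after receiving the first one.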
Test scheduling for remaining cores based on BFD (Stage 3)
In this stage, we determine a test schedule for the remaining cores based on the BFD heuristic. First, we pick a core $c_i$ in descending order of $T^{c_i}_{LB}$. Then, we find the best start time, wrapper design and test frequency for $c_i$ such that the total test time of the given SoC is minimized as follows. This process repeats until all the remaining cores are scheduled in descending order of $T^{c_i}_{LB}$. Through the above processes, we generate the final test schedule.
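A simplified sketch of this stage follows. It is not the full procedure: each remaining core is treated as a fixed rectangle (test time, pins, power), so the search over wrapper designs and test frequencies is omitted, and "best" is reduced to the earliest feasible start time. The dictionary keys and the `lb` field standing in for $T^{c_i}_{LB}$ are hypothetical.

```python
def fits(schedule, start, core, w_max, p_max):
    """True if `core` can run in [start, start + time) within both limits."""
    end = start + core["time"]
    # Usage only rises when a scheduled test starts, so checking `start` and
    # every scheduled start inside the interval covers the whole interval.
    checkpoints = [start] + [s for s, _ in schedule if start < s < end]
    for t in checkpoints:
        active = [c for s, c in schedule if s <= t < s + c["time"]]
        if sum(c["pins"] for c in active) + core["pins"] > w_max:
            return False
        if sum(c["power"] for c in active) + core["power"] > p_max:
            return False
    return True


def schedule_remaining(schedule, remaining, w_max, p_max):
    """Place cores in descending order of their lower bound `lb`."""
    schedule = list(schedule)
    for core in sorted(remaining, key=lambda c: c["lb"], reverse=True):
        # Candidate starts: time 0 and the end time of every scheduled test.
        candidates = sorted({0} | {s + c["time"] for s, c in schedule})
        start = next((t for t in candidates
                      if fits(schedule, t, core, w_max, p_max)), None)
        if start is None:
            raise ValueError(f"{core['name']} exceeds the limits on its own")
        schedule.append((start, core))
    return schedule


stage2 = [(0, {"name": "A", "time": 10, "pins": 4, "power": 5}),
          (0, {"name": "B", "time": 4, "pins": 4, "power": 4})]
rest = [{"name": "D", "lb": 3, "time": 3, "pins": 2, "power": 1},
        {"name": "C", "lb": 6, "time": 6, "pins": 4, "power": 5}]
final = schedule_remaining(stage2, rest, w_max=8, p_max=10)
starts = {core["name"]: start for start, core in final}
```

Here core `C` is placed first and starts when `B` finishes, and core `D` has to wait until `A` and `C` release their pins.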
Experimental Results
In Section 6.1, we show experimental results for a multi-clock domain SoC with a power constraint. Section 6.2 presents experimental results for single-clock domain SoCs with power constraints ("d695" and "h953" from the ITC'02 SoC benchmarks [24]) in order to show the effectiveness of our approach compared to previous works. All the experimental results were obtained within 0.1 sec. on a SunBlade 2000 workstation (1.05 GHz with 8 GB RAM).
Results for a multi-clock domain SoC
Since no existing approach has tackled the test scheduling problem for multi-clock domain SoCs, it is difficult to compare with previous works. We therefore analyze the trade-offs of the proposed method in terms of the number of available test pins, the clock frequency of the ATE, the maximum allowed power consumption and the test time for a hypothetical multi-clock domain SoC. Table 2 shows the multi-clock domain SoC MCDS 1 used in this experiment. This SoC consists of 14 cores. The first 10 cores are from "d695" in the ITC'02 SoC benchmarks. "flexible (≥ 2)" in column "wrapper list" denotes that we can design any wrapper (a wrapper with any number of test pins) by the procedure proposed in [2, 3]. We use the same power consumption as in [15], and assume that $freq(c_i)$ = 50 MHz and $atspeed(c_i) = no$ for these 10 cores. The wrappers for core 11 and core 12 are already designed (i.e., 64 pins and 32 pins, respectively). We assume that these two cores are tested at higher frequencies than the other cores, and that $atspeed(c_i) = yes$. Core 13 and core 14 are copies of core 7 and core 5, respectively. However, we assume that these two cores are tested at lower frequencies than the other cores.
For this SoC, Table 3 shows test time results when $f_{ATE}$ = 200 MHz, 100 MHz and 50 MHz. In this table, the test time results are shown in µsec., and "untestable" denotes that there exists no solution for the given parameters. In this SoC, since core 11 should be tested at 100 MHz with 64 pins, we observe that there exists no solution for three cases: 1) $f_{ATE}$ = 100 MHz and $W_{max}$ = 32, 2) $f_{ATE}$ = 50 MHz and $W_{max}$ = 32, and 3) $f_{ATE}$ = 50 MHz and $W_{max}$ = 64. We also observe that the test time depends on the product of $f_{ATE}$ and $W_{max}$. Therefore, when we use a high-speed ATE, we can test SoCs with a small number of test pins. On the other hand, even when we use a low-speed ATE, we can achieve the same test time by using more test pins. From these results, the designer can decide the number of test pins and the speed of the test pins considering their total cost.
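The three untestable cases follow from comparing the ATE bandwidth $f_{ATE} \cdot W_{max}$ with the bandwidth that core 11 requires, as the following sketch shows. The grid is restricted to the frequencies and pin counts mentioned in the text, and treating the product as a bandwidth measure in MHz·pins is an interpretation made for this illustration.

```python
def ate_bandwidth(f_ate_mhz, w_max):
    """ATE bandwidth in MHz * pins."""
    return f_ate_mhz * w_max


# Core 11 must be tested at 100 MHz with 64 wrapper pins.
CORE11_DEMAND = 100 * 64

untestable = [(f_ate, w_max)
              for f_ate in (200, 100, 50)
              for w_max in (32, 64)
              if ate_bandwidth(f_ate, w_max) < CORE11_DEMAND]

# Two configurations with the same product offer the same bandwidth.
same_bandwidth = ate_bandwidth(200, 32) == ate_bandwidth(100, 64)
```

The list contains exactly the three cases reported as untestable, and the last line illustrates why a 200 MHz ATE with 32 pins can match a 100 MHz ATE with 64 pins.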
Comparison with other approaches
The proposed test data de-multiplexing technique is effective for the power-constrained test scheduling of single-clock domain SoCs as well as multi-clock domain SoCs. In order to show the effectiveness of our approach compared to previous works, we present experimental results for single-clock domain SoCs with power constraints. We use "d695" and "h953" from the ITC'02 SoC benchmarks [24] as the single-clock domain SoCs by assuming that $f_{ATE}$ = 50 MHz, and $freq(c_i)$ = 50 MHz and $atspeed(c_i) = no$ for all cores $c_i \in C$. We chose them because only these two SoCs have power information in the benchmarks (for "d695", we use the same power consumption as in [15]). Table 4 shows the test time results of the proposed method and the previous power-constrained approaches [15, 16], which are applicable only to single-clock domain SoCs. In this table, test time results are shown as the number of clock cycles. "NA" denotes that the approach is not applicable for the constraint. "-" denotes that no result is reported for the constraint in the approach. For d695, we observe that the proposed approach achieves a 6.9% reduction in average test time compared to [15]. Moreover, for h953, we observe that the proposed approach achieves the lower bound (119357) on the SoC test time [14] under all power constraints. From these results, we conclude that the proposed power-conscious virtual TAM technique and test scheduling algorithm are also effective for single-clock domain SoCs.
Conclusions
This paper has presented a power-conscious wrapper and TAM design for multi-clock domain SoCs, and proposed a test scheduling algorithm that minimizes test time under a power constraint. To the best of our knowledge, the test scheduling problem for multi-clock domain SoCs has been addressed and formulated for the first time in this paper. Moreover, we have presented a technique that reduces the power consumption of a core during test while maintaining its test time by utilizing the virtual TAM technique, which is applicable to both single- and multi-clock domain SoCs.
