Many system-on-chip (SoC) integrated circuits contain embedded cores with different scan frequencies. To better meet the test requirements for such heterogeneous SoCs, leading tester companies have recently introduced port-scalable testers, which can simultaneously drive groups of channels at different data rates. However, the number of tester channels available for scan testing is limited; therefore, a higher shift frequency can increase the test time for a core if the resulting test access architecture reduces the bit-width used to access it. We present a scalable test planning technique that exploits port scalability of testers to reduce SoC test time. We compare the proposed heuristic optimization method to two baseline methods based on prior works that use a single scan data rate for all embedded cores. We also propose a power-aware test planning technique to effectively utilize port-scalable testers under constraints of test power consumption. Experimental results are presented for power-aware test scheduling to illustrate the impact of power constraints on overall test time.
INTRODUCTION
Recent advances in technology have led to a tremendous increase in the complexity of system-on-chip (SoC) integrated circuits. Today's heterogeneous SoCs consist of embedded cores that not only operate in multiple clock domains [Goel et al. 2004; Lin and Thompson 2003; Schmid and Knablein 1999] , but also (due to differences in performance levels, design styles, and scan insertion methods) differ in their maximum scan clock frequencies [Vranken et al. 2003 ]. The difference in scan clock frequencies between embedded cores can also arise due to the integration of various cores derived from different, older-generation SoCs into a single, current-generation SoC [Chickermane et al. 2001] . The test time for such SoCs can be reduced by testing the embedded cores at data-transfer rates that match their maximum scan frequencies.
To test embedded cores at different scan frequencies, the test data needs to be simultaneously transported at multiple data rates on the tester channels. In order to meet this requirement, automatic test equipment (ATE) vendors have introduced a new class of testers that can simultaneously drive tester channels at different data rates [Dorsch et al. 2002] . Examples of such ATEs include the Agilent 93000 series [Agilent Technologies 2002; Agilent 93000 2008] and the Tiger system from Teradyne [Teradyne Technologies 2008] , which are based on port scalability and test processor-per-pin architecture. Port scalability allows every port of a tester to be configured at a desired data rate, where each port typically consists of multiple channels.
Modular testing of embedded cores offers a promising test solution for SoCs [Goel et al. 2004; Zorian et al. 1999] . It also lends itself well to the scenario of SoC testing using multiple scan data rates. It involves the isolation of an embedded core from surrounding logic using a test wrapper, and the design of a test access mechanism (TAM) to deliver test data from the I/O pins of the SoC. In many test access architectures, the SoC-level TAM wires are divided into fixedwidth TAM partitions [Goel and Marinissen 2002; Huang et al. 2002; Iyengar et al. 2003b; Larsson and Peng 2002] . In the scenario being considered here, all wires of a TAM partition belong to the same ATE port, which can be configured for a predefined scan data rate. TAM optimization methods published in the literature do not handle the general problem of designing TAM architectures that are driven by port-scalable ATEs.
The problem of designing a TAM architecture to minimize the SoC test time has been shown in the literature to be NP-hard. Therefore, efficient heuristic techniques have been developed for SoC test planning and TAM optimization [Goel and Marinissen 2002; Gonciari et al. 2003; Huang et al. 2002; Iyengar et al. 2003b; Larsson and Peng 2002; Zhao and Upadhyaya 2003; Yoneda et al. 2007; Yu et al. 2007; Xu and Nicolici 2005]. However, all these methods assume that, at any instant in time, the ATE provides test stimuli to the SoC at a single data rate. As a result, existing optimization techniques cannot readily exploit the availability of simultaneous multiple data-transfer rates from the ATE to the SoC. In this work, we focus on the problem of designing an optimized TAM architecture that can benefit from the availability of port scalability in ATEs. As in Goel and Marinissen [2002] , Iyengar et al. The testing of embedded cores with multiple scan data rates was recently addressed in Xu and Nicolici [2004a, 2004b], Sehgal et al. [2004], and Sehgal and Chakrabarty [2007]. The key idea in Xu and Nicolici [2004a] is to use bandwidth matching and to determine appropriate scan frequencies for the TAM partitions to reduce test time; however, no limits are set on the maximum scan frequencies of the cores. In Sehgal and Chakrabarty [2007], the number of data rates for the available ATE channels is set to two. Both assumptions are too restrictive in practice. In this article, we consider a more general scenario in which cores with different scan frequencies can be driven by ATE channels operating in a data-rate range given by the range of scan frequencies for the cores.
We optimize the TAM architecture for a set of cores with different maximum scan data rates. We compare this approach to a baseline case in which all cores are tested at their maximum scan frequencies. We use an iterative descent procedure that minimizes the testing time by jointly optimizing the widths of TAM partitions, TAM frequencies, and assignment of cores to TAM partitions. We also present a solution to this problem based on integer linear programming (ILP). Although ILP yields optimal results, it is computationally infeasible for large problem instances. Nevertheless, the ILP model can be used to evaluate the heuristic for small problem instances. We derive a lower bound on the SoC testing time and list these bounds for several ITC'02 SoC test benchmarks . We also present experimental results for several ITC'02 benchmark SoCs.
While the testing of multiple cores in parallel in a core-based SoC results in test schedules with low test times, the concurrent testing of these cores results in increased power consumption during test application. The permissible power envelope is often exceeded when power constraints are not considered during test scheduling [Chou et al. 1997] ; this can lead to thermal runaway, or cause severe irreparable damage to the SoC. Testing an SoC can lead to extremely high switching activity, more than when the circuit is in its functional mode [Larsson and Peng 2001; Zhao and Upadhyaya 2003] . It is therefore important to consider power constraints while designing test schedules for embedded cores with multiple scan data rates. We therefore formulate a TAM design and test scheduling problem for cores with different scan frequencies, and we extend the problem formulation to include constraints that are placed on the test power.
The rest of this article is organized as follows. In Section 2, we define the test planning problem that exploits the availability of port-scalable ATEs. We develop an integer linear programming model for this problem and derive a lower bound on the SoC testing time. In Section 3, we present a scalable heuristic approach to solve this problem. Experimental results for several ITC'02 benchmark SoCs are also presented. Section 4 describes a power-aware heuristic scheduling method to solve the test planning problem. Experimental results for the ITC'02 SoC test benchmarks are presented to illustrate the impact of power constraints on the test planning problem. Finally, we present conclusions and directions for future work in Section 5.
53:4 • A. Sehgal et al.
TAM ARCHITECTURE OPTIMIZATION
In this section, we formulate the TAM optimization problem when port-scalable testers are used. We develop an ILP model to solve this problem and derive a geometric lower bound on the test time.
Problem $P_{\text{port-scalable}}$. Given the test-data parameters for $N$ embedded cores in an SoC, the maximum scan frequency $f_i$ for each core $i$ ($1 \le i \le N$), and the SoC-level TAM width $W$, determine: (i) the number of TAM partitions $B$; (ii) for each TAM partition $j$, the width $w_j$ and the scan frequency $f_j$, $1 \le j \le B$; and (iii) the assignment of cores to TAM partitions. The assignment of cores must be such that: (a) the frequency of each TAM partition does not exceed the maximum scan frequency of any core assigned to it; (b) the sum of the widths of the TAM partitions does not exceed the total TAM width $W$; and (c) the overall test time of the SoC is minimized.
The test-set parameters for each core include the number of primary inputs, primary outputs, bidirectional I/Os, test patterns, scan chains, and scan-chain lengths. The cores are assumed to be hard, that is, the number and lengths of the scan chains are fixed prior to test planning. These parameters are used to design a wrapper for each core. The Design Wrapper algorithm from prior work is used to design a wrapper and determine the testing time of a core for a given TAM width. Note that if the scan frequencies of all cores are equal, $P_{\text{port-scalable}}$ is equivalent to the original NP-hard TAM-design problem described in the literature. Hence $P_{\text{port-scalable}}$ is NP-hard as well.
The testing time of core $i$ on a TAM partition of width $w_j$ is expressed as
$$T_i(w_j) = (1 + \max\{si_i, so_i\}) \cdot p_i + \min\{si_i, so_i\},$$
where $si_i$, $so_i$, and $p_i$ are the maximum scan-in time, maximum scan-out time, and the number of test patterns for core $i$, respectively [Marinissen et al. 1998]. The testing time is expressed in units of clock cycles.
The test time $T_i(w_j, f)$ for core $i$ at frequency $f$ on a TAM partition of width $w_j$ is defined as $T_i(w_j, f) = T_i(w_j)/f$, where the testing time is expressed in units of μs if $f$ is given in MHz. The overall test time of an SoC is the maximum of the test times over all TAM partitions. Let $x_{ij} = 1$ if core $i$ is assigned to TAM partition $j$; otherwise $x_{ij} = 0$. The problem $P_{\text{port-scalable}}$ can now be stated as follows.
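Minimize $T = \max_{1 \le j \le B} \left\{ \sum_{i=1}^{N} T_i(w_j, f_j) \cdot x_{ij} \right\}$ subject to the following.
(1) $\sum_{j=1}^{B} x_{ij} = 1$, $1 \le i \le N$, namely, every core is connected to exactly one TAM partition;
(2) $\sum_{j=1}^{B} w_j = W$, namely, the widths of the TAM partitions sum to $W$, so the total TAM width is not exceeded;
(3) $f_j = \min_i \{\{f_i \cdot x_{ij}\} \setminus \{0\}\}$, where $\{A \setminus \{0\}\}$ denotes the set difference between $A$ and $\{0\}$, that is, only the cores actually assigned to partition $j$ determine its frequency; and
(4) $w_j \le w_{max}$, $1 \le j \le B$, where $w_{max}$ is a user-defined limit on the size of a TAM partition.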
Each TAM partition j has w j wires that are connected to w j tester channels belonging to the same ATE port. This port is configured to operate at frequency f j . There is an upper limit on the number of channels that can be included in an ATE port, hence the width of each TAM partition cannot exceed an upper limit of w max . For a typical port-scalable ATE such as the Agilent 93000, w max = 64 [Khoche 2001 ].
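The formulation can be made concrete with a small evaluation routine. The following is a minimal sketch, not the implementation used in this work: it assumes the standard scan test-time expression $(1 + \max\{si_i, so_i\}) \cdot p_i + \min\{si_i, so_i\}$, assumes that cores sharing a partition are tested one after another, and takes the scan-in and scan-out lengths as given instead of designing wrappers. The names `Core`, `scan_test_cycles`, `partition_frequency`, `soc_test_time_us`, and `widths_are_valid` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Core:
    f_max: float    # maximum scan frequency f_i, in MHz
    scan_in: int    # si_i at the width of the core's partition, in cycles
    scan_out: int   # so_i at the width of the core's partition, in cycles
    patterns: int   # p_i, number of test patterns

def scan_test_cycles(core):
    """T_i(w_j) in clock cycles (assumed standard scan test-time expression)."""
    longest = max(core.scan_in, core.scan_out)
    shortest = min(core.scan_in, core.scan_out)
    return (1 + longest) * core.patterns + shortest

def partition_frequency(cores):
    """Constraint (3): a partition runs at the lowest f_max among its cores."""
    return min(c.f_max for c in cores)

def soc_test_time_us(partitions):
    """Objective: the largest per-partition test time, in microseconds."""
    return max(
        sum(scan_test_cycles(c) for c in cores) / partition_frequency(cores)
        for cores in partitions if cores
    )

def widths_are_valid(widths, total_width, w_max):
    """Constraints (2) and (4): widths sum to W and none exceeds w_max."""
    return sum(widths) == total_width and all(1 <= w <= w_max for w in widths)

# Illustrative data only (not taken from the benchmarks).
slow = Core(f_max=40, scan_in=100, scan_out=80, patterns=10)
fast = Core(f_max=80, scan_in=50, scan_out=50, patterns=20)
separate = soc_test_time_us([[slow], [fast]])   # each core at its own rate
shared = soc_test_time_us([[slow, fast]])       # both cores forced to 40 MHz
```

In this toy instance the shared partition is slower because the 80 MHz core is held to 40 MHz. The sketch keeps the scan-in and scan-out lengths fixed, so it does not capture the shorter scan times that a wider shared partition would normally provide.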
Next we consider a special case of P port−scalable which we refer to as P port−scalable . This special case is introduced to optimally solve specific instances of P port−scalable . A solution to P port−scalable addresses the assignment of cores and frequencies to TAM partitions. We assume here that TAM partitions have already been determined. We also present an ILP model for this problem.
The problem $P_{\text{port-scalable}}$ can also be shown to be NP-hard using the techniques presented in Chakrabarty [2001]. However, it can be solved exactly for small problem instances using an ILP model. The solution to $P_{\text{port-scalable}}$ can be used to optimally determine the assignment of cores and frequencies to TAM partitions. Let $f_{ij}$ denote the frequency at which core $i$ is tested if it is assigned to TAM partition $j$. Let $\tau_{ij} = 1/f_{ij}$ denote the corresponding time period. Let $\tau_i = 1/f_i$ be the minimum possible period of the scan clock for core $i$. A mathematical programming model for $P_{\text{port-scalable}}$ can be derived as follows.
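Note that the preceding objective function is nonlinear due to the product term $x_{ij} \cdot \tau_{ij}$. We linearize it by replacing it with a new integer variable $y_{ij}$ ($y_{ij} \ge 0$) and adding the following three constraints for every such product term: (i) $y_{ij} - T_{max} \cdot x_{ij} \le 0$, where $T_{max}$ is an upper bound on the minimum time-period limit for all cores; (ii) $-\tau_{ij} + y_{ij} \le 0$; and (iii) $\tau_{ij} - y_{ij} + T_{max} \cdot x_{ij} \le T_{max}$. Together, these constraints force $y_{ij} = \tau_{ij}$ when $x_{ij} = 1$ and $y_{ij} = 0$ when $x_{ij} = 0$.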
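The effect of such a linearization can be checked by enumeration. The sketch below is only an illustration: it assumes the standard big-M form of the three constraints ($y - T_{max} \cdot x \le 0$, $y - \tau \le 0$, and $\tau - y + T_{max} \cdot x \le T_{max}$) and integer time periods on a small grid. The function names are hypothetical.

```python
def feasible_products(x, tau, t_max):
    """All integer y in [0, t_max] that satisfy the three linear constraints."""
    return [
        y for y in range(t_max + 1)
        if y - t_max * x <= 0                  # (i)   y is zero when x is zero
        and y - tau <= 0                       # (ii)  y never exceeds tau
        and tau - y + t_max * x <= t_max       # (iii) y reaches tau when x is one
    ]

def linearization_is_exact(t_max):
    """True if the constraints leave y = x * tau as the only feasible value."""
    return all(
        feasible_products(x, tau, t_max) == [x * tau]
        for x in (0, 1)
        for tau in range(t_max + 1)
    )

# Time periods in arbitrary integer units, with t_max = 25 as the upper bound.
exact = linearization_is_exact(25)
```

For every binary $x$ and every period $\tau$ on the grid, the only feasible $y$ is the product $x \cdot \tau$, which is why the product term can be replaced without changing the optimum.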
The new variables and constraints yield the following ILP model.
We use this ILP model with the TAM-width partitioning approach from prior work to solve $P_{\text{port-scalable}}$. The P PAW Enumerate procedure enumerates unique TAM partitions for given values of $B$ and $W$. In this work, we use the P PAW Enumerate procedure with the ILP model for $P_{\text{port-scalable}}$.
We next derive a lower bound on the SoC testing time, using a geometric argument. The testing times for a core in the SoC can be represented using a set of rectangles. A set $R_i$ of rectangles for core $i$ ($1 \le i \le N$) is determined such that the height and width of each rectangle correspond to a TAM width and the corresponding test-application time for the core, respectively. The TAM optimization problem can now be formulated in terms of rectangle packing as follows: select one rectangle from each set $R_i$, $1 \le i \le N$, and pack the selected rectangles into a bin of fixed height, such that no two rectangles overlap and the width to which the bin is filled is minimized. Even though this problem statement addresses a flexible-width TAM architecture as in Iyengar et al. [2003b], it can be used to derive a lower bound.
The area of a bin, with the width representing the total testing time $T$ and the height representing the total TAM width $W$, is given by $T \times W$. Each core yields a set of rectangles of different areas. Let $R_i^{min}$ be the area of the minimum-area rectangle in $R_i$ for core $i$. Let the area of a rectangle representing core $i$ being tested at TAM width $w$ and frequency $f$ be given by $R_i(w, f) = w \cdot T_i(w, f)$.
We next show that the minimum-area rectangle for each core is a rectangle of height 1 and width of
A lower bound on the testing time T i (w, f ) for core i on a TAM partition of width w and frequency f can be expressed as
The numerator of the first term inside the parenthesis on the righthand side of the previous inequality represents the total test-data volume to be applied to the core; it is independent of the number of TAM wires used to apply the test data or the scan frequency of the TAM wires. Hence for any core i,
where v is the total test-data volume for the core. Comparing the expressions for R i (w, f ) and
We also note that the minimum-area rectangles for the cores might not fill the bin of area $T \times W$ perfectly, owing to variation in the sizes of the rectangles. As a result, there may be some unfilled space in the bin. Let us denote the total area of the unfilled space in the bin by $\epsilon$, where $\epsilon \ge 0$. The total area of the bin cannot be less than the sum of the areas of the minimum-area rectangles of all cores in the SoC plus the unfilled space in the bin. Thus
$$T \times W \ge \sum_{i=1}^{N} R_i^{min} + \epsilon,$$
which implies that
$$T \ge \frac{1}{W} \sum_{i=1}^{N} R_i^{min}. \qquad (1)$$
Let the lower bound obtained from Eq. (1) be denoted by $LB_1$. We obtain another lower bound $LB_2$ from Chakrabarty [2001] as follows: $1 \le i \le N$. Specifically, $LB_1$ is more accurate for smaller TAM widths. However, for larger values of $W$, $LB_2$ is tighter. Hence, the overall lower bound $LB_T$ is determined as $\max\{LB_1, LB_2\}$.
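The area argument translates directly into a few lines of code. This is a minimal sketch under stated assumptions: the minimum rectangle area of each core is supplied as an input (in wire-microseconds), $LB_1$ is taken as the sum of these areas divided by $W$, and $LB_2$ is passed in as a number because its expression comes from Chakrabarty [2001]. The function names and the numbers are hypothetical.

```python
def area_lower_bound(min_areas, total_width):
    """LB_1: the bin area T x W must cover every core's smallest rectangle."""
    return sum(min_areas) / total_width

def overall_lower_bound(min_areas, total_width, lb2):
    """LB_T = max{LB_1, LB_2}; lb2 is supplied by the caller."""
    return max(area_lower_bound(min_areas, total_width), lb2)

# Illustrative minimum areas for three cores (not benchmark data).
areas = [100.0, 60.0, 40.0]
narrow = overall_lower_bound(areas, total_width=16, lb2=10.0)
wide = overall_lower_bound(areas, total_width=32, lb2=10.0)
```

With these numbers the area bound dominates at $W = 16$ and the second bound dominates at $W = 32$, which mirrors the observation that $LB_1$ is more accurate for smaller TAM widths.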
OPTIMIZATION PROCEDURES
In this section, we explain the heuristic algorithm used to solve P port−scalable . The algorithm starts with an initial solution and then improves it in an iterative manner, using four iterative descent procedures (IDPs). An IDP reduces (descends) the initial cost, which refers here to the SoC-test time, by reducing the cost with each iteration. It continues to iterate until the cost increases, in which case it exits and outputs the solution from the previous iteration as the final solution. Typically, every IDP has an abort condition to prevent an infinite number of iterations. The main steps of the algorithm, as shown in Figure 1 , are briefly outlined as follows.
1. In procedure TAM Initialize, an initial partition of the total TAM width and the frequencies of TAM partitions are determined based on the scan frequencies for the cores.
2. In procedure Assign Core, a modified best-fit decreasing (BFD) algorithm is used to make the initial core assignments to TAM partitions.
3. Four iterative descent procedures are nested together in a loop; these are executed in an iterative manner, as long as the testing time decreases with each iteration. The four IDPs jointly optimize the TAM-partition widths, TAM-partition frequencies, and the assignment of cores to TAM partitions.
Initial Solution
The TAM Initialize procedure creates $B$ partitions, where $B$ is varied from 1 to $B_{max}$. The first $B - 1$ partitions have width $\lfloor W/B \rfloor$, and partition $B$ has width $W - (B - 1) \times \lfloor W/B \rfloor$. These partitions are assigned TAM frequencies as follows: (i) a list of cores, sorted by their maximum scan frequencies in ascending order, is created; (ii) for every TAM partition $j$, the frequency of the core with index $(j - 1) \times \lfloor N/B \rfloor + 1$ in the sorted list is selected; for example, TAM partition 1 is assigned the frequency of the core with index 1. These two steps ensure that the TAM-partition frequencies are evenly distributed between the frequency of the core with the lowest maximum scan frequency and that of the core with the highest. They also result in $n$ TAM partitions having the same scan frequency if more than $n \times \lfloor N/B \rfloor + 1$ cores have the same maximum scan frequency.
Next, procedure Assign Core assigns each core to one of the $B$ TAM partitions such that each core is tested at a frequency lower than or equal to its maximum scan frequency. The steps of this procedure are as follows. While not all cores have been assigned to a TAM partition: (i) find the TAM partition $TPART_{min}$ with the lowest testing time among all TAM partitions; and (ii) from all unassigned cores with maximum scan frequencies greater than or equal to the frequency of $TPART_{min}$, find the core with the maximum test time on $TPART_{min}$ and assign it to $TPART_{min}$. There can be instances in which $TPART_{min}$ has a frequency higher than the maximum scan frequency of all cores not yet assigned to TAM partitions. In such cases, the TAM partition with the next-lowest testing time is used instead. The TAM architecture obtained from TAM Initialize and Assign Core is now used as an initial solution for the IDPs.
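The initial solution can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in this work: the floor is used for both $W/B$ and $N/B$, a core is treated as compatible with a partition when its maximum scan frequency is greater than or equal to the partition frequency, and the per-core test time in cycles is supplied by a hypothetical callable `cycles(core, width)` standing in for wrapper design. The function names are hypothetical as well.

```python
def tam_initialize(core_fmax, total_width, num_partitions):
    """Return (widths, freqs) for the B initial TAM partitions."""
    n, b = len(core_fmax), num_partitions
    base = total_width // b
    widths = [base] * (b - 1) + [total_width - (b - 1) * base]
    ranked = sorted(core_fmax)                 # ascending maximum scan frequency
    # 0-based form of the 1-based index (j - 1) * floor(N / B) + 1.
    freqs = [ranked[j * (n // b)] for j in range(b)]
    return widths, freqs

def assign_core(core_fmax, widths, freqs, cycles):
    """Greedy initial assignment; returns ({core: partition}, partition loads)."""
    load = [0.0] * len(widths)                 # accumulated test time, in us
    assign = {}
    unassigned = set(range(len(core_fmax)))
    while unassigned:
        # Try partitions from least to most loaded; skip a partition when no
        # remaining core can run at its frequency.
        for j in sorted(range(len(widths)), key=lambda k: load[k]):
            fits = [c for c in unassigned if core_fmax[c] >= freqs[j]]
            if fits:
                core = max(fits, key=lambda c: cycles(c, widths[j]))
                assign[core] = j
                load[j] += cycles(core, widths[j]) / freqs[j]
                unassigned.remove(core)
                break
        else:
            raise ValueError("no partition is compatible with the remaining cores")
    return assign, load

# Illustrative data: four cores, fixed cycle counts regardless of width.
fmax = [40, 80, 40, 80]
demo_cycles = [800, 1600, 400, 1200]
init_widths, init_freqs = tam_initialize(fmax, total_width=11, num_partitions=2)
init_assign, init_load = assign_core(
    fmax, [5, 5], [40, 80], lambda c, w: demo_cycles[c]
)
```

In this toy instance the largest core, although capable of 80 MHz, lands on the 40 MHz partition because that partition is examined first. Imbalances of this kind are what the iterative descent procedures are meant to repair.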
Iterative Descent Procedures
The Split TAMs IDP optimizes the TAM frequencies, TAM widths, and core assignments, based on the initial solution. All TAM partitions have one or more frequency-bottleneck cores, which are those cores having the minimum scan frequency among all cores assigned to this TAM partition. The Split TAMs procedure displaces frequency-bottleneck cores from those TAM partitions with maximum testing time, and places them on a separate TAM partition operating at the bottleneck frequency. Figure 2 illustrates the Split TAMs procedure. The main steps of this procedure are as follows.
(1) Identify a TAM partition TPART max that has the maximum testing time.
A TAM partition with maximum testing time is also referred to as the bottleneck TAM partition. Update the testing time of this newly formed TAM partition, and recompute the maximum testing time of the SoC. (6) If the testing time has not exceeded the original testing time of TPART max from step 1, return to step 1.
The Core Shuffle IDP jointly optimizes the core assignments and scan frequencies of the TAM partitions by shuffling the core assignments to TAM partitions. (The widths of TAM partitions remain unchanged.) The main steps of the procedure are as follows.
(1) Identify TAM partitions $TPART_{max}$ and $TPART_{min}$ that have the maximum and minimum testing time, respectively.
(2) Identify all cores assigned to $TPART_{max}$ that have a maximum scan frequency greater than or equal to the scan frequency of $TPART_{min}$. These cores are "compatible" with TAM partition $TPART_{min}$.
(3) If there are no compatible cores, replace $TPART_{min}$ with the TAM partition with the next-lowest testing time. If $TPART_{min}$ and $TPART_{max}$ point to the same TAM partition, exit the procedure; otherwise repeat step 2.
(4) From the set of compatible cores, select the core that has the maximum testing time on $TPART_{max}$ among those cores that can be assigned to $TPART_{min}$. This choice should not cause the testing time of $TPART_{min}$ to exceed the initial testing time of $TPART_{max}$ from step 1.
(5) Displace the selected core from $TPART_{max}$ and from the set of compatible cores, and assign it to $TPART_{min}$. Update the TAM-partition testing times.
(6) Repeat steps 4 and 5 until no compatible core can be assigned to $TPART_{min}$ without causing the testing time of $TPART_{min}$ to exceed the initial testing time of the SoC from step 1.
(7) Update the frequency of $TPART_{max}$ to the scan frequency of the core with the minimum scan frequency among the cores that remain on it. If frequency-bottleneck cores are displaced in step 5, the scan frequency of $TPART_{max}$ increases. Update the testing times for $TPART_{max}$ and $TPART_{min}$.
The Redistribute TAM IDP optimizes the widths of the TAM partitions. It removes slack TAM wires from nonbottleneck TAM partitions and assigns them to the bottleneck TAM partition. The main steps of the IDP are as follows.
(1) Identify the bottleneck TAM partition $TPART_{max}$.
(2) Find a nonbottleneck TAM partition $TPART_{min}$ which has the minimum testing time.
(3) Remove slack TAM wires from $TPART_{min}$ and merge them with the bottleneck TAM partition.
(4) Update the testing time; if the testing time has reduced over the initial testing time, return to step 1, otherwise exit the procedure.
Thus, the Redistribute TAM IDP continues to move slack TAM wires from the nonbottleneck TAM partitions to the bottleneck TAM partition, as long as the testing time of the nonbottleneck TAM partitions does not exceed the initial testing time of the SoC.
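The descent loop of this IDP can be sketched as follows. This is an illustration under simplifying assumptions, not the procedure used in this work: wires are moved one at a time, the limit $w_{max}$ is not modeled, and the test time of partition $j$ at width $w$ is supplied by a hypothetical callable `part_time(j, w)`. The name `redistribute_tam` is hypothetical.

```python
def redistribute_tam(widths, part_time):
    """Move wires from the fastest-finishing partition to the bottleneck
    partition while the SoC test time keeps decreasing."""
    widths = list(widths)

    def soc_time(ws):
        return max(part_time(j, w) for j, w in enumerate(ws))

    best = soc_time(widths)
    while True:
        times = [part_time(j, w) for j, w in enumerate(widths)]
        hi = max(range(len(widths)), key=lambda j: times[j])   # bottleneck
        lo = min(range(len(widths)), key=lambda j: times[j])   # most slack
        if hi == lo or widths[lo] <= 1:
            return widths, best
        trial = list(widths)
        trial[lo] -= 1
        trial[hi] += 1
        t = soc_time(trial)
        if t >= best:              # no further improvement: keep last solution
            return widths, best
        widths, best = trial, t

# Illustrative model: test time inversely proportional to width.
work = [1200, 300]
final_widths, final_time = redistribute_tam([8, 8], lambda j, w: work[j] / w)
```

Real test times are staircase functions of the TAM width, so a one-wire step may show no improvement even when a larger transfer would help; moving several slack wires per step addresses that case.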
The Merge TAMs IDP merges two TAM partitions to reduce the test time of those cores belonging to them by offering a greater bit-width. However, it causes the merged TAM partition to operate at the minimum of the scan frequencies of the two merged TAM partitions. The main steps of this procedure are as follows. 
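As an illustration of the trade-off just described (more wires, but the lower of the two frequencies), the following minimal sketch merges two partitions and compares the SoC test time before and after. It is a sketch under stated assumptions: partitions are plain dictionaries, cores on a partition are tested sequentially, and the cycle count comes from a hypothetical callable `cycles(core, width)` that here simply divides an illustrative data volume by the width. All names are hypothetical.

```python
def merge_partitions(parts, a, b):
    """Return a new partition list in which partitions a and b are merged."""
    merged = {
        "width": parts[a]["width"] + parts[b]["width"],   # widths add up
        "freq": min(parts[a]["freq"], parts[b]["freq"]),  # slower rate wins
        "cores": parts[a]["cores"] + parts[b]["cores"],
    }
    return [p for k, p in enumerate(parts) if k not in (a, b)] + [merged]

def part_time(part, cycles):
    """Test time of one partition in microseconds (freq in MHz)."""
    return sum(cycles(c, part["width"]) for c in part["cores"]) / part["freq"]

def soc_time(parts, cycles):
    return max(part_time(p, cycles) for p in parts)

# Illustrative data volumes in bits (not benchmark data).
volume = {0: 8000, 1: 1600}
cycles = lambda core, width: volume[core] / width
parts = [
    {"width": 4, "freq": 40, "cores": [0]},
    {"width": 4, "freq": 80, "cores": [1]},
]
before = soc_time(parts, cycles)
after = soc_time(merge_partitions(parts, 0, 1), cycles)
```

Here the merge pays off because the 80 MHz partition carries little data. If core 1 had ten times the volume, the same merge would raise the test time from 50 to 75 μs, which is why a merge is kept only when the SoC test time decreases.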
Experimental Results
We now present experimental results for five ITC'02 benchmark circuits. We compare the results of the heuristic approach to the ILP-based approach, two baseline cases, and the derived lower bounds on test time. In the first baseline case, the cores are tested at their maximum scan frequencies. In this case, the number of TAM partitions is equal to the number of unique maximum scan frequencies for the embedded cores. The results for this baseline case are obtained using the TAM optimization technique from Iyengar et al. [2003a]. For the second baseline scenario, we consider an SoC with a fixed number of TAM partitions; the assignment of cores to TAM partitions is obtained using methods presented in Iyengar et al. [2003a]. The frequency of a TAM partition is set to the lowest scan frequency of the cores assigned to it. In this article, we compare our methods with SoCs designed with two (TAM2) and three (TAM3) TAM partitions. We first present results for the ILP-based approach for two benchmark SoCs with a small number of cores, namely, d695 and a586710.
In the absence of scan-frequency information for cores in the ITC'02 benchmarks, we use a random number generator to obtain the maximum scan frequencies for the cores. The random number generator is used to select frequencies from a set of predefined scan frequencies for each core. For the two smaller SoCs, d695 and a586710, we use two scan frequencies of 40 MHz and 80 MHz. For the three larger SoCs, p22810, p34392, p93791, the random generator chooses frequencies from a larger set of scan frequencies. The ILP-based approach is run only for B = 2 and B = 3 because the problem size grows exponentially in B.
The results for d695 and a586710 are shown in Table I. The set of maximum scan frequencies for the cores is shown in the last row of the table (element $i$ of the set corresponds to the maximum scan frequency of core $i$). We denote the test time obtained using the ILP-based approach as $T_{ILP}$. The test times for the first baseline scenario are denoted as $T_b$. Further, $T_b^{TAM2}$ and $T_b^{TAM3}$ represent the test times obtained using the second baseline method for an SoC with two and three TAM partitions, respectively. The lower bounds are obtained using (3) and by assuming that every core is tested at its maximum scan frequency. The ILP-based approach outperforms the heuristic ($T$) and the baseline scenarios in most cases. However, the runtime for d695 with $B = 3$ ranges from 37 minutes to 192 minutes for $W$ ranging from 16 to 32. The ILP-based approach does not reach a solution for TAM widths greater than 40 for $B = 3$. For a586710, the runtime is less than 10 minutes for $B = 3$ for all values of $W$. The proposed heuristic and the baseline case require less than 10 seconds for all values of $W \le 64$ for all the ITC'02 benchmarks. The heuristic approach provides an optimal solution, that is, the same test time as the ILP-based method, for several cases ($W = 32$ and $W = 64$ for d695, and $W = 56$ for a586710). For other cases, its testing time is no more than 20% higher than the optimum test time. For d695 and a586710, the test time with three TAM partitions for the second baseline is less than that for the proposed method for some values of $W$. We attribute this to the fact that d695 and a586710 are simple designs. As expected, the proposed method outperforms the baseline methods for more realistic benchmarks (see Table II). For $W = 32$, the second baseline with $B = 3$ leads to a lower test time than the ILP method ($B = 2$). This shows that the minimum test time with $B = 2$ is higher than the test time obtained using the baseline method for $B = 3$.
Next we present results for SoCs p34392, p22810, and p93791 in Table II. The results are presented for two frequency ranges: 40 MHz to 200 MHz, and 10 MHz to 50 MHz. We performed experiments for sets of five scan frequencies and nine scan frequencies in the two frequency ranges. Using the proposed approach, the reduction in testing time over the first baseline case is as high as 52.54% (in one case) for the three large "p" SoCs from Philips. Since there are five (nine) distinct frequencies, the number of TAM partitions is limited to five (nine) in the first baseline case. However, from these experimental results, we observe that even when the first baseline case has the same number of TAM partitions as the proposed approach, the latter results in lower SoC test times. This implies that the frequencies of TAM partitions and the assignment of cores to TAM partitions have a major impact on the overall test time of the SoC.
[Table I notes: maximum scan frequencies {40, 80, 40, 40, 80, 80, 40} MHz; the percentage entries for $T_{ILP}$ and $T_3$ are ratios multiplied by 100.]
POWER-AWARE TEST PLANNING
The testing of multiple cores in parallel in a core-based SoC results in test schedules with reduced test times. However, the concurrent testing of these cores often results in increased test power. The permissible power envelope is often exceeded when power constraints are not considered during test scheduling; this can cause severe irreparable damage to the SoC. It is well known that scan testing and concurrent testing of embedded cores lead to high switching activity, which can be several times higher than that for functional operation [Larsson and Peng 2001]. Since test power is directly proportional to the frequency of test application, the use of port-scalable testers to reduce test time is expected to lead to higher test power. Test scheduling under power constraints was first presented in Chou et al. [1997]. Power-constrained test scheduling for an SoC was addressed in Zhao and Upadhyaya [2003] and Larsson and Peng [2001], and more recently in Su and Wu [2004], Larsson and Peng [2006], and Samii et al. [2006]. Power constraints were also considered in Huang et al. [2002], but only for a flexible-width TAM architecture; the approach presented in Huang et al. [2002] cannot be applied to fixed-width and multifrequency TAMs.
We present a new test planning approach that takes into account the different scan frequencies of individual cores to exploit the port scalability of testers. While the use of port-scalable testers helps to reduce the overall test time of the SoC, operating the different cores of these heterogeneous SoCs at multiple data rates can cause overheating of the device, due to violation of the permissible power envelope $P_{max}$. The parameter $P_{max}$ refers to the maximum power consumption of the device during test that allows for proper circuit operation. If a core $i$ is assigned to a TAM partition $j$, then the power consumption of the core during test can be represented as $P_{ij}$, where
$$P_{ij} = P_i^{*} \cdot \frac{f_j}{f_i^{*}},$$
and $P_i^{*}$ is the power consumed by core $i$ when operating at a frequency $f_i^{*}$. The dynamic power consumption of a device is directly proportional to the frequency of operation, and the dynamic power is the dominant power component during test [Samii et al. 2006]. We now present a heuristic to solve $P^{PA}_{\text{port-scalable}}$, which is a power-aware version of the problem $P_{\text{port-scalable}}$ presented in Section 3.
4.1 Power-Aware P port−scalable : P PA port−scalable
In this section, we present a heuristic algorithm to solve P PA port−scalable . The heuristic algorithm determines the cores assigned to TAM partitions and the frequency of TAM partitions. It also determines the order in which the cores are tested on the TAM partitions. The sequence of procedures adopted in this section to solve the problem is built upon procedures developed in Section 3. As in Section 3, the following are the main steps of the algorithm.
-The first step in solving $P^{PA}_{\text{port-scalable}}$ involves determining the initial widths of TAM partitions and the frequencies of TAM partitions. This information is determined based on the maximum scan frequencies for the cores.
-The PA Assign Core procedure is used to make initial core assignments to the TAM partitions. The power constraints are considered while making these initial core assignments to ensure that the test power never exceeds $P_{max}$.
-Finally, we explain the iterative descent procedures that iteratively reduce the test time while at the same time monitoring the test power for the SoC. These iterative descent procedures, as in Section 3, jointly optimize the TAM-partition widths, TAM frequencies, and the assignment of cores to TAM partitions.
Initial Solution: P PA port−scalable
The first step in solving P PA port−scalable remains the same as the TAM Initialize procedure used in Section 3. The TAM Initialize procedure creates B TAM partitions and assigns frequencies to these TAM partitions.
The next procedure PA Assign Core assigns each core to one of the B TAM partitions such that each core is tested at a frequency lower than or equal to its maximum scan frequency. The following are the steps adopted in this procedure.
(1) While not all cores have been assigned to a TAM partition, find a TAM partition $TPART_{min}$ with the lowest testing time.
(2) From all cores with maximum scan frequencies greater than or equal to the frequency of $TPART_{min}$, find a core $C$ with maximum test time on $TPART_{min}$.
(3) Check whether the constraint $P_{max}$ on power consumption for the SoC is violated by the addition of core $C$, which consumes power $P_C(f_{TPART_{min}})$ at the frequency of $TPART_{min}$.
(4) If $P_{max}$ is exceeded for this core assignment, remove $C$ from the list of candidate cores for $TPART_{min}$.
(5) Repeat step 2 to determine another core with maximum test time on $TPART_{min}$.
The aforesaid sequence of procedures is repeated until all cores in the SoC have been assigned to a particular TAM partition. Exceptions are handled in the same way as described in Section 3.1.
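The power check in step 3 can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in this work: core power is scaled linearly with frequency from a reference point ($P^{*}$ at $f^{*}$), each scheduled test is an interval with a single peak-power value, and the SoC power at any instant is the sum over the tests running at that instant. The function names and the numbers are hypothetical.

```python
def scaled_power(p_ref, f_ref_mhz, f_mhz):
    """Power at f_mhz, assuming dynamic power is proportional to frequency."""
    return p_ref * f_mhz / f_ref_mhz

def peak_power(schedule):
    """Largest total power over time; schedule holds (start, end, power)."""
    events = []
    for start, end, power in schedule:
        events.append((start, power))
        events.append((end, -power))
    # At equal times, process finishing tests (negative deltas) first so that
    # back-to-back tests on one partition are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    running = peak = 0.0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak

def violates_power_limit(schedule, candidate, p_max):
    """True if adding the candidate interval pushes the peak above p_max."""
    return peak_power(schedule + [candidate]) > p_max

# Illustrative schedule: times in microseconds, power in arbitrary units.
p_core = scaled_power(p_ref=200, f_ref_mhz=10, f_mhz=40)
schedule = [(0, 10, p_core), (10, 30, 400.0), (0, 20, 500.0)]
current_peak = peak_power(schedule)
late_ok = not violates_power_limit(schedule, (20, 40, 600.0), p_max=1300)
early_bad = violates_power_limit(schedule, (5, 15, 300.0), p_max=1500)
```

A candidate that overlaps the most power-hungry interval is rejected, whereas the same kind of core placed later in the schedule may be accepted. This is why the order in which cores are tested on a partition matters in the power-aware setting.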
Iterative Descent Procedures: P PA port−scalable
We now present the three IDP components used to solve P PA port−scalable .
In the Split TAM procedure, the TAM frequencies, TAM widths, and core assignments based on the initial solution are optimized. The Split TAM procedure displaces the frequency-bottleneck cores from those TAM partitions with maximum testing time, and places them in a separate TAM partition. The power-aware Split TAM IDP is similar to the Split TAM procedure presented in Section 3.2. In the power-aware procedure, we check for satisfiability of the power constraint $P_{max}$ when a frequency-bottleneck core is assigned to a new TAM partition. We exit the procedure if no power-compatible frequency-bottleneck cores are found on $TPART_{max}$.
The Core Shuffle IDP jointly optimizes the core assignments and the scan frequencies of TAM partitions by shuffling the core assignment to TAM partitions. The following are the main steps used in this procedure.
(1) The first three steps in Core Shuffle remain the same as from Section 3.2.
(2) We now select the core, from the set of compatible cores, that has the maximum testing time on $TPART_{max}$ and can be assigned to $TPART_{min}$. The choice of core should not cause the test time to exceed the initial test time of $TPART_{max}$ (from step 1), and the constraint on maximum test power $P_{max}$ should not be violated.
The final IDP that we use to solve $P^{PA}_{\text{port-scalable}}$ is the Redistribute TAM IDP. We do not use the Merge TAM and Redistribute TAM IDP from Section 3.2 to solve $P^{PA}_{\text{port-scalable}}$ because these procedures will alter the power characteristics of the TAM and will introduce power violations in the test schedule.
Experimental Results on Power-Aware Test Planning
We present experimental results for three ITC'02 SoC test benchmark circuits in Table III . We compare the results for the power-aware heuristic approach with those obtained using the heuristic presented in Section 3 to solve P port−scalable , and a baseline case where cores are tested at their maximum scan frequencies. We use power information (and the units for the power values) for the cores from Samii et al. [2006] ; power data for only the three SoCs considered in this section was presented in Samii et al. [2006] . In Samii et al. [2006] , a cycle-accurate power modeling approach was developed for core-based SoCs. For a known TAM width we utilize the data in Samii et al. [2006] to determine the peak power consumption for the core over all clock cycles. We use this value of peak power consumption in our experiments. The value of f * i is 10 MHz in all our experiments.
The experimental results show that the test times $T_{PA}$ for $P^{PA}_{\text{port-scalable}}$ are higher than those obtained for $P_{\text{port-scalable}}$ ($T$), but significantly lower than the baseline case ($T_b$) for most values of $W$ and $P_{max}$. On average, the test time obtained using $P^{PA}_{\text{port-scalable}}$ is 11.24% lower than the baseline scenario (over all power constraints). The maximum values of power consumption during test obtained using $P_{\text{port-scalable}}$ ($P_{PS}$) and the baseline scenario ($P_b$) are listed in Table III. It is clear from the values of power consumption that $P_{\text{port-scalable}}$ results in higher power consumption than $P^{PA}_{\text{port-scalable}}$; this would result in violation of the permissible power envelope during test. The baseline scenario results in 14.56% higher power consumption than $P_{\text{port-scalable}}$ on average, over all benchmark circuits. It also results in higher power consumption than $P^{PA}_{\text{port-scalable}}$ in all cases.