Test Generation and Scheduling for a Hybrid BIST Considering Test Time
  and Power Constraint by Sadredini, Elaheh et al.
  
Abstract: This paper presents a novel approach to test generation
and test scheduling for multi-clock domain SoCs. A concurrent
hybrid BIST architecture is proposed for testing cores.
Furthermore, a heuristic is proposed for selecting the cores to be
tested concurrently and the order in which test patterns are applied.
Experimental results show that the proposed heuristics reduce the
test application time of multi-clock domain SoC testing in
comparison with previous works.
Keywords: SoC hybrid BIST; power-constrained testing; test
scheduling; test generation; DFT
I. INTRODUCTION  
With the increasing complexity of system-on-chip (SoC) designs,
user-defined logic, and the number of cores in SoCs,
achieving a reasonable test time becomes very important.
Therefore, different approaches have been used to reduce the total
SoC test application time. One approach is based on allocating
optimal test access mechanisms (TAMs) and test scheduling
algorithms for testing cores [1-4][14][23]. One of the most
important factors in improving test application time (TAT) in SoC
testing is a suitable test scheduling algorithm based on
appropriate test architectures [4-7][24]. In [7, 11, 26] the authors
proposed algorithms based on process algebra and genetic algorithms
to find an optimal test schedule. On the other hand, because of the
increased switching activity during test, the power consumption of
a chip under test is higher than in its normal mode of operation,
which can affect the reliability of testing [8]. Thus,
power-constrained and thermal-aware SoC testing have become
important during concurrent testing [9, 10, 12, 13, 25].
Many test scheduling schemes and algorithms have been
proposed to minimize SoC test time using concurrent approaches
under power limitations. In [14, 15, 27] the authors
presented heuristic methods to minimize test application time by
allocating the optimum number of TAM wires to each core and finding
the best test scheduling scheme. In [16, 20] the authors presented a
power-aware bus architecture and an appropriate algorithm that uses
a functional bus instead of a TAM to test cores concurrently. The
main idea is to use memory buffers for applying deterministic test
patterns (DTPs) to cores concurrently when the speed of the
functional bus is higher than the speed of injecting test patterns
into the cores. In [21] an architecture and a test scheduling method
for testing the cores of a multi-clock domain SoC are presented.
Built-in self-test (BIST) is also a beneficial way to achieve
concurrency. In [17] the authors proposed an algorithm that combines
BIST with external testing (called hybrid BIST or CBET) to achieve
an optimum result for the test time minimization problem. In [10]
the authors proposed an algorithm based on solving a rectangle
packing problem for power-constrained concurrent hybrid BIST for SoC
testing. In this paper we propose a hybrid BIST structure and
concurrent test scheduling for multi-clock domain SoCs. Furthermore,
a novel algorithm for finding the optimal number of pseudo random
test patterns (PRTPs) and deterministic test patterns for each core
is presented.
Section II discusses the test generation process in hybrid
BIST. A hybrid BIST architecture for a multi-clock domain SoC is
proposed in Section III. In Section IV, a test scheduling graph for
modeling power-aware core testing is presented. Based on the
proposed test scheduling graph, Section V proposes a test scheduling
algorithm together with heuristics for selecting the set of cores to
be tested concurrently and for determining the sets and the order of
applying deterministic and pseudo random test patterns to each core.
The results obtained by the proposed methods are presented in
Section VI. Finally, Section VII concludes our work.
II. CALCULATING NUMBER OF PSEUDO RANDOM TESTS  
This section discusses an approach for calculating the number of
pseudo random tests to be applied in the hybrid BIST. As mentioned,
this BIST uses DTPs (Deterministic Test Patterns) and PRTPs (Pseudo
Random Test Patterns). Both have advantages and disadvantages.
DTPs are generated for random-resistant faults and
increase fault coverage more than PRTPs. However, in many cases
automatic test equipment (ATE) is needed to apply them
with a global clock and through scan chains, so using DTPs increases
the TAT. PRTPs, on the other hand, can be produced by BIST
architectures with local clocks [17]. Some BIST architectures need
no scan chain, which reduces the number of cycles for
applying each test pattern. In general, the speed of PRTP
application (through a BIST architecture) is higher than the speed
of DTP application (through an external ATE). We refer to
SBi=FBi/ACBi and SEi=FEi/ACEi as the BIST speed and the external
speed of core i, respectively. Here F is the clock frequency and AC
is the number of cycles for applying each test pattern.
In a hybrid BIST method, both the speed of PRTPs and the quality
of DTPs are used to achieve the minimum TAT. In common hybrid BIST
test generation, PRTPs are first generated to detect easy-to-detect
faults, and DTPs are then generated for the remaining
(random-pattern-resistant) faults. The DTPs generated at the end of
that process target specific faults. The quality of these vectors is
very high, and they must be applied to the circuit under test (CUT)
to achieve full coverage. If these vectors are instead applied at
the start of test generation, they detect a large number of faults
and leave relatively few faults for the PRTPs to detect, which
reduces the number of PRTPs significantly. In our method, PRTP
generation is therefore sandwiched between two DTP phases, as
discussed below. The test generation process has three stages.
The first stage of test generation is phase 1 of deterministic test
generation (DTPs-phase1). These test vectors are generated for
very-hard-to-detect faults and should be applied early in the test
process. A large number of faults (both easy-to-detect and
hard-to-detect) are detected by these high-quality tests.
Elaheh Sadredini, Mohammad Hashem Haghbayan, Mahmood Fathy, and Zainalabedin Navabi
es9bt@virginia.edu, hashem@cad.ut.ac.ir, mahfathy@iust.ac.ir, navabi@cad.ut.ac.ir 
 
  
The second stage of test generation is the pseudo-random one.
PRTPs are generated by linear feedback shift registers (LFSRs) and
detect the remaining easy-to-detect faults. The problem is finding
the number of PRTPs that gives the minimum TAT for
the remaining faults. A simple binary search algorithm finds
this optimal number of PRTPs (Figure 1). The algorithm is
discussed later in this section.
The third stage of test generation is phase 2 of deterministic test
generation (DTPs-phase2). These tests are generated for the remaining
faults, so that full coverage is finally achieved.
Applying the above three phases of test generation decreases the
total TAT compared with hybrid BIST schemes in which
deterministic tests simply follow pseudo random tests. Figure 2
shows the test application time for the proposed test generation
method (checkered-dark-hatched bars) and for the work presented in
[22] (gray-white bars). In the figure, PRTP_PM is the optimal
number of PRTPs for the proposed method and PRTP_PW is the optimal
number of PRTPs for the previous work [22].
Find optimal NPRTP
1  Inputs: max NPRTP, min NPRTP
2  IF TAT(max NPRTP) < TAT(min NPRTP)
3      Report max NPRTP as optimal NPRTP;
4      Return;
5  END IF
6  T1 = Calculate TAT for min NPRTP;
7  T2 = Calculate TAT for min NPRTP + (max NPRTP - min NPRTP)/4;
8  T3 = Calculate TAT for min NPRTP + (max NPRTP - min NPRTP)/2;
9  T4 = Calculate TAT for min NPRTP + 3×(max NPRTP - min NPRTP)/4;
10 T5 = Calculate TAT for max NPRTP;
   // the above calculations are used for
   // evaluating the following IF statement.
11 IF T5 > T4 > T3
12     Find optimal NPRTP between
       min NPRTP and min NPRTP + (max NPRTP - min NPRTP)/2;
13 ELSE
14     Find optimal NPRTP between
       min NPRTP + (max NPRTP - min NPRTP)/2 and max NPRTP;
15 END IF
   END
Figure 1. Finding optimal NPRTP in hybrid BIST. 
Calculation of TAT in hybrid BIST: The overall TAT is the
sum of the PRTP application time and the DTP (phase 1 and
phase 2) application times.
NPRTP denotes the number of PRTPs and NDTP denotes the
number of DTPs.
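As a sketch, the sum described above can be written with the speed definitions of this section. The exact form below is an assumption (one reading that agrees with SBi, SEi and with Equation 6.1), not a formula quoted from the authors; the phase superscripts are introduced here for illustration only.

```latex
\mathit{TAT}_i \;=\;
  \frac{N_{DTP,i}^{\mathrm{phase1}} + N_{DTP,i}^{\mathrm{phase2}}}{SE_i}
  \;+\; \frac{N_{PRTP,i}}{SB_i},
\qquad
SE_i = \frac{FE_i}{ACE_i}, \quad SB_i = \frac{FB_i}{ACB_i}
```

Each term divides a pattern count by a rate in patterns per unit time, so a larger NPRTP raises the second term while (through the faults it covers) lowering the NDTP needed in the first.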
Finding optimal NPRTP in hybrid BIST: The process of
finding the optimal number of PRTPs is illustrated in Figure 1.
The function shown in this figure starts searching for the optimal
NPRTP between zero and a maximum NPRTP, which are the minimum and
maximum inputs of the function (Line 1). First, the
TAT for min NPRTP and max NPRTP is calculated. Then the
TAT for three NPRTP values between min and max NPRTP is
calculated (Lines 6-10) to determine whether the optimal NPRTP lies
to the right or to the left of the midpoint between min NPRTP and
max NPRTP. The same process then continues with the midpoint
assigned as the new min or max NPRTP, until the optimal
NPRTP is found. The algorithm is very similar to a binary search
for the minimum TAT. The order of the algorithm is
$O(\log_2 n) \times O(\text{Fault Simulation} + \text{Test Generation})$,
where n is the maximum NPRTP used at the start of the
algorithm.
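The search idea can be sketched in Python as follows. This is a conservative variant of Figure 1 rather than a transcription of it: it samples the same five points but keeps the two sub-intervals next to the best sample, which is safe whenever TAT is unimodal in NPRTP (an assumption). The function names and the toy TAT model with its constants are hypothetical and stand in for real fault simulation and test generation.

```python
import math
from typing import Callable, Dict


def find_optimal_nprtp(tat: Callable[[int], float], lo: int, hi: int) -> int:
    """Return the NPRTP in [lo, hi] with the smallest TAT (tat assumed unimodal)."""
    cache: Dict[int, float] = {}

    def cost(n: int) -> float:
        # In a real flow every evaluation means fault simulation plus test
        # generation, so each NPRTP value is evaluated at most once.
        if n not in cache:
            cache[n] = tat(n)
        return cache[n]

    while hi - lo > 4:
        # Five evenly spaced samples: min, three interior points, max.
        points = [lo + (hi - lo) * j // 4 for j in range(5)]
        best = min(range(5), key=lambda j: cost(points[j]))
        # Keep only the sub-intervals adjacent to the best sample.
        lo, hi = points[max(best - 1, 0)], points[min(best + 1, 4)]
    # The interval is now tiny; finish with a direct scan.
    return min(range(lo, hi + 1), key=cost)


def toy_tat(nprtp: int) -> float:
    # Hypothetical model: each PRTP costs 1 time unit, each remaining DTP
    # costs 50, and the number of remaining DTPs decays exponentially.
    return nprtp + 50.0 * 400.0 * math.exp(-nprtp / 500.0)


best_nprtp = find_optimal_nprtp(toy_tat, 0, 5000)
print(best_nprtp, round(toy_tat(best_nprtp), 2))
```

Each pass roughly halves the interval, so the number of TAT evaluations grows logarithmically with the initial maximum NPRTP, in line with the order stated above.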
III. CONCURRENT HYBRID BIST IN SOCS 
The architecture of the concurrent hybrid BIST used in this
paper is shown in Figure 3.
[Figure 3: six cores (Core 1 to Core 6), each with its own BIST block, attached to a shared TAM or functional bus.]
Figure 3. Concurrent Hybrid BIST architecture. 
Each core in the SoC shown in Figure 3 has a BIST
architecture and a TAM or a functional bus ([16, 20]) which can
be used to apply external test patterns. We use this bus for the
deterministic tests, while pseudo random tests are generated
internally in the core. If we use $vp_i^k$ for the kth set of PRTPs
for the built-in self-test part of core i, and $vd_i^k$ for the kth
set of DTPs for the external test part of core i, then:
$v_i^k = vp_i^k \cup vd_i^k$   (3.1)
$F_{v_i^k} = F_{vp_i^k} \cup F_{vd_i^k}$   (3.2)
$T(v_i^k) = T(vp_i^k) + T(vd_i^k)$   (3.3)
In the above equations, $v_i^k$ is the set of test patterns that is
applied to core i, and $F_{v_i^k}$ is the set of faults detected by
$v_i^k$. $T(v_i^k)$ is the test time needed to apply the remaining
vectors in the test set for core i. Thus, the remaining
test time is what remains of the PRTPs and DTPs.
 
Figure 2. TAT for proposed method (PM) and work presented in [22].
  
IV. PEAK POWER LIMITATION AND TEST SCHEDULING  
Much work has been done on power-constrained SoC test
scheduling and on SoC hybrid BIST. The main contribution of
this paper with respect to previous works is a new test
generation algorithm for a hybrid BIST architecture, and a test
scheduling algorithm based on the generated test patterns. This
section presents an algorithm for selecting cores that can be
tested concurrently, and a test scheduling graph.
As mentioned before, in a hybrid BIST architecture the total
peak power limitation should be considered while cores are
being tested concurrently. Power consumption varies over time,
but to simplify the test scheduling process, the power
consumption of each core is assumed to be equal to its
peak power consumption throughout the test process [19]. For the
rest of the paper, the peak power consumption of core i is
represented by Pmi, and Pmax is the maximum power limit of the
SoC.
To present the test scheduling algorithm, we use the example of
five cores shown in Table 1. This table shows the characteristics of
the five cores that need to be available at the start of the
algorithm that calculates the sets of cores that can be tested
simultaneously. The parameters of this algorithm are the peak power
consumption of core i (Pmi), the time for applying its DTPs (T(vd)),
and the total time for applying its PRTPs (T(vp)).
For example, Core 1 in Table 1 has a test peak power of 100 µW,
needs 300 µs for applying all deterministic test patterns using the
full TAM width, and needs 200 µs for applying its PRTPs through the
BIST architecture. We will use this example for some
definitions.
Table 1. Characteristics of the cores for an SoC example
Core          1     2     3     4     5
Pm (µW)     100   200    50   200    50
T(vd) (µs)  300   400   500   150   100
T(vp) (µs)  200   500   200   600   300
 
To apply the proposed algorithm, we propose a test
scheduling graph model in this section. The following
definitions are used in the presentation of the scheduling graph.
Definition 1 (power group sets or Nk sets): Each node of the
SoC test scheduling graph is a set of cores that can be tested
concurrently without exceeding Pmax and to which no other core can
be added (we call these sets power group sets or Nk sets). For
example, for the SoC of Table 1 with 5 cores and the given power
characteristics (Pm), assuming that Pmax=300, N1 can be {1, 2} and
N2 can be {1, 3, 5}.
Definition 2 (MSoC): The set of all Nk sets in an SoC is called
MSoC. For example, for the SoC of Table 1, MSoC={{1, 2}, {1, 3,
5}, {1, 4}, {2, 3, 5}, {3, 4, 5}}.
The algorithm of Figure 4 finds MSoC for an SoC. This
algorithm is recursive; its inputs are the power
characteristic of each core of the SoC, the peak power upper bound
of the SoC, and an Nk set. First, the algorithm initializes MSoC
with an empty set. Then, core i with Pmi is added to Nk if the
total power for Nk does not exceed Pmax. The algorithm then calls
itself with a new peak power upper bound (Pmax-Pmi) and the extended
Nk (Line 5). Finally, if there is no core in the SoC that can be
added to Nk, the generated Nk is added to MSoC.
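The recursion described above can be sketched in Python. This is an illustrative enumeration under the reading that a core fits when Pmi does not exceed the remaining power budget; the function and variable names are hypothetical, and the ordering trick used to avoid visiting the same group twice is an implementation choice of this sketch.

```python
from typing import Dict, FrozenSet, List


def find_msoc(pm: Dict[int, int], pmax: int) -> List[FrozenSet[int]]:
    """Enumerate all maximal sets of cores whose summed peak power is <= pmax."""
    cores = sorted(pm)
    msoc: List[FrozenSet[int]] = []

    def grow(group: List[int], budget: int, start: int) -> None:
        # The group is an Nk set when no core outside it fits the remaining budget.
        if group and all(pm[c] > budget for c in cores if c not in group):
            msoc.append(frozenset(group))
        # Extend only with later cores so that each group is visited once.
        for idx in range(start, len(cores)):
            core = cores[idx]
            if pm[core] <= budget:
                grow(group + [core], budget - pm[core], idx + 1)

    grow([], pmax, 0)
    return msoc


# Peak power values (Pm) of the five cores of Table 1, with Pmax = 300.
table1_pm = {1: 100, 2: 200, 3: 50, 4: 200, 5: 50}
msoc = find_msoc(table1_pm, 300)
print([sorted(s) for s in msoc])
```

For the Table 1 values this yields the five sets listed in Definition 2.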
Definition 3 (incomplete sets): An incomplete set is a set of cores
of an SoC whose overall test power is lower than Pmax, but to which
some other core of the SoC can still be added while the overall
test power does not exceed the peak power (Pmax). For example,
{3, 5} is an incomplete set for the SoC of Table 1, because Core
1 can be added to this set and the overall power will not exceed
Pmax.
Having defined Nk sets, MSoC, and incomplete sets, we are now
able to define the test scheduling graph.
Definition 4: The test scheduling graph for an SoC is T = (G,
E), in which G is the set of graph nodes. Each node of the graph
corresponds to an Nk set or an incomplete set. E ⊆ G × G is the set
of edges of the graph, modeling the switching between two sets.
Each Node k in the graph has a time, denoted by Tk.
We use the test scheduling graph for modeling the concurrent
testing of cores in a hybrid BIST architecture. For example,
Figure 5 depicts a test scheduling graph for applying all DTPs
and PRTPs of the cores with the characteristics shown in Table 1.
Figure 4. Algorithm for finding power group sets. 
In each Nk set, the cores in the set are tested concurrently. When
the BIST or the External test part of a core in a group of cores,
such as Nk, is complete, that core is released from its BIST or
External test part. Each node is labeled by its time duration, i.e.,
Tnode. TI corresponds to the time of the incomplete node I.
The test scheduling graph is constructed considering the power
consumption of each core. Each node handles the timing of the BIST
or External test part of the cores selected based on Nk. The
selection between the BIST and the External test part of a core in a
node depends on the core's T(vp) and T(vd), and is discussed in the
next section. This example only shows transitions from one node to
another.
According to the test scheduling graph shown in Figure 5,
first, the BIST part of Core 1 and the External test part of Core
2 are tested concurrently in Node 1. We move to Node 2 when
the BIST part of Core 1 finishes (T(vp)=0). In this case, the BIST
part of Core 1 is released from Node 1, leaving 200 (400-200) of
T(vd) of Core 2 to be handled by the next node. Then, the External
test part of Core 2 and the BIST parts of Core 3 and Core 5 are
tested in Node 2. After 200 µs, the External test part of Core 2 and
the BIST part of Core 3 are released from Node 2, leaving 100
(300-200) of
Inputs: 1) core characteristics (SoC)
        2) Maximum power limit (Pmax)
        3) A power group set (Nk), initially empty
Output: All Nk sets (MSoC)

Finding MSoC(core characteristics, Pmax, Nk)
1  MSoC={};  // initialized once, before the first call
2  FOR each core i in SoC that is not in Nk DO
3    IF (Pmi<=Pmax) THEN
4      Include core i in a copy of Nk;
5      Finding MSoC(core characteristics, Pmax-Pmi, copy of Nk);
6    END IF
7  END FOR
8  IF (for each core i in SoC that is not in Nk,
     Pmi>Pmax) THEN
9    Add Nk to MSoC (if not already present);
10 END IF
   END
 
  
BIST part of Core 5. The BIST parts of Core 2 and Core 5 and the
External test part of Core 3 are tested concurrently in Node 3.
The BIST part of Core 5 is released from Node 3. In I1 (the next
node in the graph shown in Figure 5), the BIST part of Core 2
and the External test part of Core 3 are tested concurrently. Note
that nodes labeled by I refer to incomplete sets (Definition 3);
I1 is one because {2, 3} ∉ MSoC.
The External test part of Core 1 and the BIST part of Core 4 are
tested concurrently in Node 4, and the External test part of Core
1 is released from Node 4. Finally, the remaining tests of the cores
are finished by I2, I3 and I4. The total test time for this test
scheduling graph is:
$Time = \sum_{i \in N} T_i + \sum_{i} T_{I_i}$   (4.1)
where the first sum runs over the Nk nodes of the graph and the
second over its incomplete nodes.
[Figure 5: test scheduling graph for the SoC of Table 1. Nodes in order, with their durations: N1={1, 2}, T1=200; N2={2, 3, 5}, T2=200; N3={2, 3, 5}, T3=100; I1={2, 3}, TI1=400; Node 4 (Cores 1 and 4, as described in the text); I2={4, 5}, TI2=100; I3={4}, TI3=150; I4={4}, TI4=200. Each edge is labeled with the remaining T(vd) and T(vp) of every core, starting from the values of Table 1.]
Figure 5. An example test scheduling graph.
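The walk-through of Figure 5 can be replayed with a short Python sketch that subtracts each node's duration from the remaining test parts and sums the durations as in Equation 4.1. Two things here are assumptions rather than values stated in the text: the duration of Node 4 (300, derived from the 300 of T(vd) that Core 1 still has to apply there) and the BIST/External assignment inside I2, I3 and I4 (inferred from the remaining-time labels of the figure). All names are hypothetical.

```python
from typing import Dict, List, Tuple

# Remaining test parts per core, initialized from Table 1.
table1 = {
    1: {"vd": 300, "vp": 200},
    2: {"vd": 400, "vp": 500},
    3: {"vd": 500, "vp": 200},
    4: {"vd": 150, "vp": 600},
    5: {"vd": 100, "vp": 300},
}

# (node, duration, {core: part being applied}); "vd" = External, "vp" = BIST.
schedule: List[Tuple[str, int, Dict[int, str]]] = [
    ("N1", 200, {1: "vp", 2: "vd"}),
    ("N2", 200, {2: "vd", 3: "vp", 5: "vp"}),
    ("N3", 100, {2: "vp", 3: "vd", 5: "vp"}),
    ("I1", 400, {2: "vp", 3: "vd"}),
    ("N4", 300, {1: "vd", 4: "vp"}),  # duration derived, not stated in the text
    ("I2", 100, {4: "vp", 5: "vd"}),  # assignment inferred from the figure labels
    ("I3", 150, {4: "vd"}),
    ("I4", 200, {4: "vp"}),
]


def replay(initial, nodes):
    """Apply every node in order; return total time and what is left per core."""
    remaining = {core: dict(parts) for core, parts in initial.items()}
    total = 0
    for name, duration, active in nodes:
        # At most one External test part may use the bus at a time.
        if sum(1 for part in active.values() if part == "vd") > 1:
            raise ValueError(f"{name}: more than one External test part")
        for core, part in active.items():
            remaining[core][part] = max(0, remaining[core][part] - duration)
        total += duration
    return total, remaining


total_time, left = replay(table1, schedule)
print(total_time, left)
```

Under these assumptions every test part reaches zero and the node durations add up to 1650.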
According to the architecture proposed in Section III, at most
one External test part can be active at a time while the
functional bus is used for applying deterministic test patterns
(cores tested externally are underlined in Figure 5). Because of the
high time penalty of pausing External test parts (due to state
saving), we choose not to pause external tests while a core is being
tested through the TAM or functional bus. In the following section,
based on the proposed architecture and assumptions, we discuss
our algorithm for finding an optimal test scheduling graph. We use
this algorithm to reduce the TAT as much as possible.
V. ALGORITHM FOR TEST SCHEDULING 
Figure 6 shows the proposed test scheduling algorithm, which is
based on the test scheduling graph. The algorithm takes the core
characteristics and the peak power limit of an SoC. First, test
generation according to the algorithm of Figure 1 generates the
optimal DTPs and PRTPs for each core (Line 1). After that, MSoC is
generated according to the algorithm of Figure 4 (Line 2).
The algorithm sorts all Nk sets of MSoC by assigning a weight
to each of them. This weight helps decide which Nk set should be
selected first (Line 3 in Figure 6). Based on the weighted Nk sets,
i.e. Weighted_MSoC in Figure 6, the algorithm selects Nk sets
starting from the highest weight to build the test scheduling graph,
TSG. This continues until all BIST parts and deterministic parts
of all cores are covered (Line 4). After building the TSG, the
algorithm adds cores to incomplete nodes so that they are
tested by their BIST part, and generates new DTPs based
on the PRTPs added to their BIST part (Lines 5-10). For
example, consider again I1 in Figure 5. As mentioned, {2,
3} does not belong to MSoC, but {2, 3, 5} does. We can therefore
add a BIST part for Core 5 to I1 by generating more PRTPs
for Core 5. After adding extra PRTPs to Core 5, new DTPs should be
generated. If some PRTPs are added to the test of a core, the
number of deterministic test patterns that must be generated
decreases. After that, the algorithm updates the weights of the Nk
sets based on the newly generated DTPs and PRTPs. This process
continues until all incomplete nodes are exhausted.
Figure 6. Proposed algorithm for power constraint test scheduling. 
For giving a weight to the Nk sets (WNk), the following
heuristics are helpful:
Heuristic 1. To select Nk sets for a test scheduling graph,
sets of cores with longer External test parts are given a higher
priority. So the weight for an Nk set depends on the time of the
longest External test part among the cores of the Nk set,
$\max_{i \in N_k} T(vd_i)$.
Reasoning. Pausing an External test part incurs more time
overhead, because of the global ATE, than pausing a BIST part.
Heuristic 2. As mentioned, when selecting an Nk set, the set
containing the core with the longest External test part is
preferably selected first (Heuristic 1). In addition, the average of
the BIST parts of the other cores in this set should be the largest
among all available sets. So if we call $e_k$ the core
with the longest External test part in Nk, the weight for the Nk set
depends on the average of $T(vp_e)$ over all cores e such that
$e \in N_k$ and $e \neq e_k$.
Reasoning. Combining long BIST parts with a long External test
part in an Nk set can result in more concurrency. Therefore, the
average of the BIST parts in each Nk set should be high.
Based on the above heuristics we have: 
Inputs: 1) core characteristics (SoC)
        2) Maximum power limit (Pmax)
Output: Optimal test scheduling graph (TSG)

Test_Scheduling(SoC, Pmax)
1  core_characteristics = test_generation(SoC);
2  MSoC = Finding MSoC(core_characteristics, Pmax);
3  Weighted_MSoC = Assign a weight to each Nk set in
   MSoC;
4  TSG = scheduling_TSG(Weighted_MSoC);
5  WHILE (there is an incomplete node in TSG) DO
6    Add possible BIST part to each incomplete node;
7    Generate new DTPs based on added PRTPs;
8    Weighted_MSoC = Assign a weight to each Nk set in
     MSoC;
9    TSG = scheduling_TSG(Weighted_MSoC);
10 END WHILE
11 RETURN TSG;
   END
 
  
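$W_{N_k} \propto \max_{i \in N_k} T(vd_i)$   (5.1)
$W_{N_k} \propto \sum_{i \in N_k,\, i \neq e_k} \frac{T(vp_i)}{|N_k| - 1}$   (5.2)
$W_{N_k} = \max_{i \in N_k} T(vd_i) \times \sum_{i \in N_k,\, i \neq e_k} \frac{T(vp_i)}{|N_k| - 1}$   (5.3)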
The set with the highest weight is selected as the next node
of the test scheduling graph. Within that node, the core with the
highest value of T(vd) is selected to be tested externally.
After selecting an Nk, the weights of all remaining sets should be
reevaluated according to the test parts covered in the selected Nk.
If all External test parts of all cores are completed, or if it is
not allowed to pause the External test part of the core that is
being tested externally in the most recently selected Nk set, the
calculation of $W_{N_k}$ changes from Equation 5.3 to Equation 5.4,
because in this case no External test part can be scheduled.
$W_{N_k} = \sum_{i \in N_k} \frac{T(vp_i)}{|N_k|}$   (5.4)
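As an illustration, Equations 5.3 and 5.4 can be evaluated for the Nk sets of the Table 1 example with the Python sketch below. The function name, the `external_allowed` flag, and the choice of weight 0 for a single-core set (where Equation 5.3 would divide by zero) are assumptions made for this sketch only.

```python
from typing import Dict, Sequence


def nk_weight(group: Sequence[int], t_vd: Dict[int, int], t_vp: Dict[int, int],
              external_allowed: bool = True) -> float:
    """Weight of one Nk set: Eq. 5.3 by default, Eq. 5.4 without an External part."""
    if not external_allowed:
        # Eq. 5.4: plain average of the BIST parts of all cores in the set.
        return sum(t_vp[i] for i in group) / len(group)
    # e_k is the core with the longest External test part (Heuristic 1).
    e_k = max(group, key=lambda i: t_vd[i])
    others = [i for i in group if i != e_k]
    # Average BIST part of the other cores (Heuristic 2); 0 for a lone core.
    avg_bist = sum(t_vp[i] for i in others) / len(others) if others else 0.0
    return t_vd[e_k] * avg_bist


# T(vd) and T(vp) of Table 1 and the MSoC sets of Definition 2.
t_vd = {1: 300, 2: 400, 3: 500, 4: 150, 5: 100}
t_vp = {1: 200, 2: 500, 3: 200, 4: 600, 5: 300}
msoc_sets = [(1, 2), (1, 3, 5), (1, 4), (2, 3, 5), (3, 4, 5)]

weights = {nk: nk_weight(nk, t_vd, t_vp) for nk in msoc_sets}
for nk, w in sorted(weights.items(), key=lambda item: -item[1]):
    print(nk, w)
```

With these initial values the set {3, 4, 5} receives the highest weight; as described above, the weights have to be recomputed after each selection because the remaining T(vd) and T(vp) change.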
VI. EXPERIMENTAL RESULTS 
For the experimental results, we used the cores of MCDS1 from
[21], which is a version of d695 from the ITC02 benchmarks.
Comparing our results with available methods requires the
gate-level details of the cores. d695 is therefore
the best choice, because its cores are taken from the ISCAS
benchmarks. The characteristics of the SoC, i.e., power, frequency
in each domain, etc., are exactly the same as in MCDS1 [21]. The
ATALANTA test generator is used for determining DTPs for the
External test part of our hybrid BIST.
The optimal NPRTP and NDTP obtained by the proposed
algorithm of Figure 1 for several d695 cores are shown in Table
2. This information is used for initializing the SoC test
scheduling process. We used ATALANTA for generating DTPs,
an LFSR for generating PRTPs, and a parallel fault simulator.
The clock frequency for the External test part is 100MHz. PRTPs
are applied in one clock cycle. When a scan chain is used for
applying DTPs, the sum of the numbers of primary inputs and pseudo
primary inputs determines the number of clock cycles for applying
each DTP. Finally, the number of test cycles of our proposed method
(PMTC) is computed from Equation 6.1.
$PMTC = PMDV \times (PIs + PPIs) + opt\_PRTP$   (6.1)
In this equation, PMDV is the total number of selected
deterministic test vectors and opt_PRTP is the number of pseudo
random tests. PIs is the number of primary inputs and PPIs
is the number of pseudo primary inputs of the circuit.
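Equation 6.1 can be checked against a few rows of Table 2 with the Python sketch below. It assumes that the SB/SE column of Table 2 equals PIs + PPIs (one cycle per PRTP versus PIs + PPIs cycles per DTP at the same clock), that PMDV is the sum of the two NDTP phases, and that cycles are converted to time at the stated 100MHz. The function names are illustrative.

```python
def pm_test_cycles(pmdv: int, pis_plus_ppis: int, opt_prtp: int) -> int:
    """Equation 6.1: PMTC = PMDV x (PIs + PPIs) + opt_PRTP."""
    return pmdv * pis_plus_ppis + opt_prtp


def cycles_to_us(cycles: int, freq_mhz: float = 100.0) -> float:
    # At f MHz there are f cycles per microsecond.
    return cycles / freq_mhz


# (core, SB/SE read as PIs + PPIs, NDTP phase 1, NPRTP, NDTP phase 2, time in us)
table2_rows = [
    ("C6288", 32, 2, 42, 0, 1.06),
    ("C7552", 207, 10, 2482, 94, 240.1),
    ("S838", 67, 23, 357, 38, 44.44),
]

for core, width, ndtp1, nprtp, ndtp2, reported_us in table2_rows:
    cycles = pm_test_cycles(ndtp1 + ndtp2, width, nprtp)
    print(core, cycles, cycles_to_us(cycles), reported_us)
```

Under these assumptions the computed times agree with the test times reported in Table 2 for the three rows shown.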
Table 2. Optimal numbers of PRTPs and DTPs
Core     SB/SE  NDTP (phase 1)  NPRTP  NDTP (phase 2)  Test time (μs)
C6288       32               2     42               0            1.06
C7552      207              10   2482              94           240.1
S838        67              23    357              38           44.44
S5378      214              10   1953              73          197.15
S9234      247              58   3960             127          496.55
S13207     700              89   8934              48         1048.34
S15850     611             115    959             166          1726.5
S35932    1763               6    225               0          108.03
S38417    1664             162   9834             353         8667.94
S38584    1464              55   4573             180         3486.13
The number of sets in MSoC obtained for
some SoCs by the algorithm of Figure 4 is shown in
Table 3. We used the d695 benchmark, MCDS1 from [21], and
hCAD01 from [20] to show the number of sets and the CPU time of
the proposed algorithm.
Table 3. Number of sets of MSoC / CPU time
SoC         TAM        Pmax=1500  Pmax=2000  Pmax=3000
|Md695|     Unlimited  29/<1ms    54/2ms     126/8ms
            32pin      10/<1ms    8/1ms      8/1ms
            64pin      29/<1ms    43/2ms     29/3ms
            128pin     29/2ms     54/3ms     125/8ms
|MMCDS|     Unlimited  88/4ms     227/9ms    767/81ms
            32pin      17/2ms     13/1ms     13/2ms
            64pin      64/3ms     101/5ms    96/8ms
            128pin     88/5ms     227/11ms   669/60ms

SoC         TAM        Pmax=3000  Pmax=4500  Unlimited
|MhCADT01|  Unlimited  6/1ms      11/1ms     1/<1ms
            32pin      6/1ms      13/1ms     19/2ms
            64pin      6/1ms      11/1ms     1/<1ms
            128pin     6/1ms      11/1ms     1/<1ms
The TAT results for the proposed method are shown in
Table 4 and Table 5, using a fixed clock for the ATE and
different peak power and TAM width limitations. For the
comparison, there was a mismatch between the number of
test patterns generated by ATALANTA for full coverage and the
number of test patterns reported in the ITC02 benchmarks for each
core. For example, for core "c6288" (the first core of
"d695") ATALANTA generates 33 DTPs for full coverage,
while 12 TPs are reported in the ITC02 benchmark. So, we
implemented the method proposed in [21] with the new number
of test patterns generated by ATALANTA for full coverage.
The results are categorized by peak power
limitation, TAM size, and fixed or flexible
number of scan chains for each core. The fixed numbers of scan
chains were obtained from those reported in the ITC02
benchmark. In the flexible case, no limitation
is placed on the number of scan chains for each
core. As shown, a considerable TAT reduction is obtained by the
proposed method in comparison with [21] and [22].
VII. CONCLUSIONS AND FUTURE WORKS
In this paper, a concurrent hybrid BIST method for
reducing SoC test time is proposed. The most important
constraint is the power limitation of the chip. An algorithm to
find the most suitable cores to be tested concurrently is
proposed. During the test process, Deterministic Test
Patterns and Pseudo Random Test Patterns can be applied together
to reduce the test time. Experimental results show that this
method provides a considerable reduction in test application
time compared to previous methods for power-constrained
testing. As future work, this process can be applied with a
cycle-accurate power model to further improve the TAT.
VIII. REFERENCES 
[1] Y. Huang, W. T. Cheng, C. C. Tsai, N. Mukherjee, O. Samman, Y.
Zaidan, S. M. Reddy, “Resource Allocation and Test Scheduling for
Concurrent Test of Core-based SOC Design”, IEEE Asian Test
Symposium (ATS), pp. 265-270, 2001.
[2] E. Larsson, and Z. Peng, “An Integrated Framework for the Design and 
Optimization of SOC Test Solutions”, Journal of Electronic Testing; 
Theory and Applications (JETTA), Vol. 18, No. 4/5, pp. 385-400, 2002.  
[3] V. Iyengar, K. Chakrabarty, and E. J. Marinissen, “Test Access 
Mechanism Optimization, Test Scheduling, and Test Data Volume 
Reduction for System-on-Chip”, IEEE Transactions on Computer, Vol. 
52, No. 12, Dec. 2003.  
[4] E. Larsson, K. Arvidsson, H. Fujiwara, and Z. Peng, “Efficient test 
solutions for core-based designs”, IEEE Transactions on CAD, Vol. 23, 
No.5, pp.758-775, 2004. 
[5] T. Yoneda, M. Imanishi, and H. Fujiwara, “Interactive presentation: an
SoC test scheduling algorithm using reconfigurable union wrappers”, Proc.
Design, Automation and Test in Europe, 2007, pp. 231-236.
[6] X. Chuan-pei, D. Kui, “The Optimization of Hierarchical SOC Test
Architecture to Reduce Test Time”, International Conference on
Electronic Packaging Technology & High Density Packaging
(ICEPT-HDP 2008), pp. 1-4, 2008.
[7] J. Shao, G. Ma, Z. Yang, R. Zhang, “Process Algebra Based SoC Test
Scheduling for Test Time Minimization”, IEEE Computer Society Annual
Symposium on VLSI (ISVLSI), pp. 134-138, 2008.
[8] B. Pouya and A. Crouch,“Optimization trade-offs for vector volume and 
test power”, International Test Conference (ITC), pp. 873-881, 2000. 
[9] Z. He, Z. Peng, P. Eles, P. Rosinger, B. M. Al-Hashimi, “Thermal-Aware
SoC Test Scheduling with Test Set Partitioning and Interleaving”, Journal
of Electronic Testing, pp. 477-485, 21st IEEE International
Symposium on Defect and Fault-Tolerance in VLSI Systems (DFT'06),
2006.
[10] Z. He, G. Jervan, Z. Peng, P. Eles, “Power-Constrained Hybrid BIST Test
Scheduling in an Abort-on-First-Fail Test Environment”, 8th
Euromicro Conference on Digital System Design (DSD'05), pp. 83-87,
2005.
[11] C. Giri, S. Sarkar, and S. Chattopadhyay, “Test Scheduling for Core-
Based SOCs Using Genetic Algorithm Based Heuristic Approach”, 
Lecture Notes in Computer Science, pp.1032-1041, 2007. 
[12] D. Zhao, S. Upadhyaya, “Dynamically partitioned test scheduling with
adaptive TAM configuration for power-constrained SoC testing”, IEEE
Transactions on Computer-Aided Design of Integrated Circuits and
Systems, pp. 956-965, 2005.
[13] V. Janfaza, B. Forouzandeh, P. Behnam and M. Najafi, "Hybrid history-
based test overlapping to reduce test application time," East-West Design 
& Test Symposium (EWDTS 2013), Rostov-on-Don, 2013. 
[14] J. Pouget, E. Larsson, and Z. Peng, “Multiple-Constraint Driven System-
on-Chip Test Time Optimization,” Journal of Electronic Testing: Theory 
and Applications, Vol. 21, pp. 599-611, 2005. 
[15] Y. Huang, S. M. Reddy, W. Cheng, P. Reuter, N. Mukherjee, C. Tsai, O. 
Samman, and Y. Zaidan, “Optimal Core Wrapper Width Selection and 
SOC Test Scheduling Based on 3-D Bin Packing Algorithm”, In Proc. 
International Test Conference, pp. 74-82, 2002. 
[16] F. A. Hussin, T. Yoneda, A. Orailoglu, and H. Fujiwara, “power-
constrained SoC test schedules through utilization of functional buses”, In 
Proc. IEEE International Conference on Computer Design, pp. 230-236, 
2006. 
[17] M. Sugihara, H. Date, H. Yasuura, “Analysis and Minimization of Test 
Time in a Combined BIST and External Test Approach”, Design, 
Automation & Test In Europe Conference (DATE 2000), pp. 134-140, 
2000. 
[18] M.-S. Hsiao, E. M. Rudnick and J. H. Patel, “Effects of Delay Models
on Peak Power Estimation of VLSI Sequential Circuits”, International
Conference on CAD, pp. 45-51, 1997.
[19] R.M. Chou, K.K. Saluja, and V.D. Agrawal, “Scheduling Tests for VLSI 
Systems Under Power Constraints”, IEEE Transactions on VLSI Systems, 
vol.5, no. 2, pp. 175–185, 1997. 
[20] F. Azmadi Hussin, T. Yoneda, A. Orailoglu, H. Fujiwara, “Core-Based
Testing of Multiprocessor System on-Chips Utilizing Hierarchical
Functional Buses”, ASP-DAC, pp. 720-725, 2007.
[21] T. Yoneda, K. Masuda, H. Fujiwara, "Power-Constrained Test Scheduling 
for Multi-Clock Domain SoCs", Proceedings of the Design Automation 
& Test in Europe Conference, Vol. 1, pp.68,  2006. 
[22] M. H. Haghbayan, S. Safari, Z. Navabi, "Power Constraint Testing for 
Multi-Clock Domain SoCs Using Concurrent Hybrid BIST", Accepted in 
IEEE Symposium on Design and Diagnostics of Electronic Circuits and 
Systems (DDECS12), Tallinn, Estonia, 2012. 
[23] Sadredini, Elaheh, Mohammadreza Najafi, Mahmood Fathy, and 
Zainalabedin Navabi. "BILBO-friendly hybrid BIST architecture with 
asymmetric polynomial reseeding." In Computer Architecture and Digital 
Systems (CADS), 2012 16th CSI International Symposium on, pp. 145-
149. IEEE, 2012. 
[24] Elahe Sadredini, Reza Rahimi, Paniz Froutan, Mahmood Fathy, and 
Zainalabedin Navabi. "An Improved Scheme for Pre-computed Patterns 
in Core-based SoC Architecture." In Design & Test Symposium 
(EWDTS), IEEE, Armenia, 2016. 
[25] V. Janfaza, P. Behnam, B. Forouzandeh, B. Alizadeh, “A Low-power 
Enhanced Bitmask-dictionary Scheme for Test Data Compression,” 12’th 
IEEE Symposium on Very Large Scale Integration (ISVLSI), USA, 2014. 
[26] Alif Ahmed, Farimah Farahmandi and Prabhat Mishra, “Directed Test 
Generation using Concolic Testing of RTL Models”,  Design Automation 
and Test in Europe (DATE), pages -, Dresden, Germany, March 19 - 23, 
2018. 
[27] Farahmandi, F., Mishra, P., & Ray, S.  “Exploiting transaction level 
models for observability-aware post-silicon test generation”, In Design, 
Automation & Test in Europe Conference & Exhibition (DATE), 2016. 
 
 
 
Table 4. Test application time of the Proposed Method (PM, in µs) for a fixed number of scan chains in comparison with [21] and [22] for multi-clock domain SoC MCDS1. PM CPU is the CPU time of the proposed method.
                TAM width: 32 pin                        TAM width: 64 pin                        TAM width: 128 pin
FATE    PMAX    TAT [21]  TAT [22]  PM TAT    PM CPU     TAT [21]  TAT [22]  PM TAT    PM CPU     TAT [21]  TAT [22]  PM TAT    PM CPU
200MHz  1500    2500.09   624.29    474.4604  482        2037.076  621.92    472.65    510        2037.076  620.34    473.4     415
        2000    1886.68   622.5     473.1     569        1615.59   620.125   471.29    552        1615.59   617.28    470.1     521
        2500    1879.49   621.23    472.1348  727        1358.592  618.86    470.33    724        1358.592  614.73    467.1     407
100MHz  1500    -         -         -         -          2500.09   1243.84   945.31    747        2037.076  1240.68   942.9     509
        2000    -         -         -         -          1886.682  1240.2    942.55    498        1615.59   1234.56   938.26    509
        2500    -         -         -         -          1879.49   1237.2    940.27    391        1359.172  1229.46   937.36    353
50MHz   1500    -         -         -         -          -         -         -         -          2500.09   1740.5    1422.7    516
        2000    -         -         -         -          -         -         -         -          1886.682  1734.5    1320      478
        2500    -         -         -         -          -         -         -         -          1879.49   1729      1314      521
 
Table 5. Test application time of the Proposed Method (PM, in µs) for a flexible number of scan chains in comparison with [22] for multi-clock domain SoC MCDS1. PM CPU is the CPU time of the proposed method.
                TAM width: 32 pin              TAM width: 64 pin              TAM width: 128 pin
FATE    PMAX    TAT [22]  PM TAT    PM CPU     TAT [22]  PM TAT    PM CPU     TAT [22]  PM TAT    PM CPU
200MHz  1500    453       345.28    398        231       185.56    392        120       95.2      394
        2000    453       345.28    435        231       185.56    304        120       95.2      282
        2500    453       345.28    392        231       185.56    597        120       95.2      599
100MHz  1500    907.12    689.41    361        463       341.88    396        241       183       390
        2000    907.12    689.41    326        463       341.88    393        241       183       392
        2500    907.12    689.41    491        463       341.88    264        241       183       398
50MHz   1500    1814      1388.64   553        927       714.52    704        482       376       384
        2000    1814      1388.64   394        927       714.52    396        482       376       672
        2500    1814      1388.64   393        927       714.52    499        482       376       385
 
