Introduction
Today's microelectronics technology allows designers to integrate a large number of different functional blocks, usually referred to as cores, in a single IC. Such a design style allows designers to reuse previous designs and therefore leads to shorter time-to-market and reduced cost. Such a system-on-chip (SoC) approach is very attractive from the designers' perspective. Testing of such systems, on the other hand, is a problematic and time-consuming task, mainly due to the resulting IC's complexity and the high integration density [1].
To test the individual cores in such systems, the test pattern source and sink have to be available together with an appropriate test access mechanism (TAM) [2]. Traditional approaches implement both test source and sink off-chip and therefore require the use of external Automatic Test Equipment (ATE). As the requirements for ATE speed and memory size are continuously increasing, the ATE solution may be unacceptably expensive and inaccurate. Therefore, in order to apply at-speed tests and to keep the test costs under control, on-chip test solutions are becoming a mainstream technology for testing such complex systems. Such a solution is usually referred to as built-in self-test (BIST).
Different test scenarios are possible when using BIST. Sometimes the embedded cores may be tested using only internally generated pseudorandom test patterns. For several reasons, such as very long test sequences and random-pattern-resistant faults, this approach may not always be efficient. One solution to this problem is to complement pseudorandom test patterns with deterministic test patterns, applied from the on-chip memory or, in special situations, from the ATE. This approach is usually referred to as hybrid BIST [3].
One of the important parameters influencing the efficiency of a hybrid BIST approach is the ratio of pseudorandom to deterministic test patterns in the final test set. As the amount of resources on the chip is limited, the final test set has to be designed in such a way that the deterministic patterns fit into the on-chip memory. At the same time the testing time must be minimized in order to reduce testing cost and time-to-market.
There exists extensive work on testing core-based systems. The main emphasis has so far been on test scheduling, TAM design and testability analysis. Earlier test scheduling work has had the objective of determining start times for each test such that the total test application time is minimized. This assumes a fixed set of tests and test resources together with a test access architecture. Some approaches can also take into account test conflicts and different constraints, e.g. power [4]-[11]. However, there has not been any work on finding the optimal test sets for testing every individual core in such a manner that the total system test time is minimized and the different ATE constraints are satisfied. Sugihara et al. [8] have addressed the problem of selecting a test set for each core from a set of pre-determined test sets provided by the core vendor and scheduling these tests in order to minimize the testing time. Although this approach can find the best possible selection of tests from a given set, it does not provide a mechanism for finding the test set in the first place.
This paper deals with the problem of core-based system testing where a hybrid BIST approach is used. Our earlier work [3], [12], [13] concentrated on test cost calculation and hybrid BIST optimization for single-core designs. In this paper we propose a methodology for test time minimization, under memory constraints, for multi-core systems. We propose an algorithm for calculating the best combination of pseudorandom and deterministic tests, where the memory constraints are not violated, the total test time is minimized, and the maximum achievable fault coverage is guaranteed.
Relations between different cost components of the test sets, as functions of the hybrid BIST structure, are introduced to find the optimal solution. To avoid exhaustive search, a method for estimating the cost of the deterministic component in the hybrid test set is introduced. Finally, based on our estimation methodology, we have developed an iterative algorithm to minimize the total length of the hybrid BIST solution under given memory constraints.
The rest of the paper is organized as follows. In Section 2 a hybrid BIST architecture is described and a general problem description is given. Section 3 is devoted to basic definitions, cost functions and problem formulation. Section 4 describes our test cost estimation methodology, and the algorithm for test length minimization, based on our estimates, is presented in Section 5. Finally, the experimental results are presented in Section 6, and Section 7 concludes the paper together with directions for future work.
Hybrid BIST Architecture
Recently we have proposed a hybrid BIST optimization methodology for single-core designs [3]. Such a hybrid BIST approach starts with a pseudorandom test sequence of length L. At the next stage, the stored test approach takes place: precomputed deterministic test patterns are applied to the core under test (CUT) to reach the desired fault coverage. For off-line generation of the deterministic test patterns, arbitrary software test generators may be used, based on deterministic, random or genetic algorithms.
In a hybrid BIST technique the length of the pseudorandom test is an important parameter that determines the behavior of the whole test process. It is assumed here that for the hybrid BIST the best polynomial for the pseudorandom sequence generation will be chosen. By using the best polynomial, we can achieve the maximal fault coverage of the CUT. In most cases this means that we can achieve 100% fault coverage if we run the pseudorandom test long enough. With the hybrid BIST approach we terminate the pseudorandom test in the middle and remove the latter part of the pseudorandom sequence, which leads to lower fault coverage achievable by the pseudorandom test. The loss of fault coverage should be compensated by additional deterministic test patterns. In general a shorter pseudorandom test set implies a larger deterministic test set. This requires additional memory space but, at the same time, shortens the overall test process, since deterministic test vectors are more efficient in covering faults than pseudorandom ones. A longer pseudorandom test, on the other hand, will lead to longer test application time with reduced memory requirements. Therefore it is crucial to determine the optimal length L_OPT of the pseudorandom test sequence in order to minimize the total testing cost. Our previously proposed methodology enables us to find the most cost-effective combination of the two test sets not only in terms of test time but also in terms of tester/on-chip memory requirements. The efficiency of such an approach has so far been demonstrated for individual cores. In this paper we propose an approach that extends our methodology to complex systems containing more than one core. We take into account the constraints (memory size) imposed by the system and minimize the testing time for the whole system with multiple cores, while keeping the fault coverage high.
In this paper we assume the following test architecture: every core has its own dedicated BIST logic that is capable of producing a set of independent pseudorandom test patterns, i.e. the pseudorandom tests for all the cores can be carried out simultaneously. The deterministic tests, on the other hand, can only be carried out for one core at a time, which means that only one test access bus at the system level is needed. An example of a multi-core system with such a test architecture is given in Figure 1. This example system consists of 5 cores (different ISCAS benchmarks). Using our hybrid BIST optimization methodology for a single core [3] we can find the optimal combination of pseudorandom and deterministic test patterns for every individual core (Figure 2). Considering the assumed test architecture, only one deterministic test set can be applied at any given time, while any number of pseudorandom test sessions can take place in parallel. To enforce the assumption that only one deterministic test can be applied at a time, a simple ad-hoc scheduling can be used. The result of this scheduling defines the starting moments for every deterministic test session, the memory requirements, and the total test length t for the whole system. This situation is illustrated in Figure 2.
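The ad-hoc scheduling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scheduler used in the experiments: the function name `adhoc_schedule` and the tuple layout are hypothetical, all pseudorandom sessions are assumed to start at time 0, each deterministic session is assumed to follow its core's pseudorandom session, and cores are served on the shared bus in the order given.

```python
from typing import List, Tuple


def adhoc_schedule(cores: List[Tuple[int, int]]) -> Tuple[List[int], int]:
    """Schedule deterministic tests one at a time on a shared test bus.

    cores[k] = (pseudorandom_length, deterministic_length), in clock cycles.
    Returns the start time of each deterministic session and the total
    test length of the system.
    """
    bus_free_at = 0          # time at which the shared bus becomes available
    starts: List[int] = []
    total = 0
    for pr_len, det_len in cores:
        # A deterministic session needs its own pseudorandom part to be
        # finished AND the shared bus to be free.
        start = max(pr_len, bus_free_at)
        starts.append(start)
        bus_free_at = start + det_len
        total = max(total, bus_free_at)
    return starts, total


starts, total = adhoc_schedule([(100, 20), (50, 30), (120, 10)])
print(starts, total)
```

In this made-up example the second core finishes its pseudorandom test at cycle 50 but must wait until cycle 120 for the bus, which is the kind of idle period the text refers to.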
As can be seen from Figure 2, the solution where every individual core has the best possible combination of pseudorandom and deterministic patterns usually does not lead to the best system-level test solution. In the example we have illustrated three potential problems:
The total test length of the system is determined by the single longest individual test set, while other tests may be substantially shorter;
The resulting deterministic test sets do not take into account the memory requirements imposed by the size of the on-chip memory or the external test equipment;
The proposed test schedule may introduce idle periods, due to the test conflicts between the deterministic tests of different cores.
There are several possibilities for improvement. For example the ad-hoc solution can easily be improved by using a better scheduling strategy. This, however, does not necessarily lead to a significantly better solution as the ratio between pseudorandom and deterministic test patterns for every individual core is not changed. Therefore we have to explore different combinations between pseudorandom and deterministic test patterns for every individual core in order to find a solution where the total test length of the system is minimized and memory constraints are satisfied. In the following sections we will define this problem more precisely, and propose a fast iterative algorithm for calculating the optimal combination between different test sets for the whole system.
Basic Definitions and Problem Formulation
Let us assume that a system S consists of n cores C_1, C_2, …, C_n. For every core C_k ∈ S a complete sequence of deterministic test patterns TD^F_k and a complete sequence of pseudorandom test patterns TP^F_k will be generated. It is assumed that each of these test sets can by itself obtain the maximum achievable fault coverage F_max.
and by the cost of resources needed for storing the deterministic test sequence TD_k in the memory:
The two weighting parameters of the test length cost (one for the pseudorandom patterns and one, indexed by k, for the deterministic patterns of core C_k) can be introduced by the designer to align the application times of different test sequences. For example, when a test-per-clock BIST scheme is used, a new test pattern can be generated and applied in each clock cycle, and in this case the pseudorandom weight equals 1. The deterministic weight for a particular core C_k is equal to the total number of clock cycles needed for applying a deterministic test pattern from the memory. In a special case, when deterministic test patterns are applied by external test equipment, application of deterministic test patterns may be up to one order of magnitude slower than applying BIST patterns. A further per-core coefficient is used to map the number of test patterns in the deterministic test sequence TD_k into memory resources, measured in bits.
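To make the role of these parameters concrete, the following Python sketch computes a test length cost and a memory cost for one core. It rests on an assumption that is not spelled out in the surviving text: that the test length cost is a weighted sum of the numbers of pseudorandom and deterministic patterns, and that the memory cost is proportional to the number of deterministic patterns. All names (`CoreCostParams`, `time_cost`, `memory_cost`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CoreCostParams:
    """Per-core weights (hypothetical names, assumed linear cost model)."""
    pr_cycles_per_pattern: int    # 1 for a test-per-clock BIST scheme
    det_cycles_per_pattern: int   # clock cycles to apply one stored pattern
    bits_per_det_pattern: int     # memory bits needed per stored pattern


def time_cost(p: CoreCostParams, n_pr: int, n_det: int) -> int:
    """Test length in clock cycles for n_pr pseudorandom and n_det
    deterministic patterns, assuming a weighted sum."""
    return p.pr_cycles_per_pattern * n_pr + p.det_cycles_per_pattern * n_det


def memory_cost(p: CoreCostParams, n_det: int) -> int:
    """Memory, in bits, needed to store n_det deterministic patterns."""
    return p.bits_per_det_pattern * n_det


params = CoreCostParams(pr_cycles_per_pattern=1,
                        det_cycles_per_pattern=4,
                        bits_per_det_pattern=32)
print(time_cost(params, n_pr=100, n_det=10), memory_cost(params, n_det=10))
```

The numbers are invented for illustration; with a slower external tester the deterministic weight would simply be larger.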
Definition 4: When assuming the test architecture described above, a hybrid test set TH = {TH_1, TH_2, …, TH_n} for a system S = {C_1, C_2, …, C_n} consists of hybrid tests TH_k for each individual core C_k, where the pseudorandom components of TH can be scheduled in parallel, whereas the deterministic components of TH must be scheduled in sequence due to the shared test resources.
According to Definition 2, to each j_k there corresponds a pseudorandom subsequence TP_k(j_k) ⊆ TP^F_k, and according to Definition 1, any pseudorandom test sequence TP_k(j_k) should be complemented with a deterministic test sequence, denoted TD_k(j_k), that is generated in order to achieve the maximum achievable fault coverage. Based on this we can conclude that the characteristic vector J entirely determines the structure of the hybrid test set TH_k for all cores C_k ∈ S.
Definition 6: The test length of a hybrid test TH = {TH_1, TH_2, …, TH_n} for a system S = {C_1, C_2, …, C_n} is given by:
The total cost of resources needed for storing the patterns from all deterministic test sequences TD_k in the memory is given by:
where the memory costs are directly related to the lengths of all possible hybrid test solutions.
The integrated generic cost function COST_M = f(COST_T) for the whole system is the sum of all cost functions
From the function COST_M = f(COST_T) the value of COST_T for every given value of COST_M can be found. The value of COST_T determines the lower bound of the length of the hybrid test set for the whole system. To find the component j_k of the characteristic vector J, i.e. to find the structure of the hybrid test set for all cores, the equation f_{T,k}(j) = COST_T should be solved.
The objective of this paper is to find the shortest possible (min(COST_T)) hybrid test sequence TH_opt such that the memory constraint COST_M ≤ COST_{M,LIMIT} is not violated.
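The lookup on the integrated cost function can be illustrated with a small Python sketch. It assumes, for illustration only, that each core's memory cost is tabulated as a list indexed by test length in clock cycles, that all lists have the same length, and that memory cost does not increase as the test gets longer. The function names are hypothetical.

```python
from typing import List, Optional, Sequence


def system_memory_curve(core_curves: Sequence[Sequence[int]]) -> List[int]:
    """Sum the per-core curves point by point.

    core_curves[k][t] = memory (bits) core k needs when the test length is t.
    """
    return [sum(values) for values in zip(*core_curves)]


def shortest_length_within_limit(core_curves: Sequence[Sequence[int]],
                                 mem_limit: int) -> Optional[int]:
    """Smallest test length whose summed memory cost fits into mem_limit,
    or None if no tabulated length satisfies the constraint."""
    for t, mem in enumerate(system_memory_curve(core_curves)):
        if mem <= mem_limit:
            return t
    return None


curves = [[300, 200, 100, 0],
          [500, 400, 150, 50]]
print(system_memory_curve(curves))
print(shortest_length_within_limit(curves, mem_limit=300))
```

A linear scan is used here for clarity; since the summed curve is assumed to be non-increasing, a binary search would work equally well for long tables.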
Hybrid Test Sequence Computation Based on Cost Estimates
By knowing the generic cost function COST_M = f(COST_T), the total test length COST_T at any given memory constraint COST_M ≤ COST_{M,LIMIT} can be found in a straightforward way. However, the procedure to calculate the cost functions COST_{D,k}(j) and COST_{M,k}(j) is very time-consuming, since it assumes that the deterministic test set TD_k for each j = 1, 2, …, |TPE^F_k| has to be available. This in turn requires that after every efficient pattern P_j ∈ TPE_k ⊆ TP_k, j = 1, 2, …, |TPE^F_k|, the set of not yet detected faults F_{NOT,k}(j) is calculated. This can be done either by repetitive use of the automatic test pattern generator or by systematically analyzing and compressing the fault tables for each j [13]. Both procedures are accurate but time-consuming and therefore not feasible for larger designs. To overcome the complexity explosion problem we propose an iterative algorithm, where the costs COST_{M,k} and COST_{D,k} for the deterministic test sets TD_k can be found based on estimates. The estimation method is based on fault coverage figures and does not require accurate calculations of the deterministic test sets for the not yet detected faults F_{NOT,k}(j).
In the following we will use FD_k(i) and FPE_k(i) to denote the fault coverage figures of the test sequences TD_k(i) and TPE_k(i), respectively, where i is the length of the test sequence.
… Figure 3).
3. By solving the equation FD_k(i) = F*, find the maximum integer value j* that satisfies the condition FD_k(j*) ≤ F*. The value of j* is the length of the deterministic sequence TD_k that can achieve the same fault coverage F*.
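The search for j* can be sketched as a lookup on a tabulated fault coverage curve. The sketch below assumes that the coverage of the deterministic sequence is available as a non-decreasing list `fd`, where `fd[i-1]` is the coverage after the first i patterns; the function name and the sample figures are made up for illustration.

```python
from bisect import bisect_right
from typing import Sequence


def max_length_within_coverage(fd: Sequence[float], f_star: float) -> int:
    """Largest j such that the coverage after j deterministic patterns
    does not exceed f_star; returns 0 if even one pattern exceeds it.

    fd must be non-decreasing, with fd[i-1] = coverage after i patterns.
    """
    # bisect_right counts the entries that are <= f_star, which is
    # exactly the largest pattern count whose coverage stays within f_star.
    return bisect_right(fd, f_star)


fd = [30.0, 55.0, 70.0, 82.0, 90.0, 95.0, 100.0]   # invented coverage figures (%)
print(max_length_within_coverage(fd, 82.0))
```

Because fault coverage curves often contain plateaus, `bisect_right` is used rather than `bisect_left` so that the maximum qualifying length is returned when several lengths share the same coverage.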
Test Length Minimization Under Memory Constraints
As described above, the exact calculations for finding the cost of the deterministic test set COST_{M,k} = f_k(COST_{T,k}) are very time-consuming. Therefore we will use the cost estimates, calculated by Procedure 1 in Section 4, instead. Using estimates can give us a quasi-minimal solution for the test length of the hybrid test at given memory constraints. After obtaining a quasi-minimal solution, the cost estimates can be improved and another, better, quasi-minimal solution can be calculated. This iterative procedure will be continued until we reach the final solution.
Figure 3. Estimation of the length of the deterministic test sequence
… M is calculated (Step 3, point 1* in Figure 5). As we see in Figure 5, the value of COST^{E*}_M in point 1* violates the memory constraints. The difference t_1 is determined by the curve of the estimated cost (Step 6). After correction, a new value of COST^{E*}_T is found (point 2 in Figure 5). Based on COST^{E*}_T, a new J* is found (Step 2), and a new COST^{E*}_M is calculated (Step 3, point 2* in Figure 5). An additional iteration via points 3 and 3* can be followed in Figure 5.
It is easy to see that Procedure 2 always converges. By each iteration we get closer to the memory constraints level, and also closer to the minimal test length at given constraints. However, the solution may be only near-optimal, since we only evaluate solutions derived from estimated cost functions. 
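The iteration described above can be illustrated with a simplified Python sketch. It is one possible reading of the correction loop, not a transcription of Procedure 2: it assumes a tabulated, non-increasing estimated memory curve `est_mem` indexed by test length, a callable `exact_mem` that returns the (expensive) real memory cost at a given test length, and a correction rule that lowers the target on the estimated curve by the observed overshoot. It only corrects solutions that violate the limit; all names are hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple


def first_fit(est_mem: Sequence[int], target: int) -> Optional[int]:
    """Smallest test length whose estimated memory cost is <= target."""
    return next((t for t, m in enumerate(est_mem) if m <= target), None)


def minimize_test_length(est_mem: Sequence[int],
                         exact_mem: Callable[[int], int],
                         mem_limit: int,
                         max_iter: int = 50) -> Tuple[int, int]:
    """Return (test_length, exact_memory) with exact_memory <= mem_limit."""
    target = mem_limit
    t = first_fit(est_mem, target)
    for _ in range(max_iter):
        if t is None:
            raise ValueError("memory limit cannot be met in the tabulated range")
        exact = exact_mem(t)              # real cost at the estimated point
        if exact <= mem_limit:
            return t, exact
        # Assumed correction: aim lower on the estimated curve by the overshoot.
        target -= exact - mem_limit
        t_next = first_fit(est_mem, target)
        if t_next is not None and t_next <= t:
            # Force progress if the estimated curve is flat at this point.
            t_next = t + 1 if t + 1 < len(est_mem) else None
        t = t_next
    raise RuntimeError("no feasible solution found within max_iter iterations")


est = [1000 - 10 * t for t in range(101)]          # invented estimated curve
result = minimize_test_length(est, lambda t: est[t] + 50, mem_limit=500)
print(result)
```

In this invented example the estimate is optimistic by 50 bits everywhere, so the first candidate violates the limit and a single correction step reaches a feasible test length.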
Experimental Results
We have performed experiments with several systems composed of different ISCAS benchmarks as cores. The results are presented in Table 1.
In Table 1 we compare our approach, where the test length is found based on estimates, with an exact approach where deterministic test sets have been found by manipulating the fault tables for every possible switching point between pseudorandom and deterministic test patterns. As can be seen from the results, our approach can give a significant speedup (more than an order of magnitude) while retaining acceptable accuracy (the biggest deviation from the exact solution is less than 9%, and the average is 2.4%).
Figure 4. Cost curves for a given core C_k
In Figure 6 we present the estimated cost curves for the individual cores and the estimated and real cost curves for the system S2. We also show in this figure a test solution point for this system, under a given memory constraint, that has been found by our algorithm. In this example we have used a memory constraint M_LIMIT = 5500 bits. The final test length for this memory constraint is 542 clock cycles, which gives us the test schedule depicted in Figure 7.
Conclusions
We have presented an approach to the test time minimization problem for multi-core systems that are tested with a hybrid BIST strategy. A heuristic algorithm was proposed to minimize the test length for a given memory constraint. The algorithm is based on the analysis of different cost relationships as functions of the hybrid BIST structure. To avoid the exhaustive exploration of solutions, a method for the cost estimation of the deterministic component of the hybrid test set was proposed. We have also proposed an iterative algorithm, based on the proposed estimates, to minimize the total test length of the hybrid BIST solution under the given memory constraints. Experimental results show very high speed of the algorithm, compared to the exact calculation method.
As future work we would like to investigate possibilities for applying the same approach to sequential cores with full scan (STUMPS architecture) and partial scan. Additionally, we would like to investigate more complex test architectures and include power constraints in the test time minimization algorithm.
