Dependability of systems has become one of the most important engineering concerns. With new technologies, new testing paradigms are emerging. One of them is to make systems self-testing, and the quality of the self-test is the key issue. In this paper, a new methodology of functional Built-In Self-Test is proposed, which stresses testing in dynamics to achieve the highest fault coverage. The main novelty of the proposed approach is in using the inherent functionality of systems for testing them at-speed in normal working conditions. The proposed self-test includes on-chip test application and response collection by using the native instructions of the processor under test. A hierarchical divide-and-conquer approach is applied. At the component level, tests target structural faults in components, whereas at the processor level, the functionality of the processor is used to apply structural tests to each component at-speed. Unlike similar approaches, the sequences of component test patterns need not be stored in the chip under test; they are generated on-line by the resources of the system. A framework for the synthesis of self-test stimuli data is proposed.
Introduction
We have entered the era of pervasive computing. PCs have been dethroned by technology that embeds computers in almost everything. 98% of computers in the world are embedded. Massive amounts of electronics and software are controlling our everyday life. We do not even notice our dependence on the digital world, except when something goes wrong in it. Errors and faults in computing hardware may lead to severe consequences threatening our lives, ecology or expensive technology. We need dependability from the embedded computing world around us.
Silicon technology is continuously scaling. Integrated circuits are implemented with smaller and smaller transistors which operate at higher clock frequencies and run at lower voltage levels. According to the International Technology Roadmap for Semiconductors, the industry should have the 22-nm technology by 2016 [1].
Increases in circuit complexity and process shrinks in turn increase the susceptibility of chips to various types of faults, especially delay faults, requiring more accurate tests than before. Testing digital systems in dynamics by so-called at-speed testing has become a must. However, as the speed of microprocessors approaches the GHz range, at-speed testing is becoming increasingly difficult with traditional external test equipment.
A lot of research has been carried out to relieve the burden on external testers by introducing test-dedicated circuits which can be implemented in chips as Built-In Self-Test (BIST) solutions [2]. In this approach the typical functions of external testers, such as test pattern generation and response analysis, are carried out on-chip, so that the tester does not have to handle high-speed signals externally and its role is reduced to sending the test enable signals to the chip under test and receiving the pass/fail signals. For example, scan-based and BIST solutions such as [3] considerably relax the requirements on testers and reduce the overall test cost.
State-of-the-art
In traditional BIST architectures, test pattern generation is mostly performed by ad hoc circuitry, typically Linear Feedback Shift Registers (LFSR) [2], cellular automata [4] or multifunctional registers like BILBO (Built-in Logic Block Observer) [5]. BIST involves on-chip hardware to apply pseudorandom test patterns to the circuit under test and to analyze its output response. The most widespread approach is the test-per-scan BIST scheme [5]. Unfortunately, many circuits contain random-pattern-resistant faults which limit the fault coverage that can be achieved with this approach.
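As a point of reference for the pseudorandom pattern generation mentioned above, the following is a minimal software sketch of a 4-bit Fibonacci LFSR. The register width, the feedback polynomial x^4 + x^3 + 1 and the function name `lfsr_patterns` are illustrative assumptions, not taken from the cited works.

```python
def lfsr_patterns(seed=0b0001, count=15):
    """Yield states of a 4-bit Fibonacci LFSR (assumed polynomial x^4 + x^3 + 1)."""
    state = seed & 0xF
    for _ in range(count):
        yield state
        # Feedback bit is the XOR of the two most significant bits (taps 4 and 3).
        feedback = ((state >> 3) ^ (state >> 2)) & 1
        state = ((state << 1) | feedback) & 0xF


patterns = list(lfsr_patterns())
# A maximal-length 4-bit LFSR visits every non-zero state exactly once per period.
print(len(set(patterns)))
```

The all-zero state is never reached from a non-zero seed, which is one reason why purely pseudorandom generators can miss faults that need specific patterns.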
The fault coverage of a test-per-scan BIST can be improved by modifying the tested component, either by inserting test points [6] or by redesigning it for better testability [7]. The drawback of these techniques is that they generally add logic levels to the circuitry, which can degrade system performance.
Another method to improve the fault coverage is to use a "mixed mode" approach where deterministic patterns are added to detect the faults that the pseudorandom patterns miss [8, 9] . The disadvantage of the method lies in the large amount of hardware overhead that may be needed to store the deterministic test patterns. Traditional BIST solutions use special hardware for pattern generation and test response evaluation on chip, but this in general introduces significant area overhead and performance degradation. To overcome these problems, methods have been proposed which exploit specific functional units for on-chip Test Pattern Generation (TPG) and test response evaluation (TRE) [10] . In particular, it has been shown that adders can be used as TPGs for pseudo-random, pseudo-exhaustive and deterministic patterns [11, 12] .
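The idea of reusing an adder as a pattern generator can be illustrated with a small accumulator model. This is only a sketch under assumed parameters (the width, increment and the name `accumulator_patterns` are hypothetical); it does not reproduce the specific schemes of [11, 12].

```python
def accumulator_patterns(width, increment, seed=0, count=None):
    """Yield patterns from an adder-based accumulator: s <- (s + increment) mod 2**width."""
    modulus = 1 << width
    if count is None:
        count = modulus
    state = seed % modulus
    for _ in range(count):
        yield state
        state = (state + increment) % modulus


# With an odd increment the accumulator steps through all 2**width values,
# so the adder produces an exhaustive pattern set for a narrow input cone.
patterns = list(accumulator_patterns(width=4, increment=3))
print(patterns[:4])
```

An odd increment is coprime with 2**width, which is why the sequence has the full period; an even increment would only reach a fraction of the values.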
Most studies in the field of BIST have focused on developing self-test algorithms in which the role of external testers is limited to sending the test enable signal and receiving the pass/fail signal. However, only memory BIST techniques have been truly successful so far. The main reason for the success of memory BIST has been the simplicity of generating deterministic tests, coupled with simple and regular structures of test generation schemes. At the same time, current logic BIST techniques based on pseudorandom patterns remain mostly impractical due to low fault coverage, associated hardware overhead, performance degradation and excessive power consumption [13].
The term "functional BIST" (FBIST) describes a test method to control functional modules, so that these generate test sets which target structural faults within other parts of the system. It is a promising solution for self-testing of complex digital systems at reduced costs in terms of area overhead and performance degradation.
Software-based self-test (SBST) is an approach that has gained increasing acceptance for testing processor cores and other components in Systems-on-Chip (SoC) [14]. SBST moves the test functions from external testers to on-chip resources, and the test patterns are produced by the processor itself using its native instructions. Usually, in this approach, the test programs and associated test data are first loaded into on-chip memories, and subsequently these test programs are executed by the processor at actual/full speed (at-speed).
The remaining problem is the generation of high-quality test data, i.e. the operands to be used by the native instructions which build up the test program.
In [15], a divide-and-conquer approach to functional BIST is presented where the data for testing the components in chips are pre-generated and stored in the memory, to be encapsulated later, during the test phase, into self-test routines. The disadvantages of this approach are, first, the huge number of component test patterns that need to be stored in the chip and, second, the difficulty of overcoming the instruction-imposed constraints when delivering the needed test patterns to the components.
In this paper we propose a functional BIST for component-oriented testing in digital systems. Unlike the state of the art, no additional test-specific hardware is needed, and the test patterns are not stored in the chip. The test patterns are generated on-line by the functional resources of the system, and at-speed testing guarantees higher fault coverage compared to the traditional BIST.
The rest of the paper is organized as follows. In Section 3 a general scheme of the proposed BIST is described, followed by the framework of BIST synthesis in Section 4. Experimental results are presented in Section 5, and Section 6 concludes the paper.
General scheme of the functional BIST
The main idea of the proposed FBIST concept is to use activated on-chip functional processes as test pattern generators for a selected Component Under Test (CUT), and to monitor the behaviour of the CUT by a Multiple Input Signature Register (MISR). The MISR is the only additional hardware needed for the implementation of FBIST. At the processor level, the functionality of the processor is used to apply the structural tests to each CUT at-speed. The tests are delivered by processor instructions and unfolded by microinstructions produced in local control units.
Consider the data-path of a processor in Fig. 1 with the ALU as the CUT. The data path consists of a register block for temporarily storing the data participating in an operation carried out in the Arithmetic Logic Unit (ALU). For example, during the operation of division the register block will store the dividend, the divisor, all intermediate results of division, the quotient, and the counter of cycles needed for the whole process of division.
The input data from the register block and the control signals from the control unit are interpreted as input test patterns for the CUT. The output data from the ALU sent back to the register block, and the status signals fed back to the control unit, are interpreted as responses of the CUT and are registered in the MISR. Denote by L the number of bits in the data operands (dividend and divisor), and by l the number of bits on the inputs of ALU. Then the reduction in the test data volume through the compression of test data in the described FBIST scheme is equal to the signature analyzer which is monitoring the whole division process. The whole microinstruction-level test process is launched by a division instruction which includes two operands: the dividend and the divisor.
Unlike the known approaches, where the instructions are regarded as test patterns and the results of the instructions are regarded as test responses, in the proposed case all the input patterns of the CUT during each cycle of the instruction are regarded as test patterns, with immediate monitoring of the responses of the CUT in each cycle by the MISR. As a result, moving the test access from the instruction level to the microinstruction level multiplies the number of test patterns by N, the number of cycles in which the instruction exercises the CUT.
In this scheme the functional patterns produced directly on the inputs of the ALU play a role similar to that of pseudorandom test patterns in classical BIST schemes. To improve the fault coverage of FBIST, the same operation can be carried out with different operands. The problem to be solved is the choice of the best operands to minimize the length of the whole test procedure.
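The unfolding of one division instruction into per-cycle CUT patterns, with the responses compacted in a signature register, can be sketched in software as follows. This is a simplified behavioural model under stated assumptions: an unsigned 16-bit restoring divider, an ALU that only performs the trial subtraction, and a MISR with the feedback polynomial 0x1021. The names `misr_step` and `restoring_divide` are hypothetical and the model does not reflect the actual control signals of the data path in Fig. 1.

```python
WIDTH = 16
MASK = (1 << WIDTH) - 1


def misr_step(state, response, poly=0x1021):
    """One clock of an assumed 16-bit MISR: shift with feedback, then XOR in the response."""
    msb = (state >> (WIDTH - 1)) & 1
    state = (state << 1) & MASK
    if msb:
        state ^= poly
    return state ^ (response & MASK)


def restoring_divide(dividend, divisor):
    """Unsigned restoring division that records the ALU inputs of every cycle."""
    remainder, quotient = 0, dividend & MASK
    patterns = []   # (remainder, divisor) pairs seen on the ALU inputs
    signature = 0   # running MISR content
    for _ in range(WIDTH):
        # Shift the (remainder, quotient) register pair left by one bit.
        remainder = (remainder << 1) | (quotient >> (WIDTH - 1))
        quotient = (quotient << 1) & MASK
        patterns.append((remainder, divisor))
        diff = remainder - divisor          # trial subtraction in the ALU
        borrow = diff < 0                   # status signal to the control unit
        signature = misr_step(signature, (diff & MASK) ^ int(borrow))
        if not borrow:                      # keep the result, otherwise "restore"
            remainder = diff
            quotient |= 1
    return quotient, remainder, patterns, signature


q, r, patterns, signature = restoring_divide(1000, 7)
print(q, r, len(patterns))
```

In this model a single instruction with two operands yields one ALU input pattern per cycle, and only the final signature has to be compared with the fault-free reference.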
The framework consists of four tools: data path simulator, fault simulator, test operands generator and design for testability advisor.
The data path simulator is used for finding the functional test pattern sequence produced by the given sequence of test instructions with related test operands. The fault simulator is used for measuring the quality of the sequence of functional test patterns. If the test quality is not satisfactory, the test is extended by selecting additional test instructions and operands. Such a modification of the test program is repeated until the test quality is satisfactory.
Similarly to pseudorandom tests, the functional test patterns may not be able to cover random-pattern-resistant faults, which limits the fault coverage that can be achieved with the pure functional BIST approach.
To increase the fault coverage, we can use approaches similar to those used to improve classical LFSR-based BIST: modifying the CUT by inserting test points, redesigning it for testability, or using hybrid approaches which add deterministic test patterns to the functional test.
When random-pattern-resistant faults exist in the CUT, design for testability is needed. The fault simulator is then used as the testability advisor, which shows the places where test point insertion is needed. Several methods can be used to minimize the number of test points to be inserted [6, 16, 17].
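The iterative extension of the test described above can be sketched as a simple loop. The function `detected_by` is a hypothetical stand-in for the combination of data path simulation and fault simulation; here it is replaced by a toy model in which each bit of a 16-bit word has a stuck-at-0 and a stuck-at-1 fault. None of these names or the fault model come from the framework itself.

```python
WIDTH = 16
ALL_FAULTS = {(bit, value) for bit in range(WIDTH) for value in (0, 1)}


def detected_by(word):
    """Toy fault simulation: a stuck-at-v fault on a bit is exposed by applying the opposite value."""
    return {(bit, 1 - ((word >> bit) & 1)) for bit in range(WIDTH)}


def extend_until_covered(candidates, target=1.0):
    """Add candidate operands one by one until the target fault coverage is reached."""
    test, covered = [], set()
    for word in candidates:
        if len(covered) / len(ALL_FAULTS) >= target:
            break
        newly_detected = detected_by(word) - covered
        if newly_detected:              # keep only operands that improve the test
            test.append(word)
            covered |= newly_detected
    return test, len(covered) / len(ALL_FAULTS)


test, fault_coverage = extend_until_covered([0x00FF, 0xFF00, 0x0F0F])
print([hex(w) for w in test], fault_coverage)
```

In a real flow the candidate list would come from the test operand generator, and the loop would stop either at the required coverage or when the advisor suggests inserting test points instead.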
Functional BIST synthesis framework
In Fig. 2 , the methodology and framework are shown for generating test operands for the given CUT with the goal to achieve the highest possible fault coverage with as few test operands as possible.
For generating test operands, either random search or genetic algorithms can be used to select the best operands and the best sequences of operands. In this research we have developed a genetic test operand generator based on the Java Genetic Algorithms Package (JGAP) [18]. The tool makes it easy to generate different genetic algorithms.
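The principle of genetic operand selection can be illustrated with a small self-contained sketch. It is written in plain Python rather than with JGAP, and everything in it is an assumption made for illustration: the toy fitness function `coverage` (per-bit stuck-at faults instead of real fault simulation), truncation selection, one-point crossover and single-bit mutation.

```python
import random

WIDTH = 16


def coverage(operands):
    """Toy fitness: fraction of per-bit stuck-at faults exposed by the operand set."""
    detected = set()
    for word in operands:
        for bit in range(WIDTH):
            detected.add((bit, 1 - ((word >> bit) & 1)))
    return len(detected) / (2 * WIDTH)


def evolve(num_operands=2, population_size=20, generations=30, seed=0):
    """Evolve a fixed-size operand set; returns (best set, its fitness, initial best fitness)."""
    rng = random.Random(seed)
    population = [tuple(rng.getrandbits(WIDTH) for _ in range(num_operands))
                  for _ in range(population_size)]
    initial_best = max(map(coverage, population))
    for _ in range(generations):
        population.sort(key=coverage, reverse=True)
        parents = population[:population_size // 2]    # truncation selection keeps the elite
        children = []
        while len(parents) + len(children) < population_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, num_operands) if num_operands > 1 else 0
            child = list(a[:cut] + b[cut:])             # one-point crossover
            index = rng.randrange(num_operands)
            child[index] ^= 1 << rng.randrange(WIDTH)   # single-bit mutation
            children.append(tuple(child))
        population = parents + children
    best = max(population, key=coverage)
    return best, coverage(best), initial_best


best, best_fitness, initial_best = evolve()
print([hex(w) for w in best], best_fitness)
```

Because the best individuals are carried over unchanged, the best fitness never decreases between generations; in the real framework the fitness evaluation would be the fault coverage reported by the fault simulator.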
Experimental case study
We have carried out experiments with synthesis of functional BIST for two data-paths: (1) restoring 16-bit integer divider, and (2) non-restoring 16-bit signed integer divider. 
Experiments with restoring divider
The data path consists of 3 registers, a cycle counter and a 16-bit ALU as the CUT. The ALU has 53 inputs and 17 outputs. To improve the testability, 4 test points were inserted and connected via an XOR gate to a single additional output. Test operands were found by random search. The shortest test found with 100% fault coverage consists of 3 operands, which produce 197 direct input patterns to the ALU. Table 1 shows the test coverage as a function of the number of test operands used. For each number of operands, 1000 random experiments were carried out, and the average, best and worst results are shown. The same experiments are also illustrated in Fig. 3. Fig. 4 demonstrates statistically how fast a test with 100% fault coverage can be generated for the circuit. 1000 experiments were carried out, and in each experiment random operands were added to the test until 100% fault coverage was achieved.
Fig. 4. Number of operands for 100% fault coverage
Experiments with non-restoring divider
In this experiment the operands were generated by the genetic test operand generator. To understand how difficult it is to generate high-quality test operands, an experiment was carried out with 10000 and 100000 random samples, and the fault coverage of each test operand sample was measured. The 10 best results are included in Table 2. Table 3 compares the random and genetic approaches to test operand generation.
With the genetic algorithm, the shortest test with 100% fault coverage included 4 operands. However, by adding test points to improve the testability of the circuit, it was possible to reduce the number of operands to three, as in the case of the restoring divider. To minimize the number of test points, and to demonstrate the possibility of a trade-off between hardware cost, test synthesis time and fault coverage, several experiments with the genetic algorithm were carried out. Columns 2-6 of Table 4 give, respectively, the number of evolutions, the number of populations, the number of test points that had to be added to the circuit to achieve 100% fault coverage with 3-operand tests, the fault coverage achieved by the genetic algorithm without adding test points, and the time used for test synthesis. The fifth experiment shows that only a single added test point is needed to achieve 100% fault coverage with a 3-operand test.
Conclusion
We introduced a new approach to organizing functional self-test in digital systems that enables at-speed testing under the normal working conditions of the system. Unlike the state of the art, no additional hardware is needed, and the test sequences are not stored in the chip but are generated on-line by system resources alone. At-speed test guarantees higher fault coverage compared to the traditional BIST.
The method scales well since it uses the divide-and-conquer paradigm, and the system is tested component by component. Processor instructions are used for activating component-oriented test procedures. The test operands used by the instructions can be efficiently generated by genetic algorithms to achieve the needed fault coverage. Experiments have demonstrated the high effectiveness of the proposed approach.
