Built-in self-test test registers must segment a circuit such that there exists a feasible test schedule. If a register transfer description is used for selecting the positions of test registers, the space for optimizations is small. In this paper, 1-bit test cells are inserted at gate level, and an initial test schedule is constructed. Based on the information of this schedule, test cells that can be controlled in the same way are assembled to test registers. Finally, a test schedule at RT level is constructed and a minimal set of test control signals is determined. The presented approach can reduce both BIST hardware overhead and test application time. It is applicable to control units and circuits produced by control oriented synthesis where an RT description is not available. Considerable gains can also be obtained if existing RT structures are reconfigured for self-testing in the described way.
INTRODUCTION

Test registers for BIST
Built-in self-test is one of the most important techniques to test large and complex circuits. Test registers are added at the primary inputs and outputs of the circuit, and some additional test registers are inserted into the circuit. These multi-mode test registers generate patterns or compact test responses during test application (e.g. [13, 17, 26]).
Hans-Joachim Wunderlich *, Institute of Computer Structures, University of Siegen, Germany
* This work was supported in part by ARCHIMEDES.
In the test mode, the circuit is segmented into a set of subcircuits that are completely bounded by test registers (see figure 1). For testing a portion of the circuit, at least one test register must collect test responses. Thus the smallest region that can be tested independently (test unit) consists of one test register that can be configured as a multiple input signature register (MISR), the block of logic connected to the inputs of this register, and a set of test registers to generate test patterns for the inputs of the block (cf. [8, 18]). If the collected signature differs from the correct signature, the circuit is faulty.
In this way, every test unit u(Ti) is uniquely determined by the test register Ti at its outputs. In figure 1, the test unit u(T4) includes test register T4 (response compaction), logic block 1, and the test registers T1 and T2 (pattern generation). The block contained in the test unit usually consists of combinational or pipeline structured logic. Test units may overlap. In order to obtain testable subcircuits, the test registers must be placed at appropriate positions. It has been shown independently by several authors that breaking all cycles in the circuit structure bounds the length of the required test sequences to the sequential depth of the circuit [4, 6, 11, 19, 20]. To keep the hardware overhead low, the number of flip-flops that are integrated into test registers in order to break all cycles should be as small as possible. If the topology of the storage elements is represented by a so-called S-graph whose vertices correspond to flip-flops and whose edges indicate combinational paths between flip-flops, then this problem is equivalent to finding a "Minimum Feedback Vertex Set" [9]. Some authors also address extensions of this basic approach, for example targeting a pipeline structure, limiting the sequential depth of the circuit, or considering timing constraints [4, 6, 11, 14, 19, 20].
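The cycle-breaking problem on the S-graph can be illustrated by a minimal sketch. It uses a simple greedy heuristic, not one of the methods of the cited works: as long as a directed cycle remains, the flip-flop on that cycle with the most remaining edges is selected. The result is a feedback vertex set, but not necessarily a minimum one. The graph `example_s_graph` and the function names are hypothetical and serve only as illustration.

```python
from typing import Dict, List, Optional, Set

# S-graph: every flip-flop is a key; the value is its set of successors.
Graph = Dict[str, Set[str]]


def find_cycle(graph: Graph, removed: Set[str]) -> Optional[List[str]]:
    """Return the vertices of one directed cycle avoiding `removed`, or None."""
    state = {v: "new" for v in graph if v not in removed}

    def dfs(v: str, path: List[str]) -> Optional[List[str]]:
        state[v] = "open"
        path.append(v)
        for w in sorted(graph[v]):
            if w in removed:
                continue
            if state[w] == "open":  # back edge closes a cycle
                return path[path.index(w):]
            if state[w] == "new":
                cycle = dfs(w, path)
                if cycle is not None:
                    return cycle
        path.pop()
        state[v] = "done"
        return None

    for v in sorted(state):
        if state[v] == "new":
            cycle = dfs(v, [])
            if cycle is not None:
                return cycle
    return None


def break_cycles(graph: Graph) -> Set[str]:
    """Greedy (not necessarily minimum) feedback vertex set of `graph`."""
    removed: Set[str] = set()

    def degree(v: str) -> int:
        # Count edges to and from vertices that are still in the graph.
        out_deg = sum(1 for w in graph[v] if w not in removed)
        in_deg = sum(1 for u in graph if u not in removed and v in graph[u])
        return in_deg + out_deg

    while True:
        cycle = find_cycle(graph, removed)
        if cycle is None:
            return removed
        removed.add(max(cycle, key=degree))


# Hypothetical S-graph with the two cycles a-b-c-a and c-d-c.
example_s_graph: Graph = {
    "a": {"b"},
    "b": {"c"},
    "c": {"a", "d"},
    "d": {"c", "e"},
    "e": set(),
}
selected = break_cycles(example_s_graph)
```

In this small example a single flip-flop lies on both cycles, so one selected vertex suffices. Self-loops are reported as cycles of length one; the additional transparent storage elements that the approach inserts for self-loops are outside the scope of this sketch.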
At RT level, the register graph GR := (VR, ER) is the counterpart of the S-graph. The nodes VR represent the registers, and there is a directed edge between two nodes if there exists a path of combinational elements between the corresponding registers (see figure 2) . The test register graph GT := (VT, ET) is an abstraction of the register graph and describes the dataflow between the test registers. For each path in GR that connects two nodes of VT only via nodes of VN there is a corresponding edge in ET. If for instance VT = {R3, R4, Rg} (figure 2), then the test register graph is as shown in figure 3 and each cycle of the register graph contains one test register. 
General register transfer structures:
The register configuration of the system mode is not always optimal for testing.
As an example, figure 6 shows a carry save adder (CSA) and its register graph. Such a circuit is often used for implementing sequential multiplication [10]. The register graph contains two self-loops, and two additional transparent test registers B' and C' of length n are required for making it self-testable. Figure 7 shows the test register graph and the corresponding test incompatibility graph for random testing. Figure 10 shows a simple example, which will be used throughout the paper to explain the proposed approach.
TEST SCHEDULING AT GATE LEVEL
Since the S-graph contains two self-loops, two storage elements that are transparent in the normal mode, r10 and r11, have to be inserted. The storage elements r3, r4, r6, r10, and r11 are selected to become test register cells (e.g. 1-bit elements of a BILBO or a cellular automaton). Then each cycle of the S-graph contains two test cells. Further test cells are added at the primary inputs (r1, r2) and outputs (r8, r9).
The mode vectors of the example are mt1 = mt2 = (0, 2), mt3 = mt4 = (1, 0), mt6 = (0, 1), mt8 = mt9 = (2, 1), ...
We get the test registers T2 = TO = {t3, t4}, T3 = TI = {t6, t10, t11}, T1 = {t1, t2} at the primary inputs, and T4 = {t8, t9} at the primary outputs. Figure 11 shows how the test control unit and the test registers are connected.
Figure 11: Synthesized test registers and test control unit for the example of figure 10.
The resulting conditions are summarized by a directed tree (precedence tree) where the nodes represent the test units and an edge (u(Ti), u(Tj)) means that the test unit u(Ti) must be processed before the test unit u(Tj). A dummy node "end" is added to indicate the end of the test, and edges are inserted from all the nodes u(Th), Th ∈ Omin, to the node "end". Figure 12 shows the test register graph, the precedence tree, and the test incompatibility graph for the example of figure 10.
SYNTHESIS OF TEST REGISTERS
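Assembling test cells that can be controlled in the same way into test registers can be illustrated by a minimal sketch. It makes a simplifying assumption: two mode vectors are treated as compatible only if they are identical, which may be narrower than the compatibility notion of the approach. The mode vectors are those quoted for the running example; the vectors of t10 and t11 are not given in the text and are therefore omitted. The function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

ModeVector = Tuple[int, ...]  # one mode per test session


def group_cells_by_mode_vector(mode_vectors: Dict[str, ModeVector]) -> List[List[str]]:
    """Group test cells whose mode vectors are identical (assumed compatibility rule)."""
    groups: Dict[ModeVector, List[str]] = defaultdict(list)
    for cell, vector in mode_vectors.items():
        groups[tuple(vector)].append(cell)
    # Each group is a candidate test register driven by one set of control signals.
    return [sorted(cells) for cells in groups.values()]


# Mode vectors quoted for the running example (t10 and t11 omitted, see above).
example_mode_vectors: Dict[str, ModeVector] = {
    "t1": (0, 2),
    "t2": (0, 2),
    "t3": (1, 0),
    "t4": (1, 0),
    "t6": (0, 1),
    "t8": (2, 1),
    "t9": (2, 1),
}
candidate_registers = group_cells_by_mode_vector(example_mode_vectors)
```

Under this assumption the cells t1 and t2, t3 and t4, and t8 and t9 fall into common groups, which agrees with the test registers T1, T2, and T4 named for the example; t6 forms its own group because the vectors of t10 and t11 are not available here.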
TEST SCHEDULING AT RT LEVEL
of Omin.
Figure 12: Example: test register graph GT (left), precedence tree P (center), and test incompatibility graph GI (right).
The procedure CONSTRUCT-SEQ implements this scheduling approach (for a detailed description cf. [22]). As inputs it takes the precedence tree P, the test incompatibility graph GI, and the period d of the test session sequence to construct. The results are the test session sequence s of length d* ≤ d and the number r of repetitions necessary to propagate the effects of all faults to the test registers of Omin. For the example circuit described in figure 10 and figure 11, the resulting test schedules are (({u(T2)}, {u(T3), u(T4)})^r, {T4}) for d = 2, and (({u(T3)}, {u(T2)}, {u(T4)}), {T4}) for d = 3. If the test lengths of the test units are not extremely short and the test registers are sufficiently large (e.g. 20 bits or more), the probability of aliasing is very small and can be neglected even if faulty signatures are propagated through several test units [23].
In order to control the test registers according to the scheduling of CONSTRUCT-SEQ, a new set of control signals is required. The mode vectors of the test registers, maximal sets of compatible mode vectors, and finally the control signals c(p0), ..., c(pd*-1) can be determined in the same way as at gate level (see sections 2 and 3).
THE COMPLETE APPROACH
In this section, the ideas developed above are put together, and the complete approach is described.
The graph coloring problems that have to be solved in step (5) and step (8) are NP-complete problems [9]. But many efficient heuristics are known that give good (suboptimal) solutions. We applied the algorithm of [7, pp. 70-71]. This algorithm first determines an initial coloring using a greedy strategy and then tries to improve this solution by recoloring some nodes. All possible solutions are implicitly enumerated. Solutions with 1, 2, or 3 colors are guaranteed to be a minimum coloring.
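The relation between coloring and scheduling can be illustrated by a minimal sketch of the greedy starting point only. It is not the algorithm of [7] and omits the recoloring phase. Nodes of a test incompatibility graph are colored in order of decreasing degree, and each color class is read as one test session. The example graph and the function names are hypothetical.

```python
from typing import Dict, List, Set

IncompatGraph = Dict[str, Set[str]]  # test unit -> test units it cannot run with


def greedy_coloring(incompat: IncompatGraph) -> Dict[str, int]:
    """Assign each test unit the smallest color not used by its neighbors."""
    # Largest degree first; ties are broken by name to keep the result deterministic.
    order = sorted(incompat, key=lambda v: (-len(incompat[v]), v))
    color: Dict[str, int] = {}
    for v in order:
        used = {color[w] for w in incompat[v] if w in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def sessions_from_coloring(color: Dict[str, int]) -> List[List[str]]:
    """Test units with the same color can be processed in the same test session."""
    sessions: Dict[int, List[str]] = {}
    for unit, c in sorted(color.items()):
        sessions.setdefault(c, []).append(unit)
    return [sessions[c] for c in sorted(sessions)]


# Hypothetical incompatibility graph: a triangle a-b-c plus the edge c-d.
example_gi: IncompatGraph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
example_colors = greedy_coloring(example_gi)
example_sessions = sessions_from_coloring(example_colors)
```

The triangle forces at least three colors, so the three sessions found for this hypothetical graph are a minimum; in general a greedy coloring gives only an upper bound on the number of test sessions.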
EXPERIMENTAL RESULTS
The described procedures have been applied to the large circuits of the ISCAS'89 benchmark set [2]. For BIST, test cells were added at the primary inputs and outputs, and additional test cells were inserted such that each cycle of the circuit structure contained at least two of them. The test incompatibility graph GI for the 1-bit test units was constructed assuming pseudo-random testing. Then the nodes of this graph were colored using a minimal number of colors. In those cases where exhaustive search for a minimum coloring took too much time, we stopped the recoloring process after 10000 trials to improve the solution and used the best solution found so far.
The result of these first steps is a minimal number of test sessions that are based on 1-bit test units (see table 1). Moreover, reducing the number of test sessions usually leads to a shorter overall test length. Of course, the test length also depends on the type of the test registers and on the fault coverage value that has to be achieved. The method presented is compatible with different kinds of test registers [24]. For example, using test registers that can produce weighted random patterns generally results in a shorter test length for the considered test unit than using unbiased random patterns.
As an alternative to test schedules with a minimal number of test sessions, the procedure CONSTRUCT-SEQ gives schedules where all the test response information is driven to the primary outputs, and as a consequence only the few signatures at the primary outputs have to be evaluated (see
