Abstract - A low hardware overhead scan-based BIST test pattern generator (TPG) is proposed that reduces switching activity in circuits under test (CUTs) and also achieves very high fault coverage with a test sequence of reasonable length. When the proposed TPG is used to generate test patterns for test-per-scan BIST, it decreases the number of transitions that occur during scan shifting and hence reduces the switching activity in the CUT. The proposed TPG does not require modifying the functional logic and does not degrade system performance. The proposed BIST comprises three TPGs: a low transition random TPG (LT-RTPG), a 3-weight weighted random BIST (3-weight WRBIST), and a dual-speed LFSR (DS-LFSR). Test patterns generated by the LT-RTPG detect the easy-to-detect faults, and the faults that remain undetected are targeted by the 3-weight WRBIST. The 3-weight WRBIST reduces test sequence length by improving the detection probabilities of random pattern resistant faults (RPRFs). The DS-LFSR consists of two LFSRs, a slow LFSR and a normal-speed LFSR, and lowers the transition density at the circuit inputs driven by the slow LFSR.
I. INTRODUCTION
Modern design and package technologies make external testing increasingly difficult, and built-in self-test (BIST) has emerged as a promising solution to the VLSI testing problem. BIST is a design-for-testability methodology aimed at detecting faulty components in a system by incorporating test logic on-chip. The main components of a BIST scheme [1] are the test pattern generator (TPG), the response compactor, and the signature analyzer. Among various BIST schemes, pseudorandom BIST is the most widely used since it is the most economical. Unlike deterministic stored-pattern BIST, which requires high hardware overhead due to the memory devices needed to store precomputed test patterns, pseudorandom BIST, where test patterns are generated by pseudorandom pattern generators such as linear feedback shift registers (LFSRs) and cellular automata (CA), requires very little hardware overhead. However, BIST using only pseudorandom patterns does not provide high fault coverage due to the existence of random pattern resistant faults (RPRFs). Several techniques have been suggested for enhancing the fault coverage achieved with BIST. These techniques can be classified as follows: (1) Mixed-mode testing, where the circuit is tested in two phases. In the first phase, pseudorandom patterns are applied. In the second, deterministic patterns are applied to target the undetected faults. Storing deterministic patterns in a ROM requires a large amount of hardware overhead. (2) Modifying the circuit under test (CUT) by test point insertion or by redesigning the CUT to improve the fault detection probabilities. The drawback of these techniques is that they generally add extra levels of logic to the circuit, which can degrade system performance. (3) Weighted pseudorandom patterns, where the random patterns are biased using extra logic to increase the probability of detecting RPRFs.
In weighted random pattern testing, the outputs of the TPG are biased to generate test sequences that have nonuniform signal probabilities. This increases the detection probabilities of RPRFs that escape pseudorandom test sequences, which have a uniform signal probability of 0.5.
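As a rough illustration of how extra logic can bias signal probabilities, the sketch below enumerates all input combinations of a small gate fed by independent, uniformly random bits. Combining pseudorandom bits with AND and OR gates is one common way to obtain weights such as 0.25 and 0.75; it is an assumption made here for illustration and is not necessarily the biasing logic used in the cited schemes. The function names are hypothetical.

```python
from itertools import product


def and_gate(bits):
    """AND of the input bits: 1 only when every bit is 1."""
    return int(all(bits))


def or_gate(bits):
    """OR of the input bits: 1 when at least one bit is 1."""
    return int(any(bits))


def signal_probability(gate, n_inputs):
    """Exact probability that `gate` outputs 1 when its n inputs are
    independent bits with a uniform signal probability of 0.5."""
    ones = sum(gate(bits) for bits in product((0, 1), repeat=n_inputs))
    return ones / 2 ** n_inputs


print(signal_probability(and_gate, 2))  # 0.25
print(signal_probability(or_gate, 2))   # 0.75
```

A plain LFSR output corresponds to weight 0.5, and the weights 0 and 1 correspond to constant signals, which is why the 3-weight case discussed next needs no combining gates of this kind.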
A 3-weight WRBIST can be classified as an extreme case of conventional weighted random pattern testing BIST. However, in contrast to conventional weighted random pattern testing BIST, where various weights, e.g., 0, 0.25, 0.5, 0.75, 1.0, can be assigned to outputs of TPGs, in 3-weight WRBIST only three weights, 0, 0.5, and 1, are assigned. Since only three weights are used, the circuitry to generate weights is simple: weight 1 (0) is obtained by fixing a signal to a 1 (0), and weight 0.5 is obtained by driving a signal from an output of a pseudorandom pattern generator, such as an LFSR.
Though the attainment of high fault coverage with practical lengths of test sequences is still one concern of BIST techniques, reducing switching activity (SA) has become another important objective. It has been observed that SA during test application is often significantly higher than that during normal operation. The correlation between consecutive random patterns generated by an LFSR is low; this is a well-known property of LFSR-generated patterns [6]. On the other hand, significant correlation exists between consecutive patterns during the normal operation of a circuit.
Since heat dissipation in a CMOS circuit is proportional to switching activity, a CUT can be permanently damaged by excessive heat dissipation [4]. The rest of this paper is organized as follows. The techniques used in this paper to reduce switching activity during BIST are illustrated in Section III. Sections IV and V briefly introduce the serial fixing 3-weight WRBIST and the DS-LFSR. The architecture of the proposed TPG and an outline of the algorithm used to design the proposed BIST TPG are described in Section VI. Finally, Section VII presents the conclusions.
II. BACKGROUND
In this paper, we assume that the sequential circuit has n primary inputs and m state inputs, is implemented in CMOS, has full scan, and employs scan-based BIST to apply test patterns and observe responses. A test pattern for the state inputs, which is generated by a TPG, is scanned into a scan chain over m cycles by repeated scan shift operations, and the response to the applied test pattern is captured into the scan chain in the next cycle. We also assume that, unlike state inputs, for which test patterns are applied to the CUT via the full-scan chain by a sequence of scan shift operations, test patterns for primary inputs are applied to the CUT every m+1 cycles, either from a TPG whose outputs are directly connected to the primary inputs or through a boundary scan chain whose outputs are updated every m+1 cycles. Hence, most SA in CUTs that have long scan chains, i.e., m >> 1, is caused by transitions at scan inputs during scan shift operations. A simple way to reduce SA is to lower the scan clock speed, but this increases test application time by about a factor of d if the scan flip-flops are clocked at 1/d speed during scan shift operations. Furthermore, reducing the test clock speed does not solve the high power/ground noise problem that is caused by a large number of simultaneous transitions. A transition at the scan chain input at scan shift cycle t, which is caused by scanning in a value that is opposite to the value scanned in at the previous scan shift cycle t-1, continues to cause transitions at scan inputs while the value travels through the scan chain during the following scan shift cycles. Fig. 1 shows scan shift operations for a scan chain that has 5 scan flip-flops. Since a 0 is scanned into the scan chain at time t = 0, the 1 that is scanned into the scan chain at time t = 1 causes a transition at the scan chain input and continues to cause transitions at the scan flip-flops it passes through until it arrives at its final destination at time t = 5.
On the other hand, the next 1, which is scanned into the scan chain at time t = 2, causes no transition at the scan chain input and arrives at its final destination at time t = 5 without causing any transition at the scan flip-flops it passes through. This shows that the transitions that occur in the entire scan chain can be reduced by reducing transitions at the scan chain input.
III. LOW TRANSITION RTPG
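To motivate the low transition approach, the scan shift behavior described for Fig. 1 can be sketched in a few lines. The function below is a hypothetical model, not the figure itself: it assumes a 5-flip-flop chain that initially holds all 0s and counts every flip-flop toggle while a bit stream is shifted in.

```python
def count_scan_transitions(stream, chain_length, initial=0):
    """Shift `stream` into a scan chain and count flip-flop toggles.

    Index 0 of `chain` is the flip-flop at the scan chain input.
    The all-`initial` starting state is an assumption of this sketch.
    """
    chain = [initial] * chain_length
    toggles = 0
    for bit in stream:
        shifted = [bit] + chain[:-1]  # one scan shift operation
        toggles += sum(old != new for old, new in zip(chain, shifted))
        chain = shifted
    return toggles


smooth = [1, 1, 1, 1, 1]  # a single 0 -> 1 transition at the scan input
busy = [1, 0, 1, 0, 1]    # a transition at the scan input on every cycle

print(count_scan_transitions(smooth, 5))  # 5
print(count_scan_transitions(busy, 5))    # 15
```

In this model the single input transition of the first stream is paid for once per flip-flop it passes through, while the alternating stream keeps injecting new transitions that all travel down the chain.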
A. Analysis of LT-RTPG
The combinational part of a sequential circuit can be viewed as a collection of output cones, where output cone j is composed of all the logic and inputs (primary and state) that form the fan-in of the j-th output. A pair of inputs is said to be compatible if there exists no circuit cone to which they both belong. Any correlation between the values applied to a pair of compatible inputs does not reduce the fault coverage for any given test length for faults such as stuck-at faults. Consider a full-scan circuit with a single scan chain. Let the span S_j of cone j be the distance between the first and last flip-flops in the scan chain [2] whose outputs drive the state inputs of cone j; then the span S of the circuit is defined as the maximum of the spans of all its cones. For the above type of faults, it is sufficient to apply all possible patterns to each set of S consecutive flip-flops of the scan chain to guarantee coverage of all faults [4].
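The span definition can be illustrated with a small sketch. The cone names and scan positions below are made up, and the sketch assumes that the "distance" counts flip-flops inclusively (last position minus first position plus one), so that a window of S consecutive flip-flops covers every cone; the source does not state which convention is intended.

```python
def cone_span(positions):
    """Span of one cone, given the scan chain positions of the
    flip-flops that drive its state inputs (inclusive count)."""
    return max(positions) - min(positions) + 1


def circuit_span(cones):
    """Span S of the circuit: the maximum span over all cones."""
    return max(cone_span(positions) for positions in cones.values())


# Hypothetical cones: output name -> scan positions of its state inputs.
cones = {"out0": [0, 1, 3], "out1": [2, 6], "out2": [4]}

print({name: cone_span(p) for name, p in cones.items()})
print(circuit_span(cones))  # 5
```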
B. Architecture of LT-RTPG design
The LT-RTPG reduces SA during BIST by reducing transitions at scan flip-flops during scan shift operations. Fig. 2 shows the architecture of the LT-RTPG, which is comprised of an r-stage LFSR, a k-input AND gate, and a T flip-flop. Hence, it can be implemented with very little hardware. We assume that every LFSR stage D_i, where i = 1, 2, ..., r, has a normal as well as an inverted output.
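A behavioral sketch of this structure is given below. The wiring is an assumption, since Fig. 2 is not reproduced here: k LFSR stage outputs feed the AND gate, the AND output drives the T input, and the T flip-flop output is taken as the scan chain input. The 4-stage LFSR, its taps, its seed, and the choice of stages for the AND gate are illustrative only.

```python
class LFSR:
    """Fibonacci LFSR; `taps` are 0-based stage indices XORed into stage 0."""

    def __init__(self, taps, seed):
        self.taps = tuple(taps)
        self.state = list(seed)

    def step(self):
        feedback = 0
        for t in self.taps:
            feedback ^= self.state[t]
        self.state = [feedback] + self.state[:-1]
        return self.state


def lt_rtpg_stream(lfsr, and_stages, cycles):
    """Scan-in stream of the sketched LT-RTPG: the T flip-flop toggles
    only in cycles where all selected LFSR stages are 1."""
    t_ff = 0
    stream = []
    for _ in range(cycles):
        state = lfsr.step()
        if all(state[i] for i in and_stages):
            t_ff ^= 1
        stream.append(t_ff)
    return stream


def raw_stream(lfsr, cycles):
    """Scan-in stream taken directly from the last LFSR stage."""
    return [lfsr.step()[-1] for _ in range(cycles)]


def transitions(stream):
    """Number of value changes between consecutive bits of a stream."""
    return sum(a != b for a, b in zip(stream, stream[1:]))


seed = (1, 0, 0, 0)
low = lt_rtpg_stream(LFSR((2, 3), seed), and_stages=(0, 1), cycles=15)
raw = raw_stream(LFSR((2, 3), seed), cycles=15)
print(transitions(low), transitions(raw))  # 4 8
```

With k AND inputs the T flip-flop toggles in roughly 1 out of 2^k cycles for random stage values, so a larger k gives fewer transitions at the scan input at the cost of more strongly correlated neighboring scan values.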
IV. 3-WEIGHT WEIGHTED RANDOM BIST

A. Generators
In 3-weight WRBIST, detection probabilities of RPRFs are improved by fixing some of the inputs of the CUT to the binary values specified in test cubes for the targeted RPRFs. A test cube for a fault is a test that has unspecified inputs, and the detection probability of a fault is defined as the probability that a randomly generated vector detects the fault. A generator, or weight set, is a vector that conveys which inputs are to be fixed and the values to which these inputs are to be fixed during 3-weight weighted random BIST. When input p_k is assigned a 1 (0) in the generator, fixing p_k to a 1 (0) improves by a factor of 2 the detection probabilities of faults that require a 1 (0) at input p_k to be detected. On the other hand, fixing inputs that are assigned a U in the generator to a binary value 0 or 1 may make some faults undetectable, since those inputs are assigned a 1 in some test cube(s) and a 0 in other test cube(s). If a circuit contains a large number of RPRFs, then test cubes for RPRFs may be assigned opposite values at many inputs, resulting in a generator where most inputs are assigned U's. Only a few inputs can be fixed in such generators without making any faults untestable. Hence, if a circuit has a large number of RPRFs, then multiple generators, each of which is calculated from test cubes for a subset of RPRFs in the circuit, may be required to achieve high fault coverage.
[Table of test cubes: columns C, p_5 ... p_0, F, dp; only the first row, c_0 = 1 X 1 0 X 0, is legible.]
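The way a generator follows from a set of test cubes can be sketched as below. The rule for U (inputs with conflicting values across cubes) comes from the text above; treating an input that is unspecified in every cube as X, and the second test cube in the example, are assumptions made for illustration. The probability computed here is the chance that a pattern matches the cube, which is used as a stand-in for the detection probability of the fault.

```python
def build_generator(cubes):
    """Merge test cubes (strings over '0', '1', 'X') into a generator.

    Per input: '1' or '0' if every specified cube value agrees,
    'U' if the cubes conflict, 'X' if no cube specifies the input.
    """
    generator = []
    for column in zip(*cubes):
        specified = set(column) - {"X"}
        if not specified:
            generator.append("X")
        elif len(specified) == 1:
            generator.append(specified.pop())
        else:
            generator.append("U")
    return "".join(generator)


def match_probability(cube, generator=None):
    """Probability that a pattern matches `cube` when inputs marked
    '0'/'1' in `generator` are fixed and all others are random."""
    prob = 1.0
    for i, want in enumerate(cube):
        if want == "X":
            continue
        fixed = generator[i] if generator else "X"
        if fixed in ("0", "1"):
            if fixed != want:
                return 0.0  # fixed to the wrong value: cube unreachable
        else:
            prob *= 0.5     # random input matches half of the time
    return prob


c0 = "1X10X0"  # first row of the table above (p_5 ... p_0)
c1 = "1X0XX0"  # hypothetical second cube, not from the source
gen = build_generator([c0, c1])
print(gen)                          # 1XU0X0
print(match_probability(c0))        # 0.0625
print(match_probability(c0, gen))   # 0.5
```

Each input fixed to the required value removes one factor of 0.5, which is the factor-of-2 improvement per fixed input described above.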
B. Architecture of 3-weight WRBIST
Two different scan-based 3-weight WRBIST schemes are proposed in [10]: serial fixing and parallel fixing 3-weight WRBIST. In this paper, the serial fixing 3-weight WRBIST is used exclusively because it reduces transitions at scan inputs during scan shift operations. The low transition property of the serial fixing 3-weight WRBIST is described in [10]. Fig. 6 shows the implementation of a serial fixing 3-weight WRBIST for the generator shown in Fig. 5. The scan counter is a modulo-(m+1) counter, where m is the number of scan elements in the scan chain. When the content of the scan counter is k, a value for input p_k is scanned into the scan chain input. The generator counter selects the appropriate generator; if the content of the generator counter is i, generator gen(C_i) is selected to generate T_i 3-weight WRBIST patterns. The pseudorandom pattern sequences generated by the LFSR are fixed by controlling the AND and OR gates with overriding signals s_0 and s_1: a random value is fixed to a 0 by setting s_0 to a 1 and s_1 to a 0, and a random value is fixed to a 1 by setting s_1 to a 1. Since a random value can be fixed to a 1 by setting s_1 to a 1 independent of the state of s_0, the state of s_0 is a don't-care when fixing a random value to a 1. The outputs of the decoding logic [3], D_0 and D_1, are generated from the outputs of the generator counter and the scan counter. In consequence, the on-set of the function for the decoding logic lists the contents of the generator counter and the scan counter at test cycles when TF_0 and/or TF_1 require toggling. The scan counter is required by all scan-based BIST schemes and is not particular to the proposed BIST scheme. All BIST controllers also need a vector counter that counts the number of test patterns applied.
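The fixing logic described above can be modeled with one Boolean expression. The sketch assumes a particular gate arrangement that is consistent with the text but not confirmed by Fig. 6: the random bit is ANDed with the complement of s_0 and the result is ORed with s_1. The mapping from generator symbols to (s_0, s_1) and the function names are hypothetical; the decoding logic and the T flip-flops that produce s_0 and s_1 are not modeled.

```python
def fix_bit(random_bit, s0, s1):
    """Fixing logic: s1 = 1 forces a 1, s0 = 1 (with s1 = 0) forces a 0,
    and s0 = s1 = 0 passes the random bit through unchanged."""
    return (random_bit & (1 - s0)) | s1


def control_for(symbol):
    """Overriding signals (s0, s1) for one generator symbol."""
    if symbol == "0":
        return (1, 0)
    if symbol == "1":
        return (0, 1)  # s0 is a don't-care here; 0 is chosen arbitrarily
    return (0, 0)      # 'X' and 'U' inputs stay pseudorandom


def apply_generator(random_bits, generator):
    """Serially fix a stream of random bits according to a generator."""
    return [fix_bit(r, *control_for(g)) for r, g in zip(random_bits, generator)]


print(apply_generator([1, 1, 1, 1, 1, 1], "1XU0X0"))  # [1, 1, 1, 0, 1, 0]
print(apply_generator([0, 0, 0, 0, 0, 0], "1XU0X0"))  # [1, 0, 0, 0, 0, 0]
```

Runs of equal generator symbols keep s_0 and s_1 constant over consecutive scan cycles, which is the reason the scan ordering discussed next tries to place similarly fixed inputs next to each other.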
The generator counter can be implemented with the ⌈log2 m⌉ MSB stages of the existing vector counter, where m here denotes the number of generators. Hence, no additional hardware is required for the generator counter either, and the hardware overhead for implementing a 3-weight WRBIST is incurred only by the decoding logic and the fixing logic, which includes the two T flip-flops and the AND and OR gates. In [10], in order to minimize hardware overhead for the decoding logic, the number of minterms in the on-set (or off-set) of the function for the decoding logic is minimized. This is achieved by ordering scan chains such that the number of toggles at TF_0 and TF_1 required to scan in the values specified in the generators is minimized: inputs that are assigned the same values in most generators are placed next to each other in the same chain by the scan ordering procedure.
V. DUAL SPEED LFSR
The dual-speed LFSR (DS-LFSR) TPG consists of two LFSRs: a slow LFSR and a normal-speed LFSR. The slow LFSR is driven by a slow clock whose speed is 1/d that of the normal clock that drives the normal-speed LFSR, i.e., slow clock speed = normal clock speed / d. (To simplify the following discussion as well as the hardware, d is assumed to be a power of 2.) The use of the DS-LFSR lowers the transition density at the circuit inputs driven by the slow LFSR, leading to a reduction in heat dissipation during test application with slight area overhead. A dual strategy is employed to reduce both average and peak power [5]. The DS-LFSR is designed in such a way that the generated patterns are all unique and uniformly distributed, in order to achieve high fault coverage. Fig. 7 shows a BIST architecture that is equipped with a slow and a normal-speed LFSR. Empirical analysis using χ² tests demonstrates that the sequences generated by the DS-LFSR are more uniformly distributed than the sequences generated by single LFSRs with primitive feedback polynomials. Note that the slow LFSR has both the slow clock and the normal clock as clock inputs and has a control signal, select clock, which selects either the slow clock or the normal clock. The slow clock is selected when the slow LFSR is used as a test pattern generator; the normal clock is used when the CUT is in normal mode or when the slow LFSR functions as a multiple input signature register (MISR).
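The effect of the slow clock on transition density can be sketched with two small LFSRs. This is a behavioral illustration only: the 4-stage LFSRs, taps, seeds, and d = 4 are arbitrary choices, and the sketch does not reproduce the design rules that make the DS-LFSR patterns unique and uniformly distributed.

```python
class LFSR:
    """Fibonacci LFSR; `taps` are 0-based stage indices XORed into stage 0."""

    def __init__(self, taps, seed):
        self.taps = tuple(taps)
        self.state = list(seed)

    def step(self):
        feedback = 0
        for t in self.taps:
            feedback ^= self.state[t]
        self.state = [feedback] + self.state[:-1]
        return self.state


def ds_lfsr_patterns(slow, normal, d, cycles):
    """Patterns of a dual-speed pair: the slow LFSR is clocked once every
    d normal clock cycles, the normal-speed LFSR on every cycle."""
    patterns = []
    for cycle in range(cycles):
        if cycle % d == 0:
            slow.step()
        normal.step()
        patterns.append((tuple(slow.state), tuple(normal.state)))
    return patterns


def bit_flips(vectors):
    """Total number of bit changes between consecutive vectors."""
    return sum(
        sum(a != b for a, b in zip(prev, cur))
        for prev, cur in zip(vectors, vectors[1:])
    )


seed = (1, 0, 0, 0)
patterns = ds_lfsr_patterns(LFSR((2, 3), seed), LFSR((2, 3), seed), d=4, cycles=16)
slow_flips = bit_flips([s for s, _ in patterns])
normal_flips = bit_flips([n for _, n in patterns])
print(slow_flips, normal_flips)
```

In this model the inputs driven by the slow LFSR can change only on every d-th cycle, so their transition count over the test is much lower than that of the inputs driven by the normal-speed LFSR.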
VI. PROPOSED LOW POWER BIST TPG
The architecture of the proposed low power BIST TPG with a single scan chain is shown in Fig. 8. The proposed BIST TPG provides three different sub-TPGs: an LT-RTPG, a 3-weight WRBIST, and a DS-LFSR. A 2x1 multiplexer selects the scan input from either the LT-RTPG or the 3-weight WRBIST. The LT-RTPG is comprised of a T flip-flop and an AND gate [8]. In the first test session, test patterns generated by the LT-RTPG are selected and scanned into the scan chain to detect the easy-to-detect faults. In the second session, test patterns generated by the 3-weight WRBIST are selected to detect the faults that remain undetected after the first session. Considering that an LT-RTPG is implemented with very little hardware overhead (only y T flip-flops and y AND gates in addition to a DS-LFSR, for y scan chains), the overall hardware overhead of the proposed TPG is determined by the hardware overhead of the decoding logic of the 3-weight WRBIST TPG. The y multiplexers that drive the scan chain inputs select the test sequence to be scanned into the scan chains: when the mode select signal, which controls all y multiplexers, is set to 0, test patterns generated by the LT-RTPG are selected, and when the mode select signal is set to 1, test patterns generated by the 3-weight WRBIST TPG are selected. The test patterns generated by the LT-RTPG are applied to the CUT until no new faults are detected for a predetermined period. The faults that remain undetected by test patterns generated by the LT-RTPG are considered RPRFs and are later targeted by test patterns generated by the 3-weight WRBIST TPG [11].
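The stopping rule for the first session can be sketched as follows. The fault simulator is replaced by a hypothetical lookup function `detected_by`, and the pattern identifiers, fault names, and the `patience` parameter (the "predetermined period", counted in patterns) are illustrative assumptions rather than details from the paper.

```python
def run_first_session(patterns, detected_by, patience):
    """Apply LT-RTPG patterns until `patience` consecutive patterns
    detect no new fault; return the detected faults and pattern count."""
    detected = set()
    idle = 0
    applied = 0
    for pattern in patterns:
        applied += 1
        new_faults = detected_by(pattern) - detected
        if new_faults:
            detected |= new_faults
            idle = 0
        else:
            idle += 1
            if idle >= patience:
                break
    return detected, applied


# Toy stand-in for fault simulation: pattern id -> faults it detects.
toy_table = {0: {"f1"}, 1: {"f2"}, 2: set(), 3: {"f1"}}
all_faults = {"f1", "f2", "f3"}

detected, applied = run_first_session(
    range(10), lambda p: toy_table.get(p, set()), patience=3
)
rprfs = all_faults - detected  # targets for the 3-weight WRBIST session
print(sorted(detected), applied, sorted(rprfs))  # ['f1', 'f2'] 5 ['f3']
```

After the first session ends, the mode select signal would switch the multiplexers to the 3-weight WRBIST TPG, whose generators are computed from test cubes for the faults left in `rprfs`.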
VII. CONCLUSION
This paper presents a low hardware overhead test pattern generator (TPG) for scan-based BIST that reduces SA in CUTs during BIST and also achieves very high fault coverage with a test sequence of reasonable length.
