1. Introduction
Delay fault testing has become a standard option in today's technology. The path-delay, segment-delay, gate-delay and transition fault models have been proposed so far [9][10][12][19][20][23]. These models differ in the complexity of both test generation and test application.
The path-delay fault model [20] is used to model defects that are correlated along a path from a (pseudo-)primary input to a (pseudo-)primary output of the circuit under test (CUT). Both the switching delays of devices and the transport delays of interconnects may perturb the propagation of signal transitions. Unfortunately, in the worst case the number of path-delay faults increases exponentially with the number of signal lines in the CUT.
The segment-delay fault model [9][19] considers all path segments shorter than a predefined length L. As L increases, the number of segments to be tested may become unacceptably large. When L is too small, some paths with a larger distributed delay may escape the test. The value of L can be chosen based on statistics about the manufacturing defects.
The gate-delay fault model [10] is a quantitative model in which the exact size of the delay fault needs to be specified in advance. Unfortunately, the determination of the fault size is not an easy task.
The transition fault model [12][23], also called the gross-delay fault model, is a special case of the gate-delay fault model in which the fault is assumed to be of the same order of magnitude as the clock period. The transition fault model is used to cover delay effects which are generated by localized (spot) defects and whose sizes are of the order of magnitude of the clock cycle or of the test pattern period. A slow-to-rise and a slow-to-fall transition fault may be associated with each signal line in the CUT. Consequently, the number of transition faults increases linearly with the number of signal lines in the CUT. Due to its limited complexity, the transition fault model is the most widespread, and it is also the model addressed in this paper.
In order to test delay faults, two patterns are required: an initialization pattern that sets the circuit to a predefined state, and an activation pattern that launches the appropriate transition and propagates the fault effect to a (pseudo-)primary output. There are two approaches for the application of pattern pairs to a standard scan design [17][18][23]: functional justification, also called broadside, and scan shifting, also called skewed-load. In the functional justification approach, the circuit response to the initialization pattern is used as the second pattern. In order to apply pattern pairs, the circuit is operated for two consecutive clock cycles in functional mode after the first pattern has been scanned in. In the scan shifting approach, the second pattern is generated by operating the scan path for one additional clock cycle after the first pattern has been applied. Since scan shifting may require additional effort for a consistent clocking scheme beyond the BIST hardware, we concentrate only on functional justification. Another advantage of this approach is the expected limitation of scan-induced over-testing, since the activation pattern is computed by the CUT itself and not scanned in, as in the scan shifting approach. Hence, the impact on the yield should be smaller than in the case of the scan shifting approach.
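For illustration, the difference between the two schemes can be captured in a few lines of Python; the next-state function, the shift direction and the scan-in bit below are toy assumptions, not part of the proposed scheme.

```python
# Toy illustration of the two pattern-pair application schemes. The
# 'next_state' function stands in for the CUT's combinational logic and
# is an arbitrary assumption; the shift direction is also illustrative.

def broadside(v1, next_state):
    """Functional justification: the activation pattern v2 is the
    circuit response to the initialization pattern v1."""
    return next_state(v1)

def skewed_load(v1, scan_in_bit):
    """Scan shifting: v2 is obtained by one additional scan shift."""
    return [scan_in_bit] + v1[:-1]

v1 = [0, 0, 1, 1]
toy_next = lambda v: [v[i] & v[(i + 1) % len(v)] for i in range(len(v))]
print(broadside(v1, toy_next))  # v2 computed by the CUT itself
print(skewed_load(v1, 1))       # v2 produced by the scan chain
```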
There is a wide range of deterministic logic BIST (DLBIST) methods that apply deterministic test patterns and hence improve the low fault coverage often obtained by pseudo-random patterns. Very attractive schemes, which require no interaction with the ATE during test application time, are the test set embedding schemes. These rely on a pseudo-random test pattern generator plus some additional circuitry that modifies the pseudo-random sequence in order to embed a set of deterministic patterns. Examples of such techniques are the bit-fixing [22] and the bit-flipping [6][25] DLBIST schemes. The implementation of these schemes includes the mapping of a set of deterministic test cubes to an initial pseudo-random test sequence and the synthesis of the circuitry used to embed the target test cubes [6].
Here, the bit-flipping DLBIST scheme used for the test of stuck-at faults will be adapted to transition fault testing based on functional justification. The resulting DLBIST scheme can be used to test transition faults in circuits with scan design. Until now, no such scheme has been described in the open literature. LBIST approaches for the test of delay faults (especially path-delay faults) have been presented in [2]. The schemes proposed in [3][13] are pseudo-deterministic, which means that a flexible test length is required in order to guarantee that all the target deterministic patterns are embedded. Only the approaches presented in [11][21][26] are deterministic, but they address circuits without scan design.
The extension of the bit-flipping DLBIST scheme for transition fault testing requires the modification of the test control unit such that the scan enable signal allows two consecutive functional clock cycles and not only one as in the case of stuck-at fault testing.
Since pairs of test patterns are required, transition faults are more difficult to test than stuck-at faults. Consequently, the pseudo-random transition fault coverage is significantly lower than the pseudo-random stuck-at fault coverage. The final effect is an increase of the mapping effort and the logic overhead.
As a solution, this paper proposes a trade-off between these costs and the test application time. Slightly larger test application times reduce logic overhead and enhance test quality in terms of both transition fault and (non-modeled) defect coverage [4] .
The next section describes a quantitative estimation of the random testability of the stuck-at and transition faults.
An approach to extend the bit-flipping scheme for transition fault testing is presented in Section 3. Relevant experimental results for some difficult-to-test industrial designs are reported in Section 4. Finally, Section 5 summarizes this work.
2. Pseudo-random testability of transition and stuck-at faults
Here, the random detection probability is used as a measure of fault testability. The probability that a stuck-at 'i' fault (i ∈ {0,1}) on the signal line J is tested by a random pattern is equal to the probability P(J sa 'i') that the line J can be controlled to the logic value '¬i' and observed at a (pseudo-)primary output [1][16]:

P(J sa 'i') = P(J is controllable to '¬i' and J is observable)    (1)
The probability that a transition fault from 'i' to '¬i' on the signal line J is tested by a pair of independent random patterns is equal to the probability that the first pattern controls the line J to the logic value 'i', multiplied by the probability that the second pattern controls the line J to the logic value '¬i' and makes the signal line J visible at a (pseudo-)primary output:

P(J slow from 'i' to '¬i') = P(J is controllable to 'i') · P(J sa 'i')    (2)
From relation (2), it follows that P(J slow from 'i' to '¬i') ≤ P(J sa 'i'), so that if the stuck-at 'i' fault on a signal line is random-pattern resistant, then the transition fault from 'i' to '¬i' on the same line is random-pattern resistant as well. Consequently, a digital circuit contains at least as many random-pattern-resistant transition faults as random-pattern-resistant stuck-at faults.
The following expression gives the number N_f of random patterns required to test a fault 'f' having the detection probability P_f (<< 1) with the test confidence C (the probability that the considered fault is tested by at least one pattern of the test sequence) [24]:

N_f = ln(1 − C) / ln(1 − P_f) ≈ −ln(1 − C) / P_f    (3)
Consequently, for two faults 'f' and 'g', we have:

N_f / N_g ≈ P_g / P_f    (4)
From relations (2) and (4), one can observe that, when the controllability to 'i' of a signal line is close to '0', a slow transition fault from 'i' to '¬i' on that line may require a much larger number of random test patterns than the corresponding stuck-at fault, if the same test confidence is the objective.
As an example, consider the circuit sketched in Figure 1. The inputs of this circuit are driven by four scan flip-flops. A scan enable signal (SE) is used to switch the scan flip-flops between scan and functional modes. The possible patterns to test the stuck-at '0' and stuck-at '1' faults on the net J are ABCD ∈ {1XXX, X11X} and ABCD ∈ {00XX, 0X0X}, respectively. Based on functional justification, the initialization pattern ABCD = 0011 generates the activation pattern 0111, so that a slow-to-rise fault on the net J can be tested. The slow-to-fall transition fault on the net J cannot be tested.
Note that there is only one initialization pattern for the slow-to-rise fault on the net J, and it has four specified bits, while each of the two corresponding stuck-at faults has two test patterns with only one or two specified bits. The probabilities to randomly detect the slow-to-rise, the stuck-at '0' and the stuck-at '1' faults on the net J are 1/16, 5/8 and 3/8, respectively. The reduced testability of the transition fault is due to the fact that the initialization pattern must not only set the required initial logic value on the target line, but must also generate appropriate logic values at the CUT outputs in order to define a useful activation pattern. The lower level and the slower saturation of the transition fault coverage are due to the larger number of random-pattern-resistant transition faults and to the larger number of pseudo-random patterns required by these faults to achieve the same test confidence as in the case of the random-pattern-resistant stuck-at faults.
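These probabilities can be verified by exhaustive enumeration of the 16 input combinations. The following Python sketch transcribes the detection conditions from the pattern sets given above and also evaluates relation (3) for a test confidence of 99.9%; the code is illustrative only.

```python
from itertools import product
from math import ceil, log

def detects_sa0(a, b, c, d):   # ABCD in {1XXX, X11X}
    return a == 1 or (b == 1 and c == 1)

def detects_sa1(a, b, c, d):   # ABCD in {00XX, 0X0X}
    return a == 0 and (b == 0 or c == 0)

def detects_str(a, b, c, d):   # only ABCD = 0011 initializes the slow-to-rise test
    return (a, b, c, d) == (0, 0, 1, 1)

faults = [("stuck-at '0'", detects_sa0),
          ("stuck-at '1'", detects_sa1),
          ("slow-to-rise", detects_str)]
for name, f in faults:
    p = sum(f(*v) for v in product([0, 1], repeat=4)) / 16
    n = ceil(log(1 - 0.999) / log(1 - p))   # relation (3) with C = 0.999
    print(f"{name}: P = {p}, patterns for C = 99.9%: {n}")
```

The sketch reproduces the probabilities 5/8, 3/8 and 1/16 and shows that, for the same test confidence, the slow-to-rise fault requires 108 random patterns, compared to 8 and 15 for the two stuck-at faults.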
This slow saturation is expected for any type of delay fault testing, and it may be even more pronounced in the case of robust delay fault testing. It increases the need for long test sequences when DLBIST is used. Moreover, a DLBIST scheme is required that is able to use the whole test sequence for pseudo-random fault detection.
3. Bit-flipping DLBIST for transition faults
The bit-flipping DLBIST scheme provides both pseudo-random and deterministic test patterns. Most of the pseudo-random test patterns do not contribute to the fault coverage, since they only detect faults already detected by previous pseudo-random patterns. Such useless pseudo-random test patterns may be modified into useful deterministic test patterns to improve the fault coverage. The deterministic test patterns are determined by an ATPG tool, and they target those testable faults not detected by the pseudo-random test sequence. In such a deterministic test pattern (cube), only a few bits are actually specified, while most of the bits are don't cares and hence can arbitrarily be set to '0' or '1'.
The modification of the pseudo-random patterns is realized by inverting (flipping) some of the LFSR outputs, such that they become compatible with the deterministic test cubes. This flipping is performed by combinational logic, which implements a bit-flipping function (BFF). The BFF can be kept quite small by exploiting the large number of useless pseudo-random patterns that may be modified, and carefully selecting the pseudo-random patterns which are altered to deterministic patterns.
As shown in Figure 3, the BFF inputs are connected to the pattern counter (PC), the shift counter (SC) and the LFSR, while the BFF outputs are connected to the XOR gates at the scan inputs. The BFF determines whether a bit has to be flipped based on the states of the PC, the SC and the LFSR. The PC is part of the test control unit and counts the number of test patterns applied during the self-test. The SC is also part of the test control unit and counts the number of clock cycles required for shifting test data into (and out of) the scan chains. If a phase shifter (PS) is attached to the LFSR, its output may be used to control the operation of the BFF as well.
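As an illustration, the following behavioral sketch models one scan chain of this datapath; the LFSR width, the feedback taps and the trivial BFF below are assumed values, not the implementation used in the paper.

```python
# Behavioral sketch of one scan chain of the datapath in Figure 3. The
# LFSR width, the feedback taps and the trivial BFF below are assumed
# values for illustration only.

WIDTH = 8

def lfsr_step(state, taps=(0, 2)):
    fb = 0
    for t in taps:                        # XOR of the tapped bits
        fb ^= (state >> t) & 1
    return ((state >> 1) | (fb << (WIDTH - 1))) & ((1 << WIDTH) - 1)

def scan_in_bit(pc, sc, lfsr_state, bff):
    prn_bit = lfsr_state & 1              # pseudo-random bit
    return prn_bit ^ bff(pc, sc, lfsr_state)   # XOR gate at the scan input

# A hypothetical BFF that flips only bit 3 of pattern 5:
bff = lambda pc, sc, s: 1 if (pc, sc) == (5, 3) else 0

state = 0xB5
for pc in range(8):                       # pattern counter
    bits = []
    for sc in range(6):                   # shift counter, chain length 6
        bits.append(scan_in_bit(pc, sc, state, bff))
        state = lfsr_step(state)
    print(pc, bits)
```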
The BFF realizes the mapping of deterministic test cubes to pseudo-random test patterns in the following way.
Every specified bit (i.e. care bit) in a deterministic cube either matches the corresponding bit in the pseudo-random pattern, in which case no bit-flipping is required, or it does not match, in which case bit-flipping is required. For all unspecified bits (i.e. don't-care bits) in a deterministic cube, the corresponding bits in the associated pseudo-random pattern may be arbitrarily flipped or not.
The BFF must ensure that (1) all conflicting bits are flipped, (2) all matching bits are not flipped, while (3) the don't-care bits may be arbitrarily flipped or not. A BDD-based method for the generation (mapping) and the logic implementation of the BFF that scales well with the size of the core under test (CUT) is described in [6].
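The per-bit mapping rules can be summarized by the following sketch; the cube and pattern values are made up, and the real BFF is synthesized logic over the PC, SC and LFSR state rather than a software table.

```python
# Sketch of the per-bit mapping rules (1)-(3). Cubes are strings over
# {'0','1','-'}. The real BFF exploits the don't-care freedom during
# logic optimization rather than fixing it to 'no flip' as done here.

def flip_vector(cube, prn):
    flips = []
    for c, r in zip(cube, prn):
        if c == '-':
            flips.append(0)   # rule (3): don't-care bit, free choice
        elif int(c) == r:
            flips.append(0)   # rule (2): matching care bit, must not flip
        else:
            flips.append(1)   # rule (1): conflicting care bit, must flip
    return flips

cube = "0-11-0"
prn  = [1, 1, 1, 0, 0, 0]
f = flip_vector(cube, prn)
print(f)                                  # [1, 0, 0, 1, 0, 0]
print([b ^ x for b, x in zip(prn, f)])    # [0, 1, 1, 1, 0, 0] matches the cube
```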
Below, each pattern of a test sequence that detects faults not detected by any of its preceding patterns will be referred to as an essential pattern. The embedding of deterministic test cubes into a pseudo-random test sequence may corrupt essential pseudo-random patterns even if they are not assigned to deterministic cubes. This is due to the fact that the logic synthesis and optimization of the BFF make intensive use of its don't-care (DC) set (the set of input assignments for which it does not matter whether they are mapped to '1' or '0') [6]. Consequently, the resulting bit-flipping logic flips more bits than necessary for the embedding of the target deterministic test cubes, and essential pseudo-random test patterns may be corrupted.
Due to the slower saturation of the random transition fault coverage, the application of longer test sequences and the protection of all the essential pseudo-random patterns become critical. A way to prevent the corruption of the essential patterns without increasing the complexity of the BFF implementation is to utilize an additional combinational module, here referred to as correction logic (CRL), to enable/disable the outputs of the bit-flipping logic (Figure 3).
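A minimal model of this gating is sketched below; the set of protected pattern indices is a hypothetical stand-in for the synthesized CRL, and the AND gating of the BFF output is an assumption consistent with the enable/disable role described above.

```python
# Minimal model of the correction logic (CRL) gating: the CRL output
# enables/disables the bit-flipping, so essential pseudo-random patterns
# are never corrupted. The set 'essential' is a hypothetical stand-in
# for the synthesized CRL, which is real logic, not a lookup set.

essential = {3, 7}                       # hypothetical protected indices

def crl(pc):
    return 0 if pc in essential else 1   # '0' disables all flipping

def effective_flip(pc, sc, lfsr_state, bff):
    return bff(pc, sc, lfsr_state) & crl(pc)

bff = lambda pc, sc, s: 1                # a BFF that always wants to flip
print([effective_flip(pc, 0, 0, bff) for pc in range(8)])
# -> [1, 1, 1, 0, 1, 1, 1, 0]: flipping suppressed for patterns 3 and 7
```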
In the case of transition fault testing based on functional justification, the test control unit (Figure 3) has to be able to provide pairs of capture clock cycles. In such a scheme, only the initialization pattern of each pair of test patterns is generated by the DLBIST hardware. The activation pattern is generated by the CUT as a response to the first pattern, and only single test cubes have to be embedded into the pseudo-random sequence (just like in the case of stuck-at fault testing). This is also true in the case of transition fault testing based on scan shifting, with the observation that in this case the test control unit should generate one additional shift clock cycle instead of the first capture clock cycle. For this reason, as long as ATPG support is provided, the methodology presented here can also be applied to the scan shifting approach. Moreover, this methodology is orthogonal to any method used to extend transition fault testing to circuits with multi-cycle path logic, such as modifying the waveform of the shift and capture clock signals or enhancing the scan design [15].
The steps used to implement the proposed scheme are detailed below.
1. An initial fault simulation is used to identify the transition faults that cannot be tested by the pseudo-random test sequence produced by an LFSR and, optionally, a PS. During this step, a list L1 is generated containing the indices of all the essential pseudo-random test patterns.
2. An ATPG tool is used to generate a limited number of deterministic initialization test cubes for all or a subset of the transition faults that remain undetected by the pseudo-random test sequence.
3. A pseudo-random test pattern is assigned to each deterministic initialization test cube. For each such cube, only a limited number of pseudo-random patterns are checked, starting with the ones at the minimum Hamming distance. The essential pseudo-random test patterns are not allowed to be assigned. Each assigned pseudo-random test pattern is modified by bit-flipping to become compatible with the corresponding deterministic test cube (a runnable sketch of this assignment step is given after the list).
During this step, a BDD-based representation is generated for the resulting BFF [6] . The BFF is only partly specified and has a large DC-set.
4. The BDD-based representation of the BFF is optimized and transformed into a circuit description (e.g. RTL VHDL or Verilog) [6][7]. The logic optimization procedure exploits the DC-set left by the incomplete specification of the BFF. Consequently, the optimized bit-flipping logic flips additional bits besides the conflicting bits in the assigned pseudo-random patterns, so that essential pseudo-random patterns (with indices included in L1) may be corrupted.
5. The CRL is generated in order to protect the essential pseudo-random patterns from corruption. The output of the CRL has to be kept constant during the scan clock cycles corresponding to each test pattern.
Due to the fact that some of the input signals of the CRL (the state bits of the LFSR and the output bits of the PS, if any) change during the scan clock cycles, a latch is used to store the output of the CRL. The operation of the latch is controlled with the help of the scan enable signal (SE), so that its state can be changed only before a new test pattern is scanned in.
Without the CRL, all the BFF inputs corresponding to the essential pseudo-random patterns would have to be included in the OFF-set of the BFF. Consequently, the use of the CRL leaves more DC-space to optimize the logic implementation of the BFF. On the other hand, storing the output of the CRL in a memory element that cannot be written during the shift cycles significantly increases the degrees of freedom for the optimization of the CRL.
While a BFF has to be implemented for each scan chain, the CRL is common to all the scan chains.
6. Finally, a fault simulation of the embedded test sequence generated by the LFSR, the PS (if any), the BFF and the CRL determines the final transition fault coverage.
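As announced in step 3, the following runnable sketch illustrates the assignment of deterministic cubes to pseudo-random patterns at minimum Hamming distance over the care bits, skipping the essential patterns recorded in L1; all data below is made up for illustration.

```python
# Runnable sketch of the cube-assignment step. Cubes are strings over
# {'0','1','-'}; patterns are bit tuples. 'checked' bounds the number of
# candidate patterns examined per cube, as described in the text.

def care_distance(cube, pattern):
    return sum(1 for c, b in zip(cube, pattern) if c != '-' and int(c) != b)

def assign_cubes(cubes, prn_seq, essential, checked=64):
    assignment, used = {}, set(essential)
    for cube in cubes:
        # only a limited number of candidate patterns is checked, best first
        candidates = sorted((i for i in range(len(prn_seq)) if i not in used),
                            key=lambda i: care_distance(cube, prn_seq[i]))[:checked]
        assignment[cube] = candidates[0]
        used.add(candidates[0])
    return assignment

prn_seq = [(0, 1, 1, 0), (1, 1, 0, 0), (0, 0, 1, 1), (1, 0, 1, 0)]
print(assign_cubes(["0-11", "1-0-"], prn_seq, essential={2}))
# -> {'0-11': 0, '1-0-': 1}; pattern 2 is essential and never assigned
```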
In order to limit the size of the correction logic, the set of deterministic cubes is embedded (mapped) only at the end of the pseudo-random test sequence. The length of the pseudo-random sub-sequence that can be modified is a power-of-two fraction of the total test length. The first part of the test sequence is used only for pseudo-random fault detection, and it is protected from flipping by disabling the outputs of the BFF with the help of one AND gate per scan chain, controlled by a combination of the most significant bits of the PC.
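For illustration, a sketch of this gating follows; the 2^16-pattern test length and the all-ones combination of the three most significant PC bits are assumed values.

```python
# Sketch of the AND gating that confines bit-flipping to the tail of the
# test sequence. With a 2**N-pattern test and the M most significant
# pattern counter (PC) bits required to be '1', only the last 1/2**M of
# the sequence may be modified (N = 16 and M = 3 are assumptions).

N, M = 16, 3

def flipping_enabled(pc):
    return (pc >> (N - M)) == (1 << M) - 1   # all M MSBs equal to '1'

total = 1 << N
enabled = sum(flipping_enabled(pc) for pc in range(total))
print(enabled / total)                       # 0.125 = 1/2**3
```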
4. Experimental results
The experimental results reported below have been obtained using GNU/Linux machines equipped with 2 GB of memory and an Intel Pentium 4 processor running at 2.4 GHz. The industrial designs considered here are described in Table 1; the reported fault efficiency corresponds to a test sequence whose application would require one second at a frequency of 100 MHz. It can easily be observed that the transition fault efficiency is much lower than the stuck-at fault efficiency. In general, increasing the test sequence length has a stronger impact on transition fault detection than on stuck-at fault detection.
In Table 2, one can compare the results obtained using the bit-flipping DLBIST approach for the stuck-at and transition fault testing of the designs in Table 1. Table 2 reports the number of embedded deterministic test cubes, the ratio (percent) of specified bits in the set of embedded cubes, the achieved final fault efficiency and the cell area overhead of the BFF and CRL implementations for both fault models. The overhead of the other parts of the DLBIST hardware is relatively small and may be neglected.
In order to limit the hardware overhead in the case of the three largest designs, the number of deterministic test cubes embedded for transition fault testing has been limited to 800.
With the exception of the design p19K, the deterministic test cubes embedded for transition fault testing have larger ratios of specified bits. This is due to the lower transition fault testability. In the case of the design p19K, this lower testability has the effect that the ATPG tool delivers fewer deterministic test cubes, with fewer detected faults and also fewer specified bits per cube, than in the case of stuck-at fault testing. In all cases, the final stuck-at fault efficiency is much higher than the final transition fault efficiency. Moreover, this has been achieved along with a lower cell area overhead, with the exception of the design p19K. The reason for this difference is again the lower random testability of the transition faults, with the consequence that more patterns have to be embedded and more bits have to be flipped or preserved in the pseudo-random sequence. This seems not to be the case for the p19K design but, as mentioned before, there the ATPG simply provided fewer deterministic test cubes to be embedded for transition fault testing. For a given test length, the DLBIST hardware overhead depends on the random testability of the CUT and on the amount by which the fault efficiency has to be increased.
In Table 3, one can observe the impact of increasing the test length on the final fault efficiency and the cell area overhead (BFF+CRL) of the considered DLBIST scheme used for transition fault testing. The run-time and memory requirements are reported as well.

Table 2: DLBIST applied to stuck-at and transition fault testing, in the case of a 10K-pattern test sequence.
As expected, the hardware overhead of the first two designs, for which the number of embedded patterns has not been limited, is significantly reduced by the increase of the test length. Extending the test length from 10K to 64K reduces the overhead by more than 10% of the CUT size. In the case of the last entry, corresponding to the design p19K, increasing the test length by two orders of magnitude has reduced the overhead to half of the level of the previous entry, at the price of a large increase in the run-time and memory requirements.
As in the previous experiments, the number of embedded deterministic test cubes for the three largest designs has been limited to 800 in order to limit the hardware overhead. As long as the same number of deterministic test cubes is embedded, it is difficult to predict the dependence of the hardware overhead on the length of the test sequence. In this case, the overhead primarily depends on the average number of specified bits per embedded test cube, which is determined by the number and the difficulty of the target faults. Longer pseudo-random test sequences leave undetected faults which are more difficult to test. This tends to increase the number of specified bits necessary to detect the remaining faults. On the other hand, it may also decrease the number of newly detected faults per embedded test cube. That is why it is difficult to predict the evolution of the average number of specified bits per embedded test cube when the length of the test sequence is augmented. Increasing the length of the test sequence also improves the pattern embedding opportunities. In the case of the design p127K, increasing the length of the test sequence does not significantly change the overhead, but it improves the final fault efficiency by more than 11%. In the case of the designs p278K and p333K, increasing the test sequence length has a twofold beneficial impact. Choosing a test sequence length of 318K and 140K, instead of 10K, reduces the overhead by 11% and 7%, respectively. At the same time, the final fault efficiency is improved by more than 8% and 3%, respectively. The increase of the test length improves the coverage of the non-modeled defects as well [4]. In the case of the three largest designs, increasing the length of the test sequence has no significant impact on the run-time and memory requirements.

Table 4 reports possible trade-offs between the fault efficiency and the hardware overhead in the case of the three largest designs. The considered test sequences contain the maximum number of test patterns which can fit in one second of test time at a frequency of 100 MHz. In the case of the designs p127K and p278K, 10 deterministic patterns are already enough to obtain a higher fault efficiency than in the case when 800 deterministic test patterns are embedded into a 10K-long test sequence. In this way, the hardware overhead can be reduced to 1% from 43% and 62% (Table 3), respectively. In the case of the design p333K, a similar fault efficiency can be achieved by embedding 100 deterministic test patterns, at the cost of 5.5%, instead of 22%, hardware overhead.
5. Conclusions
We have presented an extension of the bit-flipping DLBIST scheme for transition fault testing. To the best of our knowledge, this is the first time that a DLBIST scheme has been used to test transition faults in circuits with standard scan design. A special combinational module, the correction logic (CRL), has been introduced to further improve the pattern embedding. Due to the rather low random-pattern testability of transition faults, the saturation of their random fault coverage requires significantly longer test sequences, which in turn is beneficial for both limiting the hardware overhead and improving the coverage of the modeled and non-modeled defects. The DLBIST scheme considered here can be applied to both approaches to transition fault testing in circuits with standard scan design: functional justification and scan shifting.
Moreover, the presented methodology is orthogonal to any method that can be used to extend transition fault testing to circuits with multi-cycle path logic.
Acknowledgments
This research work was supported by the German 