Utilization of On-Line (Concurrent) Checkers during Built-In Self-Test and Vice Versa
Sandeep K. Gupta and Dhiraj K. Pradhan, Fellow, IEEE
Abstract-Concurrent checkers are commonly used in computer systems to detect computational errors on-line, which enhances reliability. Using the coding theory framework developed earlier by the authors, it is shown in the following that concurrent checkers, already available within the circuit, can be utilized very effectively during off-line testing. Specifically, test time as well as fault escape probability can both be reduced simultaneously. The proposed combined scheme can be implemented with simple modification of existing hardware. Also shown is a novel use of BIST hardware for concurrent checking. Specifically proposed is a novel, dual use of concurrent checkers and built-in self-test hardware, yielding mutual advantage.
Index Terms-Built-in self-test, BIST, concurrent checking, fault-escape probability, parity prediction.

INTRODUCTION
The traditional focus of VLSI design has been performance and area. Increasingly integral to the design methodology are newer concerns aimed at reducing both production and life-cycle costs through enhanced testability and field diagnosability. System availability is also becoming a key feature of both fault-tolerant and non-fault-tolerant systems. Crucial to achieving high availability is the ability to rapidly perform field diagnosis, which requires the ability to perform both concurrent checking and periodic off-line checking.
However, in the context of VLSI design, concurrent checking, on-line testing, and periodic testing have been treated in isolation. Concurrent checking has been the primary focus of system designers concerned with system availability and serviceability. On the other hand, techniques such as built-in self-test (BIST) have been the focus of VLSI test engineers concerned with product quality.
Error detecting and correcting codes, which form the basic framework for the design of concurrent checking methodology, have recently been shown by the authors to provide a comprehensive mathematical framework for analysis and synthesis of a large variety of BIST methodologies [6], [17]. This important new linkage provides a unified framework for the development of an integrated methodology for the design of concurrent on-line checking circuitry and off-line BIST.
Although many present-day VLSI design methodologies have integrated design-for-testability (DFT) techniques, the design and placement of concurrent checkers as well as BIST support logic is still carried out in an ad hoc manner. In fact, many BIST designers treat the outputs of the concurrent checker in a manner similar to the other outputs of the CUT. In such BIST schemes, the outputs of concurrent checkers are compressed along with the response at the other circuit outputs. This leads to a significant loss of information about failures that are detected by the concurrent checkers. Designing concurrent checkers and BIST hardware individually, without consideration of any potential interaction, does not lead to efficient sharing of the available silicon area. The other important drawback of treating these two aspects independently is that the overlapping information is not used to realize potential performance improvements, as illustrated by the above scenario.
Demonstrated clearly in this paper is that the BIST design can greatly benefit from the use of available concurrent checkers and vice-versa. Our main goal is to develop theory and design techniques to unify these design methodologies, taking advantage of potential interactions.
Previous approaches include the self-verification scheme [20] and the UBIST scheme presented in [15]. This paper provides an integrated, coding-theory-based framework for studying the dual use of concurrent checkers for operational fault detection and BIST. It has grown out of the work in [17], where a technique was presented that eliminates all aliasing in signature analysis. That technique relied on designing a signature analyzer tailored to the given good circuit response. By utilizing the quotient bits normally discarded during signature analysis, it demonstrated that aliasing-free signature testing can be achieved. Because the length of the signature in that scheme can be impractically large, what is presented here can be viewed as a practical technique to achieve near-zero aliasing.
Specifically proposed is a self-test scheme combining parity (code) checking and MISR compression. In the proposed scheme, the circuit augmented with parity (code) predictors is not required to be fault-secure (as in [20]). It is shown in the following that if parity (code) predictors are already available in the circuit, then they can be easily utilized to improve the quality of the off-line self-test. On the other hand, if the circuit does not have concurrent checking, then parity (code) predictors can be added to the circuit to implement the proposed BIST methodology. In such cases most of the extra circuitry added for self-test is the parity prediction circuitry. Hence, the proposed BIST methodology uses circuitry that can be utilized even during normal circuit operation to perform concurrent checking. Because this circuitry can also be used to enhance system reliability, this dual use is the main advantage of the proposed scheme.
Fault-escape probability in the proposed scheme is first studied for various error models. The general framework presented in [12], [17] is used to compute fault-escape probability for the proposed scheme. Then the effectiveness of the proposed scheme on read-only memories (ROMs) is studied and compared with the effectiveness of the output data modification (ODM) scheme for ROMs [24]. Parity predictors are then synthesized for benchmark circuits, in order to study the area overhead that is due to parity predictors. The synthesized circuits are also used to obtain error models, which are in turn used to compute fault-escape probability for the proposed scheme. Also proposed here is a novel use of BIST hardware for concurrent fault detection.
PROPOSED SELF-TEST SCHEME
Consider an m-output circuit. Let $l$ be the number of tests applied to the CUT. The response of the CUT then contains $l$ m-bit symbols. In BIST, this response is shifted into a signature analyzer to obtain the signature, which is compared with the good circuit signature after all tests are applied. The following evaluates the effectiveness of testing in an environment where a built-in concurrent checking mechanism forms an integral part of the self-test.
Assume that the m-bit outputs of the CUT are encoded as codewords in some code C(m, k), where m is the block length and k is the number of information bits. For example, if the output of a module is coded using an even parity code, the m outputs carry m - 1 bits of data and one bit of parity. The output of the CUT is checked by a parity checker during normal operation. What is proposed here is monitoring the output of the concurrent checker during self-test as well, as shown in Fig. 1. In many practical implementations of BIST, the outputs of existing concurrent checkers are treated the same as the other CUT outputs and compressed along with them. Hence, very useful information can be lost, leading to significantly higher fault-escape probability, as is demonstrated by the following analysis.
Now consider the proposed self-test strategy. The m-bit response to each test applied to the CUT is checked by the concurrent checker to determine whether the test response is a codeword in the concurrent checking code C(m, k). Such checking is usually done only during the normal operation of the circuit. However, in the proposed scheme this checking is carried out during self-test as well. Thus the m-bit output is shifted into both the signature analyzer and the concurrent checker simultaneously (see Fig. 1), for all $l$ test vectors. If any of the test responses is found erroneous by the concurrent checker for the C(m, k) code, then the CUT is determined to be faulty immediately, with no further testing necessary. On the other hand, if all $l$ m-bit output responses constitute proper codewords in the C(m, k) code, then no error will be indicated by the concurrent checker. In such cases, on completion of testing, the content of the signature analyzer is compared with the good circuit signature. In the proposed scheme, a fault is not detected if the concurrent checker fails to detect the fault during self-test and the fault also causes aliasing in the signature analyzer. The probability of this event is defined as the fault-escape probability.
The advantage of the proposed scheme is twofold. First, test time can be reduced because testing can be halted as soon as the concurrent checker detects the fault. More importantly, fault-escape probability is drastically reduced.
[...] (2, K), corresponding to the $2^m$-ary compressor. Multiple-input signature registers (MISRs) are typically used for data compression, and MISRs with primitive feedback polynomials are commonly employed. Such MISRs can be represented by $\Phi(x) = x + \alpha$ over $GF(2^m)$, where $\alpha$ is a primitive element in $GF(2^m)$. It has been shown in [17] that in such cases the set of error vectors that can cause aliasing in MISR compression alone is characterized by a linear code. This set, called the aliasing code AC, is a distance-2 $(l, l - 1)$ maximum distance separable (MDS) code over $GF(2^m)$ for m-output CUTs whose response is compressed by an MISR.
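The test flow described above (check every response with the concurrent checker, halt on the first code violation, and compare signatures only if no violation occurred) can be sketched as follows. This is a minimal illustration rather than the hardware of Fig. 1: it assumes an even parity code as C(m, k), models the MISR as a binary polynomial register in which bit i of an integer is the coefficient of $x^i$, and uses hypothetical function names and response values.

```python
def parity_ok(word: int) -> bool:
    """Concurrent checker for an even parity code: valid words have an even number of 1s."""
    return bin(word).count("1") % 2 == 0


def misr_step(state: int, word: int, poly: int, m: int) -> int:
    """One MISR clock: multiply the state by x modulo the feedback polynomial, then add the response."""
    state <<= 1
    if (state >> m) & 1:      # a degree-m term appeared, so reduce by the feedback polynomial
        state ^= poly
    return state ^ word


def signature(responses, poly, m):
    """Signature left in the MISR after compressing all responses (initial state zero)."""
    state = 0
    for word in responses:
        state = misr_step(state, word, poly, m)
    return state


def self_test(responses, good_signature, poly, m):
    """Return (verdict, number of tests applied) for the combined checker + MISR scheme."""
    state = 0
    for i, word in enumerate(responses, start=1):
        if not parity_ok(word):
            return ("faulty (concurrent checker)", i)   # testing halts immediately
        state = misr_step(state, word, poly, m)
    if state != good_signature:
        return ("faulty (signature mismatch)", len(responses))
    return ("pass", len(responses))


POLY = 0x11D   # x^8 + x^4 + x^3 + x^2 + 1
good = [0b00000011, 0b11000000, 0b10100101, 0b00000000]   # hypothetical even-parity responses
print(self_test(good, signature(good, POLY, 8), POLY, 8))
```

An odd-weight error in any response is caught by the checker at that test, while an even-weight error passes the checker and is left to the signature comparison; a fault escapes only if both mechanisms miss it.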
Let AC' be the set of the error vectors E(x) that escape both the concurrent checking and the signature analysis. Clearly the set AC' is a small subset of the aliasing code AC defined in [17] for MISR compression alone.
EXAMPLE 1. Consider a 2-output circuit. Let three tests be applied to the circuit. Let the response be compressed by a MISR with a binary feedback polynomial. Note that this corresponds to compression using a GLFSR over $GF(2^2)$ with the feedback polynomial $\Phi(x) = x + \alpha$, where $\alpha$ is the primitive element in $GF(2^2)$. The aliasing code which corresponds to this is the RS (3, 2) code over $GF(2^2)$, where $0 = (0, 0)$, $1 = (0, 1)$, $\alpha = (1, 0)$, and $\beta = (1, 1)$.
Let us now assume that the 2-bit output of the CUT constitutes 1 bit of data and 1 bit of parity. If the parity is even, then the circuit is a duplex circuit with two identical outputs. Assume that the CUT is being tested by the test vectors given above. Let the good circuit output be $R(x) = \beta x^2 + \beta$. Self-test in the proposed scheme is performed as follows. The response to each test vector is checked using the parity checker (available in the next module that receives these outputs as inputs) while simultaneously being shifted into the MISR. Should the output response to any test not have even parity, the CUT is immediately determined to be faulty by the concurrent checker. Only when all the responses have proper parity is the signature compared with the good circuit signature. The set AC' represents the error vectors that escape both concurrent checking and signature analysis under the proposed compression scheme.
Using a concurrent checker significantly reduces the number of nonzero error patterns that can cause fault-escape (from 15 to just 1), since error patterns in AC that are detected by the concurrent checker are not included in the set AC'. Since this subset AC' of AC is much smaller than AC, the fault-escape probability can be drastically reduced.
The framework proposed here is general and accommodates any error model, provided the probability of undetected error can be computed for the corresponding fault-escape set AC'. Any error model that assumes that the errors are independent over time [17] can be used. For any given error model, the complete weight enumerators of shortened/extended $2^m$-ary codes can be used to compute the probability of undetected error of AC'.
Concurrent checkers are commonly used in many systems. They detect operational errors and enhance reliability. Monitoring the checker during self-test can be accomplished with minimal overhead. The potential exists for significant reduction in fault-escape. As shown later, the proposed scheme also allows the design of the BIST hardware in such a way that it can also be utilized effectively during normal circuit operation to perform concurrent checking. Such a feature provides the potential for use in the design of fault-tolerant circuits.
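The counts quoted in Example 1 can be checked by direct enumeration. The following is a minimal sketch, assuming that aliasing under the feedback polynomial $x + \alpha$ corresponds to error polynomials E(x) of degree at most 2 with $E(\alpha) = 0$, and encoding the field elements 0, 1, $\alpha$, $\beta$ as the integers 0, 1, 2, 3 (so that addition is bitwise XOR and $\alpha^2 = \alpha + 1 = \beta$). The helper names are illustrative only.

```python
from itertools import product

# GF(4) as 2-bit integers: 0 = (0,0), 1 = (0,1), alpha = (1,0) = 2, beta = (1,1) = 3
EXP = [1, 2, 3]                 # alpha^0, alpha^1, alpha^2
LOG = {1: 0, 2: 1, 3: 2}
ALPHA, BETA = 2, 3


def gf4_mul(a: int, b: int) -> int:
    """Multiply two GF(4) elements using discrete logarithms to the base alpha."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 3]


def evaluate(coeffs, x: int) -> int:
    """Evaluate a polynomial (coefficients from highest degree down) at x by Horner's rule."""
    acc = 0
    for c in coeffs:
        acc = gf4_mul(acc, x) ^ c
    return acc


# Aliasing code AC: all error polynomials e2*x^2 + e1*x + e0 with E(alpha) = 0
AC = [e for e in product(range(4), repeat=3) if evaluate(e, ALPHA) == 0]

# AC': members of AC whose every symbol has even parity, i.e. is 0 or beta
AC_prime = [e for e in AC if all(s in (0, BETA) for s in e)]

print(len(AC) - 1, len(AC_prime) - 1)   # nonzero patterns in AC and in AC'
```

Under these assumptions the enumeration gives 15 nonzero patterns in AC and a single nonzero pattern, $\beta x^2 + \beta x + \beta$, in AC'.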
FAULT-ESCAPE PROBABILITY
A fault is not detected if the concurrent checker fails to detect it during self-test and the signature also fails to detect it. The probability of this event is defined as the fault-escape probability. The following studies the fault-escape probability for the proposed scheme under various error models [12], [17].
Symmetric Error Model
Given an m-output CUT, the symmetric model assumes that the probability that a test vector $t_i$ will cause an error at one or more circuit outputs is p. The probability that, in the presence of the fault, no error occurs at any CUT output is hence $(1 - p)$. It is further assumed that when an error is caused at the CUT output(s), the m-bit response $r_i$ of the faulty circuit can assume any of its $2^m - 1$ erroneous values with equal probability $p/(2^m - 1)$. The following derives the fault-escape probability under this error model.
Let the m outputs of the CUT be available as the codewords in the linear code C(m, k). Let $AC(l, l - 1)$ represent the MDS code which constitutes the aliasing code for the MISR. We assume that an m-bit MISR with a primitive feedback polynomial is used as a compressor. It is shown in [12], [17] that such compressors can be represented by the feedback polynomial $\Phi(x) = x + \alpha$ over $GF(2^m)$. The aliasing code $AC(l, l - 1)$ is an MDS code for any $l$ [17].
Let $AC'(l, l - 1)$ be the subset of the codewords in $AC(l, l - 1)$ consisting only of those codewords each of whose $2^m$-ary symbols is a codeword in the concurrent checking code C(m, k). Let $A_{l,w}$ be the number of codewords of Hamming weight w in the MDS code $AC(l, l - 1)$, with minimum distance d = 2. Let $A'_{l,w}$ be the number of codewords of weight w in the code $AC'(l, l - 1)$. Let $A^*_{l,w}$ denote the quantity defined by (1).
The following shows that the $A^*_{l,w}$ given by (1) is equal to the desired $A'_{l,w}$ if the CUT outputs are compressed using an m-bit MISR with feedback polynomial $\Phi(x) = x + 1$ over $GF(2^m)$. Note that this MISR compression corresponds to independent parity compression at each CUT output.
LEMMA 1. Let $c_0 c_1 \ldots c_{l-1}$ be a codeword in the distance-2 (MDS) code $AC(l, l - 1)$ over $GF(2^m)$, such that $c_{l-1} = \sum_{i=0}^{l-2} c_i$. Then $A'_{l,w} = A^*_{l,w}$, given by (1).
PROOF.
The following relations are obtained using the arguments used in [4] to determine the weight distribution of MDS codes. The fact that each symbol in the codewords in AC' must be a codeword in C(m, k) is used to obtain a reasonable bound. Here the equality is due to the fact that in this code, as long as the information symbols are codewords in the binary code C(m, k), the check symbol is also a codeword in the code C(m, k).
The $A^*_{l,w}$ given above has been derived by using these relations. $A^*_{l,w}$ is given by $\binom{l}{w} M_{l,w}$, where $M_{l,w}$ is the value that satisfies the above relations. Hence, for the special case of independent parity compression at all circuit outputs, $A'_{l,w} = A^*_{l,w}$ given by (1).
In general, the $A^*_{l,w}$ given by (1) is an approximation for the weight distribution $A'_{l,w}$ of the aliasing code AC'.
The fault-escape probability can be expressed in terms of the weight distribution $A'_{l,w}$ and p, where p is the probability that any test fails under the $2^m$-ary symmetric error model. However, only an approximate value of $A'_{l,w}$ is available. The following theorem states that the approximation $A^*_{l,w}$ can be used in that expression to compute an upper bound on the fault-escape probability under the proposed compression scheme.
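For the special case covered by Lemma 1 (independent parity compression, where an error sequence aliases exactly when its symbols sum to zero), the weight distribution of AC' can be counted by brute force and compared with a closed form. The sketch below is illustrative only. It assumes an even parity code as C(m, k), and it assumes that the closed form is the standard weight enumerator of a distance-2 $(l, l - 1)$ MDS-type code over an alphabet of size $Q = 2^k$, namely $\binom{l}{w}\,[(Q-1)^w + (-1)^w (Q-1)]/Q$; equation (1) itself is not reproduced here, so this expression is an assumption, not a transcription. The fault-escape expression in the last function is the usual probability of undetected error under the symmetric model, summed over the nonzero words of AC'.

```python
from functools import reduce
from itertools import product
from math import comb


def even_parity_codewords(m: int):
    """All m-bit words with an even number of 1s (the C(m, m-1) even parity code)."""
    return [v for v in range(2 ** m) if bin(v).count("1") % 2 == 0]


def brute_force_weights(code, l: int):
    """Count length-l sequences of code symbols with zero XOR sum, by number of nonzero symbols."""
    counts = [0] * (l + 1)
    for seq in product(code, repeat=l):
        if reduce(lambda a, b: a ^ b, seq) == 0:
            counts[sum(1 for s in seq if s)] += 1
    return counts


def closed_form_weights(Q: int, l: int):
    """Assumed closed form: weights of a distance-2 (l, l-1) MDS-type code over Q symbols."""
    return [comb(l, w) * ((Q - 1) ** w + (-1) ** w * (Q - 1)) // Q for w in range(l + 1)]


def fault_escape(weights, p: float, m: int) -> float:
    """Sum over nonzero escape patterns of (p / (2^m - 1))^w * (1 - p)^(l - w)."""
    l = len(weights) - 1
    a = p / (2 ** m - 1)
    return sum(weights[w] * a ** w * (1 - p) ** (l - w) for w in range(1, l + 1))


m, l = 3, 4
code = even_parity_codewords(m)            # 4 codewords, so Q = 2^k = 4
print(brute_force_weights(code, l), closed_form_weights(len(code), l))
```

For this small case both counts agree, which is consistent with the statement that the approximation is exact under independent parity compression.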
Using the value of the fault-escape probability given by (4), and using (11) in the resulting relation (13), it can be shown that the required inequality holds for all $2 \le j \le l - 1$. This is satisfied under a condition on p that involves $l - w + 1$ and $2(2^m - 1)$.
Hence, under these conditions, the fault-escape probability computed from the approximate weight distribution is greater than or equal to the exact fault-escape probability.
EXAMPLE 2. Consider an 8-output circuit. Let the output of the circuit be coded using an even parity code. Let the output response be compressed using an 8-bit MISR with binary primitive feedback polynomial $\Phi(x) = x^8 + x^4 + x^3 + x^2 + 1$. The fault-escape probability can be computed for various test lengths using the $2^8$-ary symmetric error model and the above results. Fig. 2 shows the aliasing probability for MISR compression (labeled only MISR) and the fault-escape probability for parity checking as well as for the proposed scheme (for a fixed p), labeled MISR + CC. The parity checking scheme is labeled CC only. Note that the fault-escape probability is significantly lower for the proposed scheme. Three important observations may be made:
1) As the fault-escape set for the proposed scheme is a subset of the aliasing code for the MISR, it can be proven that the fault-escape probability of the combined scheme will always be lower than the aliasing probability for the MISR alone.
2) Although the aliasing probability for the MISR reaches a steady-state value as a function of test length, the fault-escape probability for the proposed scheme continues to decrease with increasing test length. This is due to the fact that as more tests are applied, the probability increases that the fault present in the CUT will cause an error at the output of the CUT that is detectable by the concurrent checker. Since faults detected by the concurrent checker for any test vector can never escape the proposed BIST technique, fault-escape probability decreases.
3) Note that the fault-escape probability for the proposed scheme decreases at a rate similar to that for concurrent checking alone. However, the fault-escape probability for the proposed scheme is a product of the concurrent checking fault-escape and the MISR compression aliasing. This means that the proposed scheme combines the advantages of MISR compression (aliasing of the order of $2^{-m}$ for a small hardware overhead) with the decreasing nature of the concurrent checking fault-escape.
This downward trend in the fault-escape probability with increasing $l$ is of fundamental importance because it implies asymptotically zero fault escape. This is because, as more and more tests are applied, the probability that the fault causes only errors that remain proper codewords reduces to zero asymptotically. Even for small test lengths, the combined scheme is of major significance. In practice this will allow one to reduce test length (time) drastically. Another practical implication is that concurrent checking schemes become more attractive to the VLSI designer. Due to this dual use, one may find circuit augmentation with a parity or residue checker a practical proposition. In the following we use the results derived in [12] to develop a general method for computing exact fault-escape probability for any error model and any subcode.
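The qualitative behavior described in these observations can be reproduced numerically. The sketch below is not the computation behind Fig. 2. It assumes the symmetric error model, an even parity code (k = m - 1), and the MDS-type weight distribution with alphabet size $2^m$ for the MISR alone and $2^k$ for the combined scheme, in which case the sums over w collapse to the closed forms used here. The value p = 0.5 and the function names are arbitrary choices for illustration.

```python
def misr_only(p: float, m: int, l: int) -> float:
    """Aliasing probability of MISR compression alone after l tests."""
    q = 2 ** m
    a, b = p / (q - 1), 1 - p          # per-value error probability, no-error probability
    return (1 + (q - 1) * (b - a) ** l) / q - b ** l


def checker_only(p: float, m: int, k: int, l: int) -> float:
    """Probability that every erroneous response is still a codeword (checker used alone)."""
    q, Q = 2 ** m, 2 ** k
    a, b = p / (q - 1), 1 - p
    return (b + (Q - 1) * a) ** l - b ** l


def combined(p: float, m: int, k: int, l: int) -> float:
    """Approximate fault-escape probability when the checker and the MISR are used together."""
    q, Q = 2 ** m, 2 ** k
    a, b = p / (q - 1), 1 - p
    return ((b + (Q - 1) * a) ** l + (Q - 1) * (b - a) ** l) / Q - b ** l


p, m, k = 0.5, 8, 7
for l in (10, 20, 40, 80):
    print(l, misr_only(p, m, l), checker_only(p, m, k, l), combined(p, m, k, l))
```

With these assumptions the MISR-only value settles near $2^{-8}$, while the combined value keeps shrinking as the test length grows and stays below both of the single-mechanism values.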
General Framework
The framework developed in [12] can be used to study fault-escape probability for a general error model. In the following we illustrate that this framework can be used to compute fault-escape probability for the proposed scheme, assuming that the CUT outputs are coded using an even parity code. (Note that the framework is general and can be used for any C(m, k) code. The even parity code is used in the following only for illustration.)
Consider an m-output CUT. Let the outputs be coded using an even parity code. Let $l$ tests be applied and the response be compressed using an m-bit MISR with primitive feedback polynomial. In the following we shall illustrate how the framework can be used to compute fault-escape probability for different error models, with and without concurrent checking.
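In the general error model, the probabilities of the error taking any of its $2^m$ possible values $\{w_0 = 0, w_1, \ldots, w_{2^m - 1}\}$ are given by $\{p_0, p_1, \ldots, p_{2^m - 1}\}$, respectively.
SYMMETRIC ERROR MODEL. Under this error model, for a randomly chosen vector, the error $e_i$ can take any of its $2^m - 1$ nonzero values with equal probability $p/(2^m - 1)$ and the value 0 with probability $1 - p$. Then the corresponding probabilities for the general error model are:
$$p_0 = 1 - p, \qquad p_j = \frac{p}{2^m - 1}, \quad 1 \le j \le 2^m - 1.$$
However, if MISR compression is used along with an even parity check, then the probability corresponding to the odd-weight error terms becomes zero. Hence, we have:
$$p_0 = 1 - p, \qquad p_j = \begin{cases} \dfrac{p}{2^m - 1} & \text{if the weight of } w_j \text{ is even}, \\ 0 & \text{if the weight of } w_j \text{ is odd}, \end{cases} \quad 1 \le j \le 2^m - 1.$$
INDEPENDENT ERROR MODEL. In this case, the probability of getting a particular error pattern of weight w is given as $p^w (1 - p)^{m - w}$. Let $w_j$ now denote the Hamming weight of the $j$-th error value, $0 \le j \le 2^m - 1$. Then, for MISR compression,
$$p_j = p^{w_j} (1 - p)^{m - w_j}.$$
Even in this case, when the even parity checker is used during BIST, the probability corresponding to all the odd-weight errors becomes zero and is given as:
$$p_j = \begin{cases} p^{w_j} (1 - p)^{m - w_j} & \text{if } w_j \text{ is even}, \\ 0 & \text{if } w_j \text{ is odd}. \end{cases}$$
Table 1 illustrates the probabilities for a 4-output circuit.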
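The parameter vectors of the general error model can be tabulated directly from these definitions. The following sketch does so for a 4-output circuit; the value p = 0.1 and the function names are illustrative assumptions, and the output is not a reconstruction of Table 1. Index j is interpreted as the m-bit error value, and entries of odd weight are set to zero when the even parity checker is active.

```python
def weight(v: int) -> int:
    """Hamming weight of the m-bit error value v."""
    return bin(v).count("1")


def mask_odd(probs):
    """Zero the entries whose error value has odd weight (caught by the parity checker)."""
    return [pj if weight(j) % 2 == 0 else 0.0 for j, pj in enumerate(probs)]


def symmetric_model(m: int, p: float, parity_checked: bool = False):
    """p_0 = 1 - p and p_j = p / (2^m - 1) for every nonzero error value."""
    n = 2 ** m
    probs = [1 - p] + [p / (n - 1)] * (n - 1)
    return mask_odd(probs) if parity_checked else probs


def independent_model(m: int, p: float, parity_checked: bool = False):
    """p_j = p^w (1 - p)^(m - w), where w is the weight of error value j."""
    probs = [p ** weight(j) * (1 - p) ** (m - weight(j)) for j in range(2 ** m)]
    return mask_odd(probs) if parity_checked else probs


for j, row in enumerate(zip(symmetric_model(4, 0.1), symmetric_model(4, 0.1, True),
                            independent_model(4, 0.1), independent_model(4, 0.1, True))):
    print(format(j, "04b"), *("%.6f" % x for x in row))
```

With the checker active, the masked vectors no longer sum to one; the missing mass is the probability that the checker flags the response during that test.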
Once these probabilities $(p_0, p_1, \ldots, p_{2^m - 1})$ are computed (as outlined in [12]), the framework developed in [12] can be used to compute the exact fault-escape probability for the scheme. The following example compares the aliasing for MISR compression with the fault-escape probability for the proposed scheme, under the independent error model. EXAMPLE 3. Consider an 8-output CUT. Let the errors be independent at the outputs of the CUT. If the response is compressed using an 8-bit MISR with primitive binary feedback polynomial $\Phi(x) = x^8 + x^4 + x^3 + x^2 + 1$, then the fault-escape probability for various test lengths is as shown in Fig. 3.
TABLE 1 PARAMETERS FOR VARIOUS ERROR MODELS
Note that, even for this error model, the fault-escape probability decreases with increasing test lengths, for the proposed scheme.
Hence, we see that the aliasing probability for MISR compression can be reduced drastically if concurrent checkers are used during BIST. In the following this technique is applied to ROMs. Note that in this case, the errors at the output are assumed to be independent. Finally, some experimental results are discussed for random logic.
Fig. 3. Fault-escape probability for 8-output CUT for independent error model.
In [24], a self-test scheme was presented for ROMs. This was a special case of the method presented in [23], where the output of the CUT is compressed using an MISR during self-test. An Output Data Modifier (ODM) is designed (see Fig. 4) which generates the value that is shifted out of the most-significant MISR bit after application of each test. In the case of a ROM, the ODM is easy to design since it just requires an additional column in the ROM. Thus, if the output of the most-significant MISR bit is compared with the output of the extra ROM column, a mismatch indicates an error. The area overhead for the scheme is moderate for ROMs with words of 16 bits or more, since the scheme requires only one extra column.
The scheme provides a mechanism for reducing fault-escape drastically. However, errors at different outputs (including the extra ODM column) can cancel each other, causing an error escape [1]. (In [24], the MISR is reconfigured and tests are applied again to reduce this probability.) The probability of error cancellation can be computed for any given ROM organization. A recursive procedure has been developed to compute this probability, assuming the errors to be independent.
The implementation of the proposed scheme for ROMs is shown in Fig. 5. In this case the extra column is programmed such that any ROM word (with the extra bit) has even parity. The output of the ROM is now compressed using an MISR and also checked using a parity checker. Note that the area overhead of the proposed scheme is the extra ROM column and the concurrent checker. The size of the concurrent checker varies only linearly with the size of the ROM word and is independent of the number of words. Hence, for ROMs with a large number of words, the area overhead of the proposed scheme would be close to that for ODM. In the following it shall be shown that the combination of concurrent checking and MISR compression can provide a self-test scheme with a comparable overhead and comparable fault-escape probability.
Note that, in the following analysis, errors are assumed to be independent, which is not entirely accurate for the regular and dense ROM structure. Hence, the following example should only be used to illustrate that the fault-escape probabilities for the two schemes are very small and close to each other. The fault-escape probability for the proposed scheme can be computed using the formulation presented in the previous section. The following example illustrates the fault-escape probability for both schemes.
EXAMPLE 4. Consider a ROM with 32 words. Let each word be 8 bits wide. Hence, the ROM has 32 rows and eight columns. An extra column is added for both the schemes as outlined above.
In the ODM case, an 8 bit MISR is first chosen. Also, a test pattern generator is chosen to apply a specific input sequence to the ROM. The ROM and the MISR are then simulated and the output of the most significant MISR bit is monitored. The extra ROM column is programmed in such a manner that for the given input sequence, the output of the extra column is the same as the MISR output.
On the other hand, in the proposed scheme, the extra column is programmed such that parity of all the words is even. A 9 bit MISR is then selected to compress the output response. The fault-escape probabilities for the two schemes are shown in Fig. 6 .
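Programming the extra column for the proposed scheme amounts to storing one parity bit per word. The following sketch illustrates this for a 32-word, 8-bit ROM; the ROM contents are made-up values used only for illustration, and the function names are assumptions.

```python
def ones(v: int) -> int:
    """Number of 1 bits in v."""
    return bin(v).count("1")


def program_parity_column(rom_words):
    """Extra bit for each word, chosen so that the 9-bit stored word has even parity."""
    return [ones(w) % 2 for w in rom_words]


def parity_check(word: int, parity_bit: int) -> bool:
    """Concurrent checker: True when the word read out (with its extra bit) has even parity."""
    return (ones(word) + parity_bit) % 2 == 0


rom = [(37 * i + 11) % 256 for i in range(32)]    # hypothetical ROM contents, 32 words x 8 bits
column = program_parity_column(rom)

print(all(parity_check(w, b) for w, b in zip(rom, column)))   # fault-free readout
print(parity_check(rom[5] ^ 0b100, column[5]))                # single-bit error in word 5
print(parity_check(rom[5] ^ 0b110, column[5]))                # double-bit error in word 5
```

A single-bit error is flagged by the checker, whereas a double-bit error in one word passes it, which is why the MISR signature is still compared at the end of the test.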
Though the fault-escape probability for the proposed scheme is lower, the main advantage of the proposed scheme is qualitative. The main issue is the utilization of the redundancy designed for the self-test scheme. The extra hardware incorporated in the ODM scheme is useful only for self-test. However, in the proposed scheme the extra hardware incorporated for self-test is useful even during normal circuit operation, when it can be used to check the parity of ROM output. This would reduce the latency of errors and help in error containment and diagnosis. 
EXPERIMENTAL RESULTS
The framework developed above can be used to analyze fault-escape probability in the proposed scheme if the circuit has built-in code predictors. However, if the CUT has no inherent redundancy, then one needs to synthesize parity predictors to augment the circuit, as shown in Fig. 7 . These circuits, and suitable parity predictors, were synthesized to study the hardware overhead required for parity prediction. Also, the synthesized circuits were used to obtain general error model parameters for circuits with parity predictors. The general error model was then used to estimate fault-escape probability for the proposed scheme. 
Synthesis
Synthesis benchmark circuits from the espresso library (a part of the UCB CAD tools) were used for this study. Multilevel realizations of the circuits were obtained from the two-level AND/OR description by using misII, also a part of the UCB CAD tool set. The two-level description of each circuit was used to obtain a two-level description of the parity predictor required for the circuit. The parity predictor and the CUT were then synthesized as multilevel logic in two different ways.
SEPARATE CUT AND PP REALIZATION. In this case, the CUT and the parity predictor were synthesized in multilevel form as two separate circuits. Then the two were used as shown in Fig. 7 to obtain the composite circuit with parity predictor. Note that in this circuit, the CUT and parity predictor did not share any logic. The advantage of this is that any single fault in the parity predictor logic is detected by the output parity checker, since it can affect only one output.
AUGMENTED CIRCUIT. Alternatively, the two-level descriptions of the CUT and PP were combined and the augmented circuit was synthesized. The resulting multilevel realization was that of the CUT and parity predictor together. The main difference was that in this case the two (CUT and PP), shown separately in Fig. 7, actually shared logic. The circuit so synthesized was found to (typically) have less area than the separate and independent circuit described above.
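Deriving the two-level description of a parity predictor from that of the CUT can be illustrated on a toy circuit. The sketch below is not the espresso/misII flow used in the study: the CUT is a hypothetical full adder, and the predictor is obtained simply by listing the input minterms for which the XOR of the CUT outputs is 1, so that the outputs together with the predicted bit always have even parity.

```python
from itertools import product


def cut(a: int, b: int, c: int):
    """Hypothetical 3-input, 2-output CUT (a full adder): returns (sum, carry)."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def parity_predictor_on_set(circuit, n_inputs: int):
    """Input minterms for which the predicted parity bit is 1 (XOR of all CUT outputs)."""
    on_set = []
    for bits in product((0, 1), repeat=n_inputs):
        if sum(circuit(*bits)) % 2 == 1:
            on_set.append(bits)
    return on_set


pp = parity_predictor_on_set(cut, 3)
print(pp)   # two-level (sum-of-minterms) description of the parity predictor
```

The resulting minterm list could then be minimized and synthesized either on its own or merged with the description of the CUT, corresponding to the two realizations described above.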
The results of the synthesis are shown in Table 2. The table shows the number of inputs and outputs (in the original circuit) and the areas of the original, augmented, and separate (independent) circuits. The area figure is a weighted gate count of the circuit, where gates with higher fan-in are assigned higher values.
Fault-Escape Probability
The error model for the synthesized circuit was obtained using the statistical technique outlined in [12]. This model was then used with the general framework to compute fault-escape probability for MISR compression and for the proposed scheme. Two representative curves are shown in Figs. 8 and 9. Note that the fault-escape probability results obtained on these circuits are as expected. The MISR compression aliasing stabilizes at its asymptotic value. However, the fault-escape probability continues to decrease with increasing test lengths for the proposed scheme. Note that the area overhead for the simple concurrent checkers required by the scheme can be low. The area overhead for the augmented circuit can be less than 20%. The self-test fault-escape probability is decreased drastically by using the concurrent checkers. Also, the concurrent checkers can be used for checking during normal circuit operation. Periodic self-test can be used to detect the faults that might escape the simple parity predictor.
Note that periodic self-test using an MISR to supplement concurrent checking is useful even if the CUT is fault-secure and self-testing. This is due to the fact that unless all the input combinations are applied to the CUT, the fault-secure property could mask the error. However, between self-test sessions, another error could occur and make the circuit non-fault-secure, and the error might escape the concurrent checker during self-test.
UTILIZING BIST HARDWARE FOR CONCURRENT CHECKING
In the following, we discuss how BIST hardware in turn can be utilized to perform concurrent checking if the output of the circuit is available in coded form using a code C(m, k).
Below it will be shown that if the MISR compressing the output of the CUT is designed taking into account the fact that the outputs are available in coded form during normal circuit operation, then the content of the MISR will always remain as a codeword in the C(m, k), in absence of faults. We will call these MIS& as Codeword preserving MISR (CMISR). These MISRs have been earlier proposed for a different application by Pradhan et al. [16] . The advantage of designing MSRs in this manner would be that the contents of the MISR can be periodically checked to detect operational on-line errors and recovery can be attempted in the event of error detection. This would be useful during diagnosis when the output of the circuit can be stored in the MISR and periodically flushed to be checked by the external checker. Let C(m, k ) be cyclic code with generator g(x), of degree m -k. Let p(x) be a degree k polynomial. Let the MISR be designed with feedback polynomial
The following shows that as long as the input to the MISR remains a codeword in the code C(m, k), the content of the MISR continues to be a codeword in C(m, k), assuming that the initial content of the MISR is a codeword. In the sequel we will refer to these MISRs as CMISRs to distinguish them from standard MISRs. Let $S_i(x)$, the initial state of the CMISR, be the polynomial representation of some codeword in C(m, k). Let $S'_i(x)$ be the contents of the CMISR after a shift, assuming that the input to the MISR is zero. Then,
$$S'_i(x) = x\,S_i(x) \bmod \phi(x).$$
This implies that for some q(x),
$$x\,S_i(x) = q(x)\,\phi(x) + S'_i(x).$$
Since $S_i(x)$ and $\phi(x)$ are both divisible by g(x), it follows that $S'_i(x)$ is also divisible by g(x). Hence, $S'_i(x)$ is also a codeword in the code C(m, k). Thus any cyclic shift of the CMISR continues to produce codewords in the same code. Now let the input to the CMISR, $r_i(x)$, constitute a codeword in the code at every shift. Thus one has
$$S_{i+1}(x) = S'_i(x) + r_i(x).$$
It is clear that the contents of the CMISR, $S_{i+1}(x)$, are also a codeword [7], because the sum of two codewords is a codeword.
By repeating this argument, it can be shown that the contents of the CMISR continue to be a codeword as long as: 1) the initial state of the CMISR is a codeword, and 2) each input to the CMISR is a codeword in the code.
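The closure argument can also be checked numerically. The following is a minimal Python sketch, not the authors' implementation. It assumes the (7, 4) Hamming code with g(x) = x^3 + x + 1 and p(x) = x^4 + x + 1, stores each polynomial as an integer whose bit i is the coefficient of x^i, and models one clock of the CMISR as multiplication by x modulo the feedback polynomial followed by addition of the input word. The helper names (`poly_mul`, `poly_mod`, `cmisr_step`, `check_closure`) are hypothetical.

```python
import random

G = 0b1011    # g(x) = x^3 + x + 1 (bit i holds the coefficient of x^i)
P = 0b10011   # p(x) = x^4 + x + 1, assumed primitive polynomial


def poly_mul(a, b):
    """Product of two polynomials over GF(2) (carry-less multiplication)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def poly_mod(a, m):
    """Remainder of a(x) divided by m(x) over GF(2)."""
    deg_m = m.bit_length() - 1
    while a and a.bit_length() - 1 >= deg_m:
        a ^= m << (a.bit_length() - 1 - deg_m)
    return a


PHI = poly_mul(P, G)  # feedback polynomial phi(x) = p(x) g(x)


def cmisr_step(state, word):
    """One clock: S_{i+1}(x) = (x S_i(x) mod phi(x)) + r_i(x)."""
    return poly_mod(state << 1, PHI) ^ word


def is_codeword(value):
    """A word is a codeword exactly when g(x) divides it."""
    return poly_mod(value, G) == 0


def check_closure(steps, seed=0):
    """Feed random codewords and confirm that every state is a codeword."""
    rng = random.Random(seed)
    state = poly_mul(rng.randrange(16), G)      # codeword initial state
    for _ in range(steps):
        word = poly_mul(rng.randrange(16), G)   # codeword input
        state = cmisr_step(state, word)
        if not is_codeword(state):
            return False
    return True


if __name__ == "__main__":
    print(check_closure(1000))
```

A random trial of this kind is only a sanity check on the two conditions listed above; the algebraic argument is what establishes the property in general.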
The following example illustrates this process.
EXAMPLE 5.
Consider a CUT with 7-bit output. Let us assume that the output of the CUT is coded using a (7, 4) cyclic Hamming code with the generator polynomial $g(x) = x^3 + x + 1$. Hence, if any test vector is applied to a good circuit, then the output will be a codeword in the given (7, 4) Hamming code. Consider the CMISR in Fig. 10, which has the feedback polynomial $\phi(x) = p(x) g(x)$, where $p(x) = x^4 + x + 1$ is a primitive polynomial and $g(x) = x^3 + x + 1$ is the generator polynomial for the Hamming code.
Thus, $\phi(x) = x^7 + x^5 + x^3 + x^2 + 1$, as shown in Fig. 10. It can be seen that as long as the inputs to this CMISR are codewords in the (7, 4) Hamming code, the contents of the CMISR will always be a codeword in the same code. For example, if the response of the good circuit for five consecutive inputs during normal operation is (1101000, 1110010, 1011100, 0111001, 1100101), then the signature in the CMISR will be 1001011, which is a proper codeword in the (7, 4) Hamming code generated by $g(x) = x^3 + x + 1$.
Now consider a fault in the circuit. Let this fault produce an error in the fourth output. Let the outputs corresponding to the five inputs now be (1101000, 1110010, 1011100, 0011001, 1100101), where the fourth output vector is not a codeword. Now the final content of the CMISR is 1011011, which is not a codeword in the (7, 4) code. This will be detected when the contents of the CMISR are flushed out and checked. This example demonstrates a new aspect of the CMISR signature analyzer: the CMISR can be used not only for test response compression during BIST, but also as a concurrent checker for operational diagnosis. This is useful, for example, when the checker for the code is not available on the chip. Then, the output of the circuit can be clocked into the CMISR. Every few cycles, the contents of the CMISR can be shifted off-chip and checked using a checker. If the checker detects an error, then action may be taken for diagnosis and recovery.
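The two signatures quoted in Example 5 can be reproduced with a short Python sketch. It rests on assumptions not stated explicitly in the text: the leftmost bit of each 7-bit vector is taken as the coefficient of x^0, the CMISR starts in the all-0 state, and one clock multiplies the state by x modulo the feedback polynomial before adding the input word. The function names are hypothetical.

```python
PHI = 0b10101101  # phi(x) = x^7 + x^5 + x^3 + x^2 + 1 (bit i = coefficient of x^i)
G = 0b1011        # g(x) = x^3 + x + 1
WIDTH = 7


def parse(bits):
    """Bit string to integer; the leftmost character is the x^0 coefficient (assumed)."""
    return sum(1 << i for i, ch in enumerate(bits) if ch == "1")


def show(value):
    """Integer back to a 7-character bit string, x^0 coefficient first."""
    return "".join("1" if (value >> i) & 1 else "0" for i in range(WIDTH))


def is_codeword(value):
    """True when g(x) divides the polynomial held in value."""
    deg_g = G.bit_length() - 1
    while value and value.bit_length() - 1 >= deg_g:
        value ^= G << (value.bit_length() - 1 - deg_g)
    return value == 0


def signature(responses, state=0):
    """Clock each response word into the CMISR and return the final state."""
    for word in responses:
        state <<= 1                # multiply by x
        if state >> WIDTH:         # an x^7 term appeared: reduce modulo phi(x)
            state ^= PHI
        state ^= parse(word)       # add the input word
    return state


good = ["1101000", "1110010", "1011100", "0111001", "1100101"]
faulty = ["1101000", "1110010", "1011100", "0011001", "1100101"]

if __name__ == "__main__":
    for name, responses in (("good", good), ("faulty", faulty)):
        sig = signature(responses)
        print(name, show(sig), is_codeword(sig))
    # good 1001011 True
    # faulty 1011011 False
```

Under these assumptions the sketch yields 1001011 for the fault-free responses and 1011011 for the faulty ones, matching the values given in the example.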
Another use of the CMISR is to test any checker that monitors errors in circuit outputs during normal operation. Consider the configuration shown in Fig. 11. The outputs of the circuit are loaded concurrently into a CMISR as well as sent to the next stage. The outputs now held in the CMISR can be checked for errors by a self-testing checker. A problem that arises is how to guarantee that any fault within the error detection logic itself can be detected as well. One approach is to design a so-called totally self-checking checker. However, this requires that the self-checking checker be exercised periodically through all possible input code vectors. It can be observed that the CMISR cycles through all possible nonzero code vectors if the inputs are held constant. If the all-0 vector is also needed as a test, then the following technique can be used to insert it into the CMISR cycle. Decoding logic is added that recognizes one chosen nonzero code vector, together with extra feedback connections that are activated for just one cycle to produce the all-0 state. In the following cycle, the regular nonzero state is restored by a second decoder logic. Consider the CMISR shown in Fig. 12, which is a modification of the CMISR shown in Fig. 10 with this additional decoder logic and feedback. Here the second decoder logic can be used to restore the CMISR to the 0110100 state, as that would be the successor of state 1101000 in the original design. Thus, one can perform periodic testing of the error detection logic by holding the functional circuit output constant.
However, if this is not possible, one can probabilistically guarantee a degree of confidence. This can be seen by observing that, during normal operation, given a sufficiently long time interval, the probability that all possible code vectors are applied to the checker approaches 1.
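The cycling behavior used for checker testing can be illustrated with a small Python sketch. It assumes the CMISR of Example 5, inputs held at the all-0 codeword, and a hypothetical modification in which state 1101000 is followed by the all-0 state, which is in turn followed by 0110100. The choice of trigger state and the function names are assumptions made for illustration, not details taken from Fig. 12.

```python
PHI = 0b10101101  # phi(x) = x^7 + x^5 + x^3 + x^2 + 1 (bit i = coefficient of x^i)
G = 0b1011        # g(x) = x^3 + x + 1; as a state this is 1101000 (x^0 first)
WIDTH = 7


def shift(state):
    """One clock with zero input: multiply by x modulo phi(x)."""
    state <<= 1
    if state >> WIDTH:
        state ^= PHI
    return state


def modified_shift(state, trigger=G):
    """Assumed modification: trigger -> all-0 -> successor of trigger."""
    if state == trigger:
        return 0                # first decoder forces the all-0 state
    if state == 0:
        return shift(trigger)   # second decoder restores the regular cycle
    return shift(state)


def cycle(step, start):
    """List the states visited from start until the sequence returns to it."""
    seen = [start]
    state = step(start)
    while state != start:
        seen.append(state)
        state = step(state)
    return seen


def codewords():
    """All 16 codewords a(x) g(x) with deg a < 4; a*g = a + a*x + a*x^3 over GF(2)."""
    return {a ^ (a << 1) ^ (a << 3) for a in range(16)}


if __name__ == "__main__":
    print(len(cycle(shift, G)), len(cycle(modified_shift, G)))  # 15 16
```

With p(x) primitive, the unmodified register visits the 15 nonzero codewords, and the modified one visits all 16, so a checker fed from the CMISR would see every code vector once per cycle under these assumptions.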
CONCLUSION
Concurrent checkers are commonly used in computer systems to detect errors during computation. In this paper, it is shown that if concurrent checking is used along with MISR compression for off-line self-test, then the fault-escape probability can be reduced drastically. A simple modification to the existing self-test control can implement the proposed scheme. The results presented in this paper make the extra hardware required for concurrent checking more attractive in view of its impact on self-test. A key point of this paper is that the hardware required to achieve low fault-escape probability in a self-test environment, if designed as a combination of a concurrent checker and a signature analyzer, can be more cost-effective than a design using only a signature analyzer. Thus one can get concurrent checking for operational faults as a byproduct of the proposed approach without additional hardware complexity.
