Abstract-In this paper, the authors propose a new transparent built-in self-test method to test in parallel multiple embedded memory arrays with various sizes. First, a new transparent test interface is designed to perform testing in the normal mode and to cope with test interrupts in a real-time manner. The circular scan test interface facilitates the processes of both test pattern generation and signature analysis. By tolerating redundant read/write/shift operations, we develop a new march algorithm called TRSMarch to achieve the goals of low hardware overhead, short test time, and high fault coverage. It can be proved that TRSMarch can detect all stuck-at faults, all transition faults, and each coupling fault occurring in different words. For each coupling fault occurring in the same word, depending on the coupling type and effect, either it can be detected or its detection probability becomes high as more transparent test processes are executed. TRSMarch can be easily extended to deal with more faults such as single-cell read destructive faults and read destructive coupling faults.
I. INTRODUCTION
Due to the improvement of very large scale integration (VLSI) fabrication technology, many components can now be condensed onto a single chip. The fabrication density of regular structure circuits, such as memory devices, is especially high and they are very vulnerable to defects. Although memory devices contained on a single chip have been thoroughly tested before shipping to customers, they might be defective during normal operation. Concurrent testing can deal with defects occurring during normal operation based on some form of information redundancy. Unfortunately, large hardware overhead will be introduced due to the logic circuit required to generate the redundant information. Moreover, memory circuit speed will be degraded and every memory access will suffer performance penalty. Thus, a more cost-efficient solution is to use transparent testing by which memory devices on a chip are tested periodically. The basic requirement of transparent testing is that the memory contents must be restored to the initial state, so that normal operation can be resumed.
The pioneering work on transparent memory testing is proposed in [1], where only a very limited number of read/write operations are applied to each cell in the memory under test. New test patterns are generated by XOR-ing the memory contents with a set of prespecified test patterns. This guarantees that memory contents will return to the initial state and that digital signature analysis can be achieved as well. Of course, the fault coverage cannot be high due to the simplicity of test patterns. In [2] and [3], the technique of transforming any nontransparent RAM test algorithm to a transparent one has been developed. The basic idea is to complement the memory contents an even number of times such that the memory state can be restored while fault coverage can be maintained. A new signature prediction concept is used to verify the correctness of test responses, though extra test time must be invested in the signature prediction phase. A twisted ring counter with n stages has a constant cycle length 2n for any initial state, and this property has been applied to transparent built-in self-test (BIST) of memory devices in [4]. It was shown that the test sequences have good fault coverage for simple faults (e.g., stuck-at faults) and for pattern sensitive faults (though there is no guarantee). In [5], a combinational circuit that generates successive bits of (n, k)-exhaustive test patterns is developed to detect single k-coupling faults. The BIST scheme used a modified version of [2] and [3] to make the applied tests transparent. It can be found that the test time will be quite long if k is large. To eliminate the signature prediction phase in [2] and [3], a symmetric transparent BIST method is proposed in [6]. This method adopts a reciprocal linear feedback shift register (LFSR) and reversed-order data to obtain a predictable signature. The hardware cost and the fault coverage of the new method remain comparable to those of [2] and [3], with test time highly reduced.
Publisher Item Identifier S 0278-0070(02)02845-2.
In this paper, we develop an efficient transparent BIST method that is able to concurrently test many SRAM arrays of different sizes. First, a circular scan chain test architecture is designed by adding a very limited amount of hardware to the original input/output (I/O) port of each memory array under test. The circular scan interfacing technique greatly simplifies the tasks of transparent test pattern generation, of digital signature analysis, and can deal with various memory array sizes efficiently. Based on this test interface, a new march algorithm (TRSMarch) is proposed to generate test patterns based on memory contents left by normal operation. Digital signature analysis is achieved by comparing outputs of two adjacent march elements without incurring extra test time and hardware overhead. All memory arrays share a common BIST controller and can be tested in parallel even if they have different array sizes. This is achieved by tolerating some redundant test operations in smaller arrays without losing any fault coverage. Another special feature of this research is that the proposed transparent BIST method can deal with test interrupts in a real-time manner. That is, if a test interrupt occurs during the transparent BIST process, the test circuit can be easily reconfigured such that the memory contents can virtually be restored immediately and normal operation can be continued.
Section II provides related background for the memory fault models used in this work. A new circular test interface is introduced in Section III, and the TRSMarch algorithm is discussed in Section IV. Section V gives the detailed design to deal with test interrupts in a real-time manner, and fault coverage is analyzed in Section VI. Section VII presents the experimental results obtained by computer simulation. Detection of more advanced faults such as read destructive faults and read destructive coupling faults by adding more operations to TRSMarch is discussed in Section VIII. Finally, the conclusion is provided in Section IX.
II. BACKGROUND
0278-0070/02$17.00 © 2002 IEEE
The major component of each SRAM module is the memory cell array which is implemented using SRAM. Thus, an appropriate memory cell array fault model must be used to deal with the SRAM faults [8]. To control the test budget, the fault model used cannot be too complicated. Generally, stuck-at fault (SAF), transition fault (TF), coupling fault (CF), and sequential fault (SF) are considered as many commercial tools do. Neighborhood pattern sensitive faults and inversion coupling faults are not considered, since neighborhood pattern sensitive faults are not important for SRAMs while inversion coupling faults do not exist in SRAMs. Now, we give detailed explanations about the memory fault models used in this work.
1) SAF: The logic value of a memory cell is always stuck-at 1 (SA1) or 0 (SA0).
2) TF: A cell fails to undergo a $0 \to 1$ transition (TF $0 \to 1$) and/or a $1 \to 0$ transition (TF $1 \to 0$).
3) CF: In this case, a write operation in one cell i can influence the value of another cell j; cell i is called the coupling cell, whereas cell j is called the coupled cell. As in [8], we consider two types of coupling faults:
• State coupling fault (CFst): A coupled cell is forced to a certain value x if the coupling cell is in a given state y;
• Idempotent coupling fault (CFid): A $0 \to 1$ and/or $1 \to 0$ transition in the coupling cell forces a certain value in the coupled cell.
Recently, it has been found that read operations can be destructive in SRAM cells [9]. For example, a read operation performed to cell x may change the content of x and return the inverted logic value, and the fault is called a read destructive fault (RDF). It is also possible that a read operation performed to cell x changes the data of x, but the operation returns the correct logic value. This fault is called a deceptive read destructive fault (DRDF). Other read-sensitive faults include incorrect read faults (IRF) and random read faults (RRF) [9]. There exist read-coupling faults between SRAM cells as well. For example, a read operation applied to the victim cell (v-cell) returns an incorrect value if the aggressor cell (a-cell) is in a certain state, and the fault is called an incorrect read coupling fault (CFir). If a read operation applied to the v-cell causes a transition in the v-cell and returns an incorrect value when the a-cell is in a given state, then the read-sensitive fault is called a read destructive coupling fault (CFrd). Other read-sensitive coupling faults include random read coupling faults (CFrr) and disturb coupling faults (CFds) [9]. We will show in Section VIII that single-cell read-sensitive faults and read-sensitive coupling faults can be detected by the proposed method, if more operations are added to the march algorithm used in this work.
III. A NEW TRANSPARENT TEST INTERFACE
The basic requirements for transparent BIST are: 1) the test process cannot destroy the memory contents when it is finished or interrupted; 2) the test process must switch to the normal execution mode within a very short time when a test interrupt occurs; 3) the fault coverage must be high enough to justify the extra hardware added; 4) the test time must be short enough to prevent the test process from being excessively interfered by test interrupts; and 5) the hardware overhead must be as small as possible to justify the implementation cost. Obviously, the requirements for transparent BIST are more restrictive than those of traditional memory BIST methods for manufacturing testing.
To deal with the difficulties of implementing an efficient transparent BIST technique, we propose a new test interface which is able to meet all the basic requirements mentioned above. Based on the memory contents left by normal operation, the test interface can read each word and then shift the entire word bit by bit for testing. The test interface can also generate new test patterns based on the current memory contents to detect the majority of memory faults. Further, the test interface is able to support test response evaluation effectively. As shown in Fig. 1 , the proposed test interface is designed by adding several more multiplexers to the original I/O port of each memory array under test. It mainly implements the functions of circular chain shifting, test pattern generation, and signature analysis.
Circular chains are used to test multiple memory arrays with various sizes in parallel, and each memory array has its own circular chain. Based on this scheme, each word of a memory array can be latched and scanned along the circular chain as shown in Fig. 1 . The proposed test interface keeps shifting the entire scan chain for test evaluation (without writing back the scan chain content) once a word has been read. Since all memory arrays are tested in parallel, the number of bit shifts for each memory read operation equals the maximum word width c of all memory arrays. Thus, for each memory array with smaller word width, the memory content latched into its circular scan chain might circulate several times. As will be shown later, this test concept will greatly ease the signature analysis process. In summary, the major advantages of this circular scan chain scheme are: 1) no memory access will be involved once a word has been latched for scan testing, which greatly accelerates the test speed; 2) the minimum level of memory array involvement during testing enables the transparent BIST process to be switched to the normal execution mode in a very short time; and 3) the circular test architecture allows memory arrays with different word sizes to be tested in parallel.
In conventional memory BIST methods, generally, new test patterns are generated by the test controller and routed to each memory array either in serial or in parallel. However, for transparent BIST, it is preferred to have the test interface generate test patterns locally to avoid test data routing from the test controller to each memory array. In the proposed test interface, test patterns are generated by selecting $\bar{Q}$ as the input to each memory cell as shown in Fig. 1, so the transitions $Q \to \bar{Q}$ and $\bar{Q} \to Q$ will be exercised in each memory cell.
Thus, based on the memory contents left by normal operation, new test patterns are generated by complementing each memory word several times. In fact, the test patterns are random in nature since every different normal operation will leave different background data. Although the test patterns generated are simple, their test power is still very strong as will be demonstrated later.
Unlike manufacturing testing, test patterns used by transparent BIST are not known until the test process is started. Thus, it is impossible to calculate digital signatures and to store the golden values on the chip for test response analysis. The best solution to this problem is to design a digital signature analysis method which does not have the necessity of determining the golden signature value beforehand. Fortunately, our test interface can achieve this goal by just adding an inverter to the circular scan chain, so the multiple-input signature register (MISR) can receive inputs with or without complementing the data scanned out as shown in Fig. 1 . The basic idea is to read all memory words (e.g., in march element 1) and compress their values into digital signature S1 by scanning the word contents (bit by bit for each array) into the MISR. Then, the memory words are complemented by the test interface circuit and their contents are read (e.g., in march element 2) and shifted into the MISR again to generate signature S2. But this time, test responses are complemented before they are shifted into the MISR. Thus, S1 and S2 should have the same value, if the MISR is seeded the same value for both compressions and if the memory words are fault-free. Based on this new test concept, the test response evaluation process can be performed without knowing the test patterns in advance. The signature analysis method is highly related to the march algorithm used during the transparent BIST process and will be further discussed in the next section.
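The signature comparison idea can be illustrated with a short Python sketch. It is only a simplified model under stated assumptions: a single-input serial signature register stands in for the MISR, and the register width, the feedback taps, the helper names (`compress`, `scan_out`, `complement`), and the memory contents are all hypothetical choices, not details taken from the proposed hardware.

```python
def compress(bits, seed=0, width=8, taps=(7, 5, 4, 3)):
    """Serially compress a bit stream into a `width`-bit signature.

    Simplified single-input stand-in for the MISR; width and taps are
    illustrative assumptions.
    """
    state = seed
    for bit in bits:
        feedback = bit
        for tap in taps:
            feedback ^= (state >> tap) & 1
        state = ((state << 1) | feedback) & ((1 << width) - 1)
    return state


def scan_out(words, invert=False):
    """Read every word and shift it out bit by bit, optionally via the inverter."""
    flip = 1 if invert else 0
    return [bit ^ flip for word in words for bit in word]


def complement(words, stuck_at_zero=None):
    """Complement every word; an optional (word, bit) cell stays stuck at 0."""
    result = [[bit ^ 1 for bit in word] for word in words]
    if stuck_at_zero is not None:
        word_index, bit_index = stuck_at_zero
        result[word_index][bit_index] = 0
    return result


# Hypothetical data left by normal operation (cell (0, 1) already reads 0).
contents = [[1, 0, 0, 1], [0, 1, 1, 0], [1, 1, 0, 0]]

s1 = compress(scan_out(contents))  # first pass, no inversion
s2_good = compress(scan_out(complement(contents), invert=True))
s2_bad = compress(scan_out(complement(contents, stuck_at_zero=(0, 1)), invert=True))

print(s1 == s2_good, s1 == s2_bad)  # prints: True False
```

Both compressions start from the same seed, so a fault-free array yields identical signatures without any precomputed golden value, while the cell that fails to take the complemented value makes the second signature differ.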
The relationship among the transparent test interface circuit and other components is illustrated in Fig. 2 . It can be found that all memory modules receive the same test control signals from the transparent BIST controller and that test responses of all memory arrays (one bit for each memory array) are simultaneously fed into a global MISR. The test responses might be complemented before sending to the MISR, depending on the march element executed, as will be discussed later. Instead of broadcasting a global memory address, to save the routing space we use the test clock signal to drive an LFSR or a counter (placed around the memory array) for generating the test address for each memory array. The shift clock is used to drive the circular scan chain for each memory array. Other control lines are used to control the test interface of each array for multiplexer selection and read/write selection.
IV. THE PARALLEL TRANSPARENT MARCH ALGORITHM
Based on the test interface discussed above, we propose a transparent march algorithm called TRSMarch to test all memory arrays in parallel. As shown in Fig. 3 , the test algorithm mainly contains six march elements, and no test patterns are required for memory array initialization. To simplify the test circuit, redundant test patterns are tolerated for smaller memory arrays without hurting the fault coverage as in [7] . The test algorithm does not need to use a finite-state machine to generate the test patterns, since all test patterns are converted from memory contents left by normal operation. In the following discussions, the march elements of TRSMarch will be discussed in detail. However, unlike RSMarch in [7] , TRSMarch does not allow vertically redundant operations, and the reason will be thoroughly described.
As shown in Fig. 3, each march element includes a set of read/shift data operations and test pattern write operations. The read operation is represented by r($a_i$), the shift operation is represented by s($a_i$), while the write operation is represented by w($\bar{a}_i$). Note that $a_i$ denotes the content of word i in a memory array, while $\bar{a}_i$ represents the content of word i that has been complemented. March elements M1, M2, and M3 are executed in an ascending address order, but elements M4, M5, and M6 are executed in a descending address order. The read/shift operations are used to verify the correctness of each memory bit operation and will not change the memory contents. For each memory array, given an address, one memory word will be read into the circular scan chain. The word bits are then successively shifted out for signature analysis without destroying the memory contents. Thus, once a memory word has been read, the shift operations are continued until all bits in the scan chain have been shifted out at least once. Note that the operations are executed at all memory arrays concurrently and that the memory array with the largest word width determines the number of bit shifts.
It can be noticed that memory arrays with smaller word widths will have redundant test data shifted into the MISR for test compression. The reason for allowing redundant test data is to simplify the test circuits in that all memory arrays can receive the same test control signal for read/shift operations. Fortunately, these excessive shift operations do not have any influence on the memory contents, and the redundant test outputs can be easily handled by the MISR without damaging the fault coverage. The beauty of the circular scan chain technique is that it allows all memory modules to be tested concurrently, regardless of their different widths. In particular, digital signature analysis can be easily accomplished for arrays with smaller widths by circulating extra bits into the MISR deterministically. For the example in Fig. 4, each word contains 4 (6) bits in memory array A (B) and march element M1 is executed. Since memory array B has a word width of 6, the number of shift operations is set to 6. Thus, two extra bits for each word in memory array A will be shifted into the MISR for test data compression.
In M1, M2, M4, and M5, each march element contains two separate scans of the entire memory array. As shown in Fig. 3, submarch element scan1 is used to generate the digital signature of the memory array, while submarch element scan2 is used to generate new test patterns by reversing each word content. Each new test pattern is generated by complementing the content of the current word under test. For example, the new test pattern for memory array B in Fig. 4 will be "101001," while that for memory array A will be "0110" by the w($\bar{a}_i$) operation in march element M1. For a smaller memory array, the circular scan chain might not be well aligned to generate the correct (complemented) test data. Thus, another memory read operation is required to retrieve the correct test data that will immediately be complemented and then written back into the word.
It can be found that the new test pattern generation process can be easily supported by the proposed test interface locally.
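The effect of horizontally redundant shifts on a narrow array can be sketched in Python as follows. This is a minimal model, assuming the bit at the end of the chain leaves toward the MISR and re-enters at the front; the shift direction, the function name `circular_shift_out`, and the word values are illustrative assumptions rather than details of Fig. 4.

```python
def circular_shift_out(word, num_shifts):
    """Shift a latched word around a circular chain `num_shifts` times.

    Returns the bits sent toward the MISR and the final chain content.
    The last position is assumed to be the scan output.
    """
    chain = list(word)
    shifted_out = []
    for _ in range(num_shifts):
        out_bit = chain[-1]
        shifted_out.append(out_bit)
        chain = [out_bit] + chain[:-1]  # the bit re-enters the chain
    return shifted_out, chain


# Hypothetical words: array A is 4 bits wide, array B is 6 bits wide.
word_a = [1, 0, 1, 1]
word_b = [0, 1, 0, 1, 1, 0]
max_width = max(len(word_a), len(word_b))  # the widest array sets the shift count

out_a, chain_a = circular_shift_out(word_a, max_width)
out_b, chain_b = circular_shift_out(word_b, max_width)

print(out_a, chain_a)  # two extra bits circulate; the chain ends misaligned
print(out_b, chain_b)  # exactly one full rotation; the chain is realigned
```

In this model the two extra bits from the narrow array simply repeat its first two output bits, so they are deterministic, and the misaligned chain shows why the word is read again before the complemented data is written back.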
We emphasize that scan1 of M1 (M4) generates the golden digital signature for M2 and M3 (M5 and M6). It is not allowed to read a word, shift the word, and write back the complemented word immediately in M1 and M4. The reason comes from the fact that writing back the complemented word immediately might change the memory content of another word (due to a coupling fault) which has not been read yet, so this fault cannot be detected (the test process treats the erroneous value as a correct value left by normal operation). Fortunately, the idea of separating into two scans for each of M1, M2, M4, and M5 will not increase the number of memory read, shift, and write operations for each march element.
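The masking effect described above can be reproduced with a small Python sketch. It is a toy model under explicit assumptions: words are integers, read-out streams are compared directly instead of being compressed by a MISR, and the injected fault (a rising transition of bit 0 in word 0 forcing bit 0 of word 2 to 1) is a hypothetical CFid chosen only for illustration.

```python
def coupled_write(mem, addr, value):
    """Write `value`, modeling a hypothetical CFid from word 0 to word 2."""
    rising = addr == 0 and (mem[0] & 1) == 0 and (value & 1) == 1
    mem[addr] = value
    if rising:
        mem[2] |= 1  # the coupled cell is forced to 1


def run_element_pair(words, width, interleaved):
    """Return (golden, check) read-out streams for one pair such as M1/M2."""
    mask = (1 << width) - 1
    mem = list(words)
    if interleaved:
        golden = []
        for addr in range(len(mem)):
            golden.append(mem[addr])                 # read and shift
            coupled_write(mem, addr, mem[addr] ^ mask)  # write back at once
    else:
        golden = list(mem)                           # scan1: read all words
        for addr in range(len(mem)):                 # scan2: re-read, complement
            coupled_write(mem, addr, mem[addr] ^ mask)
    check = [word ^ mask for word in mem]            # next scan1 via the inverter
    return golden, check


words = [0b0110, 0b1010, 0b1100]  # hypothetical data left by normal operation

golden_sep, check_sep = run_element_pair(words, 4, interleaved=False)
golden_int, check_int = run_element_pair(words, 4, interleaved=True)

print(golden_sep != check_sep)  # separate scans expose the fault
print(golden_int != check_int)  # immediate write-back hides it
```

With separate scans the golden stream is captured before any write disturbs the array, so the corrupted word shows up as a mismatch; with immediate write-back the corrupted value is itself recorded as golden data.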
The major problem of test evaluation for transparent BIST is how to determine the digital signature without knowing test patterns beforehand. As discussed above, our test interface can achieve this goal by just adding an inverter to the circular scan chain, so that the MISR can receive inputs with or without complementing the data scanned out as shown in Fig. 1. Depending on the march element, if the memory contents have been complemented, then the test responses must be inverted before they are sent to the MISR. For the example in Fig. 5, suppose the output response from the scan chain is "001 000" (i.e., $a_i$) in march element M1 [Fig. 5(a)]. According to TRSMarch, the test response for the same word by march element M2 is "110 111" (i.e., $\bar{a}_i$) because data in the memory word has been complemented [Fig. 5(b)]. Thus, if the word does not produce erroneous test data, the signature of M1 must be the same as that of M2 (since the test data for M2 has been complemented by an inverter before being shifted into the MISR). By comparing both signatures, erroneous test responses can be observed. Here, we notice that M1, M2, and M3 (M4, M5, and M6) must have the same memory addressing order. Consequently, test output data can be verified by comparing each pair of the signatures produced by (M1/M2), (M2/M3), (M4/M5), and (M5/M6).
In RSMarch, besides horizontally redundant operations, vertically redundant operations are allowed for memory arrays with smaller word numbers (Fig. 6) to simplify the test control design [7]. However, vertically redundant operations must be avoided in TRSMarch because of the difference between the two march methods. In RSMarch, the vertically redundant operations do not change the memory contents of the memory arrays which are excessively tested. Thus, there is no side effect, and this property can be used to save test hardware. However, the w($\bar{a}_i$) operations in TRSMarch will cause trouble if they are excessively applied to shallow memory arrays. For the example in Fig. 6, if vertically redundant operations are allowed to be applied to memory array A, then the first round of vertically redundant operations will complement the entire memory contents back to the original values. Further, the second round of vertically redundant operations will leave part of array A complemented and part of A uncomplemented. Thus, this side effect makes the test pattern generation process extremely difficult to control. Fortunately, this problem can be easily solved by terminating the march operations temporarily every time a shallow memory array has been completely scanned by a specific march element. Thus, the testing of shallow arrays will be terminated before the deepest memory array finishes the current march element. The terminating circuit can be easily designed by using the selection line of the last word in each shallow memory array.
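The side effect of vertically redundant writes can be seen in a short Python sketch. It assumes, purely for illustration, that redundant operations wrap around the addresses of the shallow array; the function name `complement_pass`, the array contents, and the depth of the deepest array are hypothetical.

```python
def complement_pass(words, width, num_writes, stop_at_last_word):
    """Apply one scan2 pass of complementing writes to a shallow array.

    `num_writes` is set by the deepest array in the system. Without the
    stop condition, the extra writes are assumed to wrap around.
    """
    mask = (1 << width) - 1
    result = list(words)
    for step in range(num_writes):
        if stop_at_last_word and step >= len(result):
            break  # terminate once the last word has been scanned
        result[step % len(result)] ^= mask
    return result


shallow = [0b1001, 0b0110, 0b1100]  # hypothetical 3-word, 4-bit array
deepest_word_count = 8              # hypothetical depth of the deepest array

mixed = complement_pass(shallow, 4, deepest_word_count, stop_at_last_word=False)
clean = complement_pass(shallow, 4, deepest_word_count, stop_at_last_word=True)

print([format(w, "04b") for w in mixed])  # words 0 and 1 complemented, word 2 not
print([format(w, "04b") for w in clean])  # every word complemented exactly once
```

Stopping at the last word of the shallow array, which corresponds to the role of the terminating circuit, keeps every word complemented exactly once per pass.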
The TRSMarch algorithm is applied to perform transparent BIST by testing multiple memory arrays in parallel. The total test time of the algorithm depends only on the largest word width and the largest word number of all memory arrays under test. Note that the total test time is entirely independent of the number of memory arrays. The advantage of TRSMarch becomes more significant when the number of smaller arrays (either shallower or narrower arrays) increases. Analytically, TRSMarch requires 10n of memory read operations and 4n of memory write operations, where n is the maximum word number of all memory arrays. In addition, TRSMarch requires at most cn of shift operations for each march element, where c is the maximum memory width among all arrays. Thus, a total of 6cn shift operations is required from M1 to M6. To sum up, the total test time required is the sum of 14n of memory access operations and 6cn of scan chain shift operations. Besides testing all memory arrays in parallel, application of the circular scan chain also greatly reduces the test application time by minimizing the number of memory accesses during testing. We emphasize again that test time minimization is extremely important for transparent BIST. The reason comes from the fact that transparent BIST might be terminated by test interrupts, so it might never finish if the test time is too long and the frequency of test interrupts is too high.
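The operation counts stated above can be tabulated with a small Python helper. The function name `trsmarch_operation_counts` and the example array sizes are hypothetical; the formulas (10n reads, 4n writes, and at most 6cn shifts) are the ones given in the text.

```python
def trsmarch_operation_counts(word_counts, word_widths):
    """Return TRSMarch operation counts for arrays tested in parallel.

    n is the largest word number and c the largest word width, so the
    result does not depend on how many arrays are tested.
    """
    n = max(word_counts)
    c = max(word_widths)
    reads = 10 * n
    writes = 4 * n
    return {
        "reads": reads,
        "writes": writes,
        "memory_accesses": reads + writes,  # 14n
        "shifts": 6 * c * n,                # upper bound over M1 to M6
    }


# Hypothetical system with three arrays of different depths and widths.
counts = trsmarch_operation_counts(word_counts=[256, 1024, 512],
                                   word_widths=[4, 6, 6])
print(counts)
```

Adding further arrays that are no deeper than 1024 words and no wider than 6 bits leaves every count unchanged, which reflects the independence from the number of arrays noted above.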
V. TRANSPARENT TESTING AND TEST INTERRUPTS
The other special feature of transparent testing is that, unlike manufacturing test, the transparent test process might be interfered by test interrupts which can occur at any time. Since the memory contents will be changed during the execution of each scan2 submarch element, unless they can be recovered immediately, error will occur when normal operation is executed. Of course, it is possible to solve this problem by terminating the test process and executing a set of special march operations for memory recovery. However, execution of the recovering march operations is generally quite time consuming and prevents normal operation from being executed in a real-time manner. In summary, to resolve these difficulties, several requirements must be satisfied: 1) correct memory data must be maintained for normal operation and 2) the data recovery process must be finished in a very short time.
The following discussion mainly focuses on the scan2 process of each march element, since the test interrupt problem does not exist during the scan1 execution. As shown in Fig. 7 , assume that (the scan2 process of) M1 is being executed at word y of a memory array and that a test interrupt occurs simultaneously. It can be observed that memory contents in the range from word 0 to word 100 have been reversed and that this might cause incorrect results when normal operation is executed. We have also found that the data map left in the memory array is highly related to march directions. For example, M1 and M5 will leave the same memory map as shown in Fig. 8 where the upper (lower) part of the memory array has been complemented (unchanged). Similarly, march elements M2 and M4 will leave the same memory map, if they are interrupted. To guarantee the proper execution for normal operation, we have enhanced the transparent test interface, as shown in Fig. 9 where more multiplexers are added. The enhanced test interface is able to switch control from the test mode to the normal operation mode with the memory contents properly manipulated. It will be found that the hardware overhead is very small, and that the function mode can be switched in a very limited number of clock cycles.
In Fig. 9, two more multiplexers and one inverter (shaded) are added to each I/O register to select a proper path for different march operations. The selection of the control signal ("data" or "$\overline{\text{data}}$") depends on whether M1/M5 or M2/M4 is being executed. For example, if a test interrupt occurs when M1 or M5 is executed and part of the memory contents has been changed, then the control signal will be switched to "$\overline{\text{data}}$" ("data") for each memory access occurring at the upper (lower) part of the memory array shown in Fig. 8. This discussion can be extended to other march elements. Thus, normal operation can be executed without regard to any change in memory contents. Based on this scheme, we can read/write correct data from/to each memory array by selecting a proper data path. This method also guarantees that the switch activity can be finished in a very limited number of clock cycles. That is, we do not require a significant amount of time to restore the memory data, and this enables the scheme to satisfy real-time applications.
Once a transparent BIST process has been interrupted, the process cannot be resumed by continuing from the interrupted operation. The reason comes from the fact that normal operation might change the memory contents which could cause signature mismatch even for a fault-free memory array, that is, a killing error might occur if the transparent BIST process is resumed from the interrupted operation. Thus, the entire transparent BIST process which has just been interrupted must be abandoned. Every time normal operation is completed, the memory contents must be reorganized, provided that a transparent BIST process has been interrupted and it has reversed some memory contents. The reorganization process can be easily accomplished by reversing the complemented part of the array back to uncomplemented values based on current memory contents. Note that the memory reorganization process also depends on the march element which was interrupted. For example, suppose a transparent BIST process is executing M1 or M5 and then it is interrupted with the memory map shown in Fig. 8 . After normal operation has been completed, the memory map might have been modified by writing complemented (upper memory map) or uncomplemented (lower memory map) data. The reorganization process does nothing but reverse the complemented memory map back to the uncomplemented values. After this, another round of the transparent BIST process can be initiated.
The new test strategy is so powerful that it also can handle a test interrupt which occurs during the memory reorganization process. As shown in Fig. 10(a), an interrupt occurs when the transparent BIST process is performing the M1 march element, and the upper part of the memory array (area A) has been changed (address x is the breaking point). Now, assume that normal operation has been finished, and that the memory reorganization process is being performed. Further, suppose another interrupt occurs when the reorganization process moves to word y shown in Fig. 10(b). It can be found that the test scheme still handles this case well by inverting all memory accesses occurring in area A′. We emphasize that the memory reorganization process must be done in a correct marching order to avoid any memory fragmentation. For example, the memory map left by M1 or M5 (M2 or M4) must be reorganized starting from the break point in a decreasing (an increasing) address order.
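The interrupt handling and reorganization scheme can be modeled behaviorally in Python. The sketch below covers only the memory map left by an interrupted ascending element such as M1, stores each word as an integer, and uses a hypothetical class `InterruptedArray`; it illustrates the data-path selection idea and is not a description of the circuit in Fig. 9.

```python
class InterruptedArray:
    """Array in which addresses below `boundary` hold complemented data."""

    def __init__(self, words, width):
        self.cells = list(words)
        self.mask = (1 << width) - 1
        self.boundary = 0  # break point left by the interrupted element

    def complement_step(self):
        """One scan2 step of an ascending element: complement the next word."""
        self.cells[self.boundary] ^= self.mask
        self.boundary += 1

    def read(self, addr):
        """Normal-mode read; the inverted path is used in the changed area."""
        raw = self.cells[addr]
        return raw ^ self.mask if addr < self.boundary else raw

    def write(self, addr, value):
        """Normal-mode write; data is stored complemented in the changed area."""
        self.cells[addr] = value ^ self.mask if addr < self.boundary else value

    def reorganize_step(self):
        """Restore one word, moving down from the break point."""
        self.boundary -= 1
        self.cells[self.boundary] ^= self.mask


original = [0b1001, 0b0110, 0b1100, 0b0011]  # hypothetical contents
array = InterruptedArray(original, width=4)
array.complement_step()
array.complement_step()  # a test interrupt arrives after two words

view_after_interrupt = [array.read(a) for a in range(4)]
array.write(0, 0b1111)   # normal operation updates word 0
array.reorganize_step()  # reorganization starts, then is interrupted again
view_mid_reorg = [array.read(a) for a in range(4)]
array.reorganize_step()  # reorganization completes

print(view_after_interrupt == original)
print([format(w, "04b") for w in array.cells])
```

Because the changed area is always the contiguous range below the boundary, a single comparison selects the data path, and shrinking the boundary from the break point downward keeps that range contiguous even if reorganization is itself interrupted.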
VI. FAULT COVERAGE ANALYSIS
In this section, we analyze the fault coverage of TRSMarch. Basically, TRSMarch can detect all SAFs, all TFs, as well as each of CFsts and CFids that occur in different words. The inadequacy of TRSMarch in detecting a CF affecting two cells at the same word is due to the fact that it is a word-march algorithm, thus this CF might not be detected unless all background data bits are tried during testing. However, if the write operation dominates the coupling effect [8] , then CFids are no longer existent in the same word. Thus, only each CFst in the same word requires consideration in the case of write-dominating memory arrays. However, if TRSMarch is executed many times, background data bits keep changing so they are random test patterns in nature for CFsts. We will show in this section that any CFst occurring in the same word will be detected within a very limited number of transparent BIST processes.
Theorem 1: The TRSMarch algorithm can detect all SAFs and TFs.
Proof: Since each SAF will be sensitized by either operation $a_i \to \bar{a}_i$ or operation $\bar{a}_i \to a_i$ of TRSMarch, the fault will be activated by at least one test pattern (say by scan2 of march element i) and the fault effect will be read into the circular scan chain for verification [by scan1 of march element (i + 1)]. Further, horizontally redundant operations simply cause the fault effect to circulate around the circular scan chain, so the fault effect will not be masked. The detection of TFs can be discussed similarly and is omitted. Q.E.D.
Theorem 2: The TRSMarch algorithm can detect each CFst between different words.
Proof: To verify the detection of CFst, we must ensure that any pair of two cells c1 and c2 in different words is sensitized by $(c_1, c_2)$, $(c_1, \bar{c}_2)$, $(\bar{c}_1, c_2)$, and $(\bar{c}_1, \bar{c}_2)$, where the address of c1 is larger than that of c2. Note that c1 and c2 represent both the cell locations and cell values left by normal operation. As shown in Fig. 11, c1 can be affected by c2 as illustrated by the solid triangles for different march elements. In M1 (M2), c2 will first change its value in the scan2 submarch element and the coupling effect at c1 will be observed in the scan1 submarch element of M2 (M3) using the signature generated by scan1 of M1 as the golden signature. Further, for M4 (M5), c2 will change its value at scan2 of M4 (M5) and the coupling effect to c1 will be observed at scan1 of march element M5 (M6) using the signature generated by scan1 of M4 as the golden signature. From Fig. 11, it can be found that all state coupling combinations from c2 to c1 can be exercised by M1-M6, as demonstrated in the solid triangles.
The coupling relationships from c1 to c2 are represented by the dotted triangles in Fig. 11. Again, we find that all state coupling combinations from c1 to c2 are exercised by M1-M6 of TRSMarch. In scan2 of M4 (M5), c1 will first change its value, and the coupling effect at c2 will be observed in scan1 of M5 (M6). Further, for M1 (M2), c1 will change its value at scan2 of M1 (M2), and the coupling effect will be observed in scan1 of march element M2 (M3). Finally, the detection of each CFst is not affected by the horizontally redundant operations, as discussed in Theorem 1.
Q.E.D.
Theorem 3: The TRSMarch algorithm can detect each CFid between different words.
Proof: Based on the definition of CFid, we must ensure that any pair of cells c1 and c2 in different words is sensitized both by a transition at c2 with the effect observed at c1 and by a transition at c1 with the effect observed at c2. The former (latter) deals with the case where a transition at cell c2 (c1) with the lower (higher) address forces a certain value onto the other cell c1 (c2) with the higher (lower) address. It can be found that the former (latter) case can be detected by the solid (dotted) triangles shown in Fig. 11. Similarly, the detection of each CFid is not affected by the horizontally redundant operations.
Q.E.D.
As assumed in the beginning of this section, the write operation dominates the coupling effect; thus, no CFid can occur in the same word. Consequently, only CFsts occurring in the same word require further consideration. For each execution of the transparent BIST process, the memory background data keeps changing and acts as a random test pattern for each CFst. Given any two bits Q1 and Q2 in the same word, Q1Q2 takes each of the values 00, 01, 10, and 11 with equal probability. For each CFst in the same word, the probability of the CFst being detected is therefore 25%. For example, if Q1 and Q2 have a CFst where Q1 equal to zero forces Q2 to one, then this CFst will be detected by test pattern (Q1Q2 = 00). Thus, applying the transparent BIST process ten times gives a detection probability of $1 - (1 - 0.25)^{10} \approx 94.4\%$ for any CFst occurring in the same word.
Theorem 4: For a CFst in the same word, the detection probability is $1 - (3/4)^n$ when the TRSMarch algorithm is performed n times.
Proof: For a CFst occurring in the same word, the detection probability for each execution of the transparent BIST process is 25%. The probability that the CFst escapes detection in n executions of the transparent BIST process is $(3/4)^n$. Thus, the detection probability is $1 - (3/4)^n$. Q.E.D.
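The bound of Theorem 4 can be evaluated numerically. The following is a minimal sketch, assuming independent executions with a per-execution detection probability of 25%; the function names `detection_probability` and `min_executions` are illustrative and do not appear in the original work.

```python
import math


def detection_probability(n: int, p: float = 0.25) -> float:
    """Probability that at least one of n independent executions detects the fault."""
    return 1.0 - (1.0 - p) ** n


def min_executions(target: float, p: float = 0.25) -> int:
    """Smallest n whose detection probability reaches target (0 < target < 1)."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(n, round(detection_probability(n), 4))
    print("executions for 99%:", min_executions(0.99))
```

Under these assumptions, ten executions give roughly 94.4%, which agrees with the figure quoted above.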
VII. EXPERIMENTAL RESULTS
To evaluate the performance of TRSMarch, we conducted several experiments using three different memory configurations. We assumed that four different fault types (SAF, TF, CFst, and CFid) can occur in the memory arrays. Each time, we randomly injected one fault into one memory module of each configuration and then used TRSMarch to detect the fault. The simulation process was repeated 5000 times for each fault type, and the results are shown in Fig. 12. From the test statistics, it can be found that SAFs, TFs, and CFids are completely detected. However, the fault coverage for CFsts is about 50%. This is because TRSMarch is a word-based test method, and many CFsts occurring in the same word cannot be detected by a single execution. Unless all background data has been exhaustively exercised, it is impossible to achieve complete fault coverage. The required numbers of read, write, and shift operations for each execution of TRSMarch are also presented in Fig. 12. In fact, the numbers are exactly those required for each memory configuration when it is physically tested on the chip. Here, we require 1200 read operations, 480 write operations, and 50,400 shift operations for each configuration. Clearly, the test time depends only on the maximum width and the maximum length among all memory modules in each configuration because of parallel testing. The advantage of our method becomes more significant as the number of memory modules in a memory configuration increases. In Fig. 12, we show two results for each memory configuration to avoid statistical bias.
Although each CFid occurring in the same word can be considered detected if the write operation dominates, the same does not hold for a CFst occurring in the same word. If a CFst occurs at two neighboring cells in the same word, then the static effect still exists after the write operation. To evaluate how well each CFst in the same word [represented by CFst (same word)] is detected by multiple executions of TBIST, a second experiment was conducted. First, we assumed that 5% of the total memory cells are defective, and all four fault types were randomly injected into all modules of memory configuration 1. Since we were concerned only with CFst (same word) faults, all other fault types were ignored. Finally, TRSMarch was executed to detect all injected CFst (same word) faults. If any CFst (same word) faults remained undetected, the background data of each memory array was changed and TRSMarch was executed again, until all of them were detected. The above process was repeated five times, but only three sets of results (i.e., curves test1-test3) are presented in Fig. 13. From this figure, it can be found that the detection ratio of each experiment increases sharply in the first several executions of TRSMarch and then slowly reaches complete detection. For test3 in Fig. 13, the cumulative detection ratios after 5 consecutive executions of TRSMarch are 59.3%, 77.9%, 90.7%, 97.7%, and 100%. Thus, 100% fault coverage for CFst (same word) faults requires only five executions of TRSMarch in this test, provided each execution has different background data. The above experiment was then repeated 20 times, and Fig. 14 presents the number of iterations (called r) required by TRSMarch to completely detect all CFst (same word) faults in each experiment. For example, in the first simulation, the TRSMarch algorithm must be executed four times (i.e., r = 4) to achieve 100% CFst (same word) fault detection. It can be found that no more than nine executions of TRSMarch are required in each of these 20 simulations.
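The re-execution loop of the second experiment can be imitated with a small Monte Carlo sketch. This is not the simulator used for Figs. 13 and 14: it assumes each remaining fault is detected independently with a fixed per-execution probability `p`, which is a free modeling parameter (Theorem 4 uses 25%; the measured first-execution ratio quoted above is higher). The function name and arguments are hypothetical.

```python
import random


def executions_until_all_detected(num_faults, p=0.25, rng=None, max_runs=10_000):
    """Count executions until every injected fault has been detected.

    Each still-undetected fault is assumed to be detected independently
    with probability p in every execution (fresh background data each time).
    """
    rng = rng or random.Random()
    remaining = num_faults
    runs = 0
    while remaining > 0 and runs < max_runs:
        runs += 1
        # A fault survives this execution when its random draw is not below p.
        remaining = sum(1 for _ in range(remaining) if rng.random() >= p)
    return runs


if __name__ == "__main__":
    rng = random.Random(2024)
    samples = [executions_until_all_detected(20, p=0.25, rng=rng) for _ in range(20)]
    print("r per simulated experiment:", samples)
```

Varying `p` and `num_faults` gives a rough feel for how r depends on the number of injected faults rather than on their distribution.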
To evaluate the relationship between r and fault distribution, the third experiment was conducted by injecting different ratios of CFst (same word) faults. Again, we assumed 5% of cells are defective and injected CFst (same word) faults to the memory arrays in configuration 1 with various probabilities. Other fault types were injected accordingly. For example, if CFst (same word) fault type was assigned 10%, then all other fault types such as SAF, TF, CFst (different word), and CFid occur equally likely. Similar to the second experiment, we finally considered only CFst (same word) faults in this experiment. Fig. 15 presents the number of iterations (r) required to achieve complete fault coverage for each simulation. From this figure, it can be observed that there is no tight relationship between r and fault distribution as demonstrated by three sets of test curves. 
VIII. DISCUSSION
Although the fault coverage of TRSMarch has been analyzed based on SAFs, TFs, and CFs, in this section we discuss the possibility of dealing with more advanced fault models. Recently, it has been found that many faults in DRAMs and SRAMs are sensitized by read operations, and these faults can be single-cell faults as well as coupling faults [9]. First, we discuss the detection of single-cell read-sensitive faults such as the read destructive fault (RDF), deceptive read destructive fault (DRDF), incorrect read fault (IRF), and random read fault (RRF). To simplify the following discussion, the read and circulate (scan1) operation for cell c1 is called the first read operation for c1 in each march element, while the read and write (scan2) operation for cell c1 is called the second read operation for c1. We can represent an RDF by $\langle r(c_1); S(\bar{c}_1); O(\bar{c}_1) \rangle$, where r(c1) represents a read operation performed at cell c1, S(·) represents the state of c1 after the read operation, and O(·) denotes the output of the read operation. It can be found that both the state and the output value are erroneous. Note that c1 can represent a cell location and a cell value without confusion. If cell c1 has logic value 0 initially, then we can represent the RDF as $\langle r(c_1 = 0); S(c_1 = 1); O(c_1 = 1) \rangle$. During testing, the first read operation of M1 for cell c1 gives the result $\langle r(c_1); S(\bar{c}_1); O(\bar{c}_1) \rangle$, while the second read operation for c1 gives $\langle r(\bar{c}_1); S(\bar{c}_1); O(\bar{c}_1) \rangle$. Thus, logic value c1 will be written into cell c1 by the write operation in M1. The first read operation of M2 for c1 gives $\langle r(c_1); S(\bar{c}_1); O(\bar{c}_1) \rangle$ and the fault is detected, because the first read operations of M1 and M2 expect different logic values for cell c1. Depending on the RDF and the logic value of c1, the RDF can be detected by M1 and M2 or by M2 and M3. For example, if the RDF is $\langle r(c_1 = 0); S(c_1 = 1); O(c_1 = 1) \rangle$ and the (initial) logic value of c1 is 1, then M2 and M3 jointly will detect the fault.
A DRDF at cell c1 can be represented by $\langle r(c_1); S(\bar{c}_1); O(c_1) \rangle$, and the first read operation of M1 for cell c1 will give $\langle r(c_1); S(\bar{c}_1); O(c_1) \rangle$. The second read operation of M1 for c1 gives $\langle r(\bar{c}_1); S(\bar{c}_1); O(\bar{c}_1) \rangle$, so logic value c1 is written into cell c1. Thus, the first read operation of M2 for c1 gives $\langle r(c_1); S(\bar{c}_1); O(c_1) \rangle$ and the fault is detected. Again, depending on the DRDF and the logic value of c1, the DRDF can be detected by M1 and M2 or by M2 and M3. By definition, an IRF has the behavior that a read operation performed on a cell returns the inverted logic value, while the state of the cell is not changed [i.e., $\langle r(c_1); S(c_1); O(\bar{c}_1) \rangle$].
To detect this fault, the first read operation of M1 for cell c1 gives $\langle r(c_1); S(c_1); O(\bar{c}_1) \rangle$. The second read operation of M1 for c1 again gives $\langle r(c_1); S(c_1); O(\bar{c}_1) \rangle$, and logic value c1 is written into cell c1. The remaining discussion is similar to that of the RDF and is omitted here. By definition, for an RRF, the read operation performed on a cell returns a random value, but the state of the cell is not changed. RRF detection can be analyzed in the same probabilistic manner as in Theorem 4.
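The single-cell arguments above can be checked with a toy model. The sketch below is a simplification and not the proposed hardware: it tracks one cell only, ignores the circular scan chain and the MISR, and assumes each fault is sensitized by a single stored value (`trigger`). The detection rule (the scan1 read of M2 must be the complement of that of M1, and the scan1 read of M3 must equal that of M1) is likewise an assumption used for illustration; all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FaultyCell:
    value: int
    fault: str = "none"   # "none", "RDF", "DRDF", or "IRF"
    trigger: int = 0      # stored value that sensitizes the fault (assumption)

    def read(self) -> int:
        stored = self.value
        if self.fault == "none" or stored != self.trigger:
            return stored
        if self.fault == "RDF":    # state and output are both inverted
            self.value = 1 - stored
            return self.value
        if self.fault == "DRDF":   # output is correct, state is inverted
            self.value = 1 - stored
            return stored
        if self.fault == "IRF":    # output is inverted, state is unchanged
            return 1 - stored
        raise ValueError(f"unknown fault type: {self.fault}")

    def write(self, v: int) -> None:
        self.value = v


def first_reads(cell: FaultyCell) -> list:
    """Apply M1, M2, M3 to one cell and collect each element's first (scan1) read."""
    outputs = []
    for element in range(3):
        outputs.append(cell.read())      # first read (scan1)
        if element < 2:                  # M1 and M2 also read and write back
            second = cell.read()         # second read (scan2)
            cell.write(1 - second)       # write the complement of what was read
    return outputs


def detected(outputs: list) -> bool:
    o1, o2, o3 = outputs
    return not (o2 == 1 - o1 and o3 == o1)


if __name__ == "__main__":
    for fault in ("none", "RDF", "DRDF", "IRF"):
        for init in (0, 1):
            outs = first_reads(FaultyCell(init, fault, trigger=0))
            print(fault, "initial", init, "reads", outs, "detected", detected(outs))
```

In this toy model, every RDF, DRDF, and IRF case is flagged for either initial value, while the fault-free cell is not.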
Detection of read coupling faults is more involved. The disturb coupling fault (CFds), incorrect read coupling fault (CFir), read destructive coupling fault (CFrd), and random read coupling fault (CFrr) are discussed here. In order to enhance the power of TRSMarch for detecting read coupling faults, M1, M2, M4, and M5 are extended to M1 = $\{r(a_i); \text{scan1}\}\ \{r(a_i); \text{circulate}; r(a_i)w(\bar{a}_i); \text{scan2}\}$, M2 = $\{r(\bar{a}_i); \text{scan1}\}\ \{r(\bar{a}_i); \text{circulate}; r(\bar{a}_i)w(a_i); \text{scan2}\}$, M4 = $\{r(a_i); \text{scan1}\}\ \{r(a_i); \text{circulate}; r(a_i)w(\bar{a}_i); \text{scan2}\}$, and M5 = $\{r(\bar{a}_i); \text{scan1}\}\ \{r(\bar{a}_i); \text{circulate}; r(\bar{a}_i)w(a_i); \text{scan2}\}$, where M1 and M2 (M4 and M5) are exercised in ascending (descending) address order. Basically, we just duplicate the circular test of scan1 in the scan2 process for each of march elements M1, M2, M4, and M5. For example, the enhanced M1 first reads a word from the memory array and then circulates the word content into the MISR. The process is repeated for the remaining words in ascending address order (scan1). In scan2, for each memory address, M1 reads a word (starting from the first word of the array) and circulates the word content into the MISR. Then, it reads the word again and writes the word into the array after complementing the logic values. The process is repeated for the remaining words in ascending order. Thus, three read operations are performed on each word and two signatures are generated by M1; both signatures have equal value if no faults exist. The same discussion can be applied to M2, M4, and M5. In the following discussion, r()-scan1 is called the first read operation, r()-circulate the second read operation, and the read operation of r()w() the third read operation for each march element.
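The sequence of operations in one enhanced march element can be written out as a short sketch. It is an illustration under simplifying assumptions: the memory is a plain list of integers, and the two MISR signatures are replaced by the raw streams of words that would be shifted in, so no signature compression is modeled. The function name and its parameters are hypothetical.

```python
def enhanced_march_element(memory, width, ascending=True):
    """Apply one enhanced element (M1/M2 ascending, M4/M5 descending) in place.

    Returns (scan1_stream, scan2_stream, read_count). The streams stand in
    for the two signatures and should match when the memory is fault-free.
    """
    mask = (1 << width) - 1
    if ascending:
        order = range(len(memory))
    else:
        order = range(len(memory) - 1, -1, -1)

    # scan1: first read of every word, circulated toward the MISR.
    scan1_stream = [memory[addr] for addr in order]
    reads = len(memory)

    # scan2: second read (circulate), then third read and complemented write.
    scan2_stream = []
    for addr in order:
        scan2_stream.append(memory[addr])   # second read
        word = memory[addr]                 # third read
        memory[addr] = ~word & mask         # write back the complement
        reads += 2
    return scan1_stream, scan2_stream, reads


if __name__ == "__main__":
    mem = [0b1010, 0b0001, 0b1111]
    s1, s2, reads = enhanced_march_element(mem, width=4)
    print(s1 == s2, reads, [format(w, "04b") for w in mem])
```

Applying the element twice restores the original contents in this model, which reflects the transparency requirement.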
Assume c1 (c2) is the coupling (coupled) cell, and the address of c1 is smaller than that of c2. Let $\langle c_1; r(c_2); S(c_2); O(c_2) \rangle$ represent the coupling relationship between c1 and c2 when cell c2 is read. Here, c1 represents the coupling cell, while r(c2) denotes the read operation performed at c2. Further, S(c2) gives the new logic value of c2, while O(c2) represents the output logic value of c2 for the read operation. Based on this notation, the first two read operations of M1 at c2 can be represented as $\langle c_1; r(c_2); S(c_2); O(c_2) \rangle$ and $\langle \bar{c}_1; r(c_2); S(c_2); O(c_2) \rangle$, respectively. Similarly, the first two read operations of M2 at c2 can be represented as $\langle \bar{c}_1; r(\bar{c}_2); S(\bar{c}_2); O(\bar{c}_2) \rangle$ and $\langle c_1; r(\bar{c}_2); S(\bar{c}_2); O(\bar{c}_2) \rangle$. Finally, the read operation of M3 at c2 can be represented as $\langle c_1; r(c_2); S(c_2); O(c_2) \rangle$. For example, in the first read operation of M1 for c2, the states of c1 and c2 are not changed since no write operations are involved. However, the state of c1 in the second read operation for c2 has changed. The reasons are: 1) c1 has a smaller address than c2; 2) M1 is an ascending march element; and 3) the third read operation of M1 reads a word and writes back the complemented word immediately. Now, if c1 and c2 are affected by a CFrd, then the fault will be detected by M1-M3. Assume c1 = 0 and c2 = 0 are left by normal operation, and c1 and c2 are affected by a CFrd when c1 = 1 and c2 = 1. The fault will be detected by the first read operation of M2 at c2 ($\langle \bar{c}_1; r(\bar{c}_2); S(\bar{c}_2); O(\bar{c}_2) \rangle$), since $S(\bar{c}_2)$ and $O(\bar{c}_2)$ will be changed to $S(c_2)$ and $O(c_2)$ [where $O(c_2)$ gives the incorrect logic value to the MISR]. Detection of other CFrd faults can be discussed similarly, since all four coupling relationships between c1 and c2 will be sensitized and observed. It is interesting to discuss the case where the initial states of c1 and c2 sensitize a CFrd.
Intuitively, a read coupling fault which occurs in the first read operation of M1 may not be detected, since there is no golden signature to compare against. Even worse, its result is used as the golden signature for all other march elements. Surprisingly, this special kind of read coupling fault can still be detected. Assume we have c1 = 0 and c2 = 0 left by normal operation, and c1 and c2 are affected by a CFrd when c1 = 0 and c2 = 0. In this case, the first read operation of M1 at c2 gives $\langle c_1 = 0; r(c_2 = 0); S(c_2 = 1); O(c_2 = 1) \rangle$ and the second read operation of M1 for c2 results in $\langle c_1 = 1; r(c_2 = 1); S(c_2 = 1); O(c_2 = 1) \rangle$. Further, the first read operation of M2 at c2 gives $\langle c_1 = 1; r(c_2 = 0); S(c_2 = 0); O(c_2 = 0) \rangle$ and the second read operation of M2 for c2 generates $\langle c_1 = 0; r(c_2 = 0); S(c_2 = 1); O(c_2 = 1) \rangle$. The fault is detected since the first and the second read operations of M2 generate different outputs. Similarly, M4-M6 can detect any CFrd where c1 has an address larger than c2.
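The trace for this special CFrd case can be replayed with a two-cell toy model. The sketch assumes one-bit words, places c1 at address 0 and c2 at address 1, and models the CFrd as flipping c2 (and returning the flipped value) whenever c2 is read while both cells hold the sensitizing values. The class and function names are illustrative only, and the MISR is replaced by lists of read outputs.

```python
class TwoCellMemory:
    """Cells c1 (address 0, coupling) and c2 (address 1, coupled)."""

    def __init__(self, c1, c2, trig=None):
        self.cells = [c1, c2]
        self.trig = trig  # (c1, c2) values that sensitize the CFrd, or None

    def read(self, addr):
        if addr == 1 and self.trig is not None and tuple(self.cells) == self.trig:
            self.cells[1] ^= 1      # the read destroys the content of c2 ...
        return self.cells[addr]     # ... and the flipped value is output

    def write(self, addr, value):
        self.cells[addr] = value


def enhanced_element(mem):
    """One ascending enhanced element; returns (scan1 outputs, circulate outputs)."""
    first = [mem.read(addr) for addr in (0, 1)]   # first reads (scan1)
    second = []
    for addr in (0, 1):                           # scan2
        second.append(mem.read(addr))             # second read, circulated
        mem.write(addr, mem.read(addr) ^ 1)       # third read, write complement
    return first, second


if __name__ == "__main__":
    mem = TwoCellMemory(0, 0, trig=(0, 0))
    for name in ("M1", "M2"):
        first, second = enhanced_element(mem)
        print(name, "first reads", first, "second reads", second,
              "mismatch" if first != second else "match")
```

In this model the two read streams of M1 agree, whereas those of M2 differ at c2, which mirrors the argument above that the fault is caught by M2.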
We next discuss the case of a CFir between c1 and c2 where c1 has a smaller address than c2. Again, the fault can be detected by M1-M3. Assume c1 = 0 and c2 = 0 initially, and c1 and c2 are affected by a CFir when c1 = 1 and c2 = 1. The fault will be detected by the first read operation of M2 at c2 ($\langle \bar{c}_1; r(\bar{c}_2); S(\bar{c}_2); O(\bar{c}_2) \rangle$), as in the case of CFrd. However, the detection of a CFir that is sensitized by the initial states of c1 and c2 is quite different. Again, assume c1 = 0 and c2 = 0 are left by normal operation, and c1 and c2 are affected by a CFir when c1 = 0 and c2 = 0. In this case, the first read operation of M1 at c2 gives $\langle c_1 = 0; r(c_2 = 0); S(c_2 = 0); O(c_2 = 1) \rangle$ and the second read operation of M1 at c2 results in $\langle c_1 = 1; r(c_2 = 0); S(c_2 = 0); O(c_2 = 0) \rangle$. The fault is detected by M1, since M1 expects the same output for both read operations. Similarly, M4-M6 can detect any CFir where c1 has an address larger than c2.
By definition, a CFds occurs if the v-cell undergoes a transition due to a wx or rx operation applied to the a-cell. We first discuss the case in which the CFds (denoted by CFdsr) occurs due to a read operation at the a-cell. Let c1 (c2) represent the a-cell (v-cell), where c1 has a smaller address than c2. The CFdsr can be detected by M1-M3. Assume c1 = 0 and c2 = 0 initially, and c1 and c2 are affected by a CFdsr when c1 = 1 and c2 = 1. Again, this fault can be detected by the first read operation of M2 at c2. We next analyze the case where the initial states of c1 and c2 sensitize a CFdsr. Assume we have c1 = 0 and c2 = 0 left by normal operation, and c1 and c2 are affected by a CFdsr when c1 = 0 and c2 = 0. In this case, the first read operation of M1 at c2 gives $\langle c_1 = 0; r(c_2 = 1); S(c_2 = 1); O(c_2 = 1) \rangle$ and the second read operation of M1 at c2 results in $\langle c_1 = 1; r(c_2 = 1); S(c_2 = 1); O(c_2 = 1) \rangle$. Further, the first read operation of M2 at c2 gives $\langle c_1 = 1; r(c_2 = 0); S(c_2 = 0); O(c_2 = 0) \rangle$ and the second read operation of M2 at c2 generates $\langle c_1 = 0; r(c_2 = 1); S(c_2 = 1); O(c_2 = 1) \rangle$.
The fault is successfully detected since M2 expects the same output for both read operations. Similarly, M4-M6 can detect any CFdsr where c1 has an address larger than c2. The case of a CFds due to a write operation can be discussed in the same manner. If the read operation applied to the v-cell returns a random value when the a-cell is in a certain state, then the fault is called a CFrr. CFrr detection can be analyzed in the same probabilistic manner as in Theorem 4 and is thus omitted here. Further, we emphasize that the added march operations for read coupling faults do not affect the detection of the SAFs, TFs, and CFs described in Section VII or of single-cell read-sensitive faults. Note that the added read operations will cause early detection of some single-cell read-sensitive faults. Finally, the enhanced TRSMarch algorithm can detect each read coupling fault discussed above when the a-cell and the v-cell are in different words. However, a read coupling fault that occurs within the same word may or may not be detected, and this case can be analyzed as in Theorem 4.
IX. CONCLUSION
In this paper, we have proposed an efficient transparent BIST method to concurrently test multiple memory arrays that are spatially distributed over the entire chip. To achieve this goal, a transparent test interface is developed to implement the functions of test pattern generation and test response evaluation. The interface is composed of a circular scan chain and several multiplexers and has the advantage of low hardware overhead. Test responses are evaluated by a single global MISR, which further reduces the hardware overhead significantly. We have also developed a signature analysis method that eliminates the tedious signature prediction process with almost no extra hardware cost. Based on the memory background data, an efficient march algorithm, TRSMarch, has also been developed to generate test patterns and expected test results. The TRSMarch algorithm contains six march elements and can detect SAFs, TFs, and every CFst and CFid occurring between different words. Under the assumption of write-dominating coupling, CFids do not exist in the same word. Theoretical analysis also indicates that each CFst occurring in the same word can be detected with high probability by a limited number of executions of TRSMarch. More advanced fault models, such as single-cell read-sensitive faults and read coupling faults, can be detected by adding more operations to the TRSMarch algorithm. By allowing redundant operations in TRSMarch, all memory arrays receive the same control signals, and this greatly reduces the hardware overhead without losing fault coverage. Another attractive feature of the proposed method is the capability of handling test interrupts on a real-time basis. Again, this is achieved with very limited hardware cost.
In summary, the proposed transparent BIST method has the following four advantages: 1) short test time because of parallel testing; 2) low hardware overhead because of the scan interface design, single set of control signals, and single global MISR; 3) short test interrupt response time because of the context switching design; and 4) high fault coverage because of the march operations. Although multiple applications of TRSMarch allow the detection of CFsts in the same word, in the future this should be improved to detect such faults as soon as possible. Further, detection of CFids in the same word when the coupling effect dominates must be considered as well. Extra multiplexers are added to deal with the test interrupt problem, and the effect of increased access time must be addressed by merging multiplexers (wherever possible) or by using buffers and D-FFs with high current supply.
