Abstract. This article presents a design strategy for efficient and comprehensive random testing of embedded random-access memory (RAM) in which the address, read/write and data input lines are not directly controllable and the data output lines are not externally observable. Conventional approaches frequently employ on-chip circuits such as a linear feedback shift register (LFSR), data registers and a multibit comparator to verify the response of the memory-under-test (MUT) against the reference signature of a fault-free gold unit. The proposed technique instead uses an efficient testable design, which accelerates test algorithms by a factor of $0.5\sqrt{n}$ if the RAM is organized into an $n \times 1$ array, and improves test reliability by eliminating the LFSR, which is known to have aliasing problems. Another serious problem in testing embedded memories with random test patterns is memory initialization, which is tackled here by adding word-line flag registers. The paper makes in-depth empirical studies of functional faults such as stuck-at, coupling, and pattern-sensitive faults by representing them as Markov chains and simulating these chains to derive the test lengths required for detecting them. The simulation results show that, in order to test a 1M-bit RAM for the common functional faults, the proposed technique needs only one second, as opposed to about an hour needed by conventional random testing in which memory cells are tested sequentially.
Introduction
As Ultra Large Scale Integrated (ULSI) chips with several billion transistors become realities, powerful computer systems with gargantuan memories, fast processors and complex control circuits for several types of analog and digital I/O data are envisioned to be integrated within a single chip or wafer. Most of the embedded memories in such ULSI/WSI chips will have poorly controllable address, read/write and input lines, and also poorly observable data output lines. Several strategies employing built-in self-test (BIST) circuits have been proposed in the past [1], [2], [3], [4], [5] for testing memories embedded within a ULSI/WSI chip. Jain and Stroud [1] proposed a simple scheme for testing embedded memories, and their scheme was incorporated in SRAM module generators so that embedded SRAMs could be automatically augmented with the proposed BIST circuit. Mazumder and Patel [6] proposed a parallel testing scheme in which, in the test mode, the decoder allowed the BIST hardware to simultaneously read and write multiple cells, so that the cells were verified by comparing them among themselves. Sridhar [5] employed a built-in parallel signature analyzer to simultaneously read and write all the cells in a word line, and he developed parallel versions of functional test algorithms which could be externally applied to the parallel signature analyzer through a serial scan-in line, or could be internally generated by a BIST circuit. BIST circuits in the above designs usually require large silicon overhead, and are also difficult to connect to memory arrays which are frequently distributed all over a chip in the form of register banks or small local memories.
*An abridged version of this article was published in the IEEE International Conference on Wafer-Scale Integration, January 1989. This research was partially supported by the NSF under grant number MIP-9013092 and by ONR under grant number 85-K-0716.
In order to tackle this problem, Nadeau-Dostie, Silbert and Agarwal [4] proposed a serial interfacing scheme in which several embedded memories shared the same BIST circuit, and therefore a considerable amount of routing overhead was saved. The serial interface allowed the BIST circuit to control a single bit of a RAM's (or a group of RAMs') input data path. Only one bit of the output data path was made available to the BIST circuit for observation while the test algorithms were executed. The other bits in a wide-word memory organization were controlled and observed indirectly through the serial data path built using the memory itself and a set of multiplexers.
In this article, a new strategy, which tests embedded memories by randomly generated test vectors, has been developed and examined as a viable alternative to deterministic built-in self-testing. Because of the presence of widely heterogeneous functional elements within a ULSI/WSI chip, random or deterministic test patterns applied at the external pins become fairly randomized, and they can frequently be applied directly to the embedded arrays without requiring a separate built-in pseudorandom pattern generator. The proposed random testing technique can be easily adapted to embedded memory arrays and is expected to test them more rapidly than the techniques used by previous researchers [7], [8].
The conventional random testing approaches often employ a pseudorandom generator to excite the address and control lines in a RAM, and the response is stored in a linear feedback shift register (LFSR). After applying a preassigned number of test vectors, the signature in the LFSR is compared to a fault-free signature obtained from a gold unit by applying the same set of pseudorandom test vectors. Two potential problems with such conventional approaches are: (i) an LFSR often introduces a small amount of error in the form of aliasing, and (ii) it is mandatory to initialize the memory cells deterministically so that the comparison of the signature of the memory under test (MUT) with that of the gold unit is valid. Moreover, since the memory cells are sequentially accessed in many conventional random testing schemes, an enormous number of test vectors is needed by these testing techniques. It has been shown that in an n-bit RAM all stuck-at faults can be tested by applying about 49n random patterns [7], as opposed to only 4n patterns in a Column Bar Test [9].
This article proposes an efficient strategy for testing embedded RAMs by pure random test patterns (as opposed to the preassigned sequences of conventional pseudorandom testing). The proposed technique eliminates the LFSR, and it employs a parallel memory testing strategy to reduce testing times by a factor of $0.5\sqrt{n}$. The problem of memory initialization has also been solved. The salient features of this paper are: (i) a new design strategy for testing embedded memories by pure random test patterns, (ii) elimination of the memory initialization problem, (iii) a comprehensive study of the random patterns required for testing functional faults in a RAM array, (iv) reduction of test complexity by a factor of $\sqrt{n}$, (v) a large fault coverage; for example, 1200n test vectors applied to an n-bit RAM can test all functional faults, including pattern-sensitive faults in any arbitrary 5-neighborhood, where individual cells may be located anywhere in the array, and finally (vi) improvement of the reliability of random testing by eliminating the LFSR and the storage register. The proposed design can be used as an alternative to BIST techniques which use deterministic algorithms. The large number of random test vectors (about 1800 per cell) needed to test functional faults such as stuck-at, 2-coupling [10], [11] and pattern-sensitive faults [12], [13] is used efficiently to test $0.5\sqrt{n}$ cells in parallel, and thereby the overall time required by the proposed random testing is less than 1 sec for testing a 1M-bit RAM with a 50 nsec memory cycle time. For each n-bit memory array, the proposed design uses about $2\sqrt{n}$ extra transistors in place of the LFSR, comparator and storage register in the existing designs.
The rest of the article is organized as follows. Section 2 analyzes in depth the fault coverage obtained by random test patterns. Markov chains represent the different functional faults, and the number of test vectors needed to detect these faults is estimated by numerical techniques. In order to detect the pattern-sensitive faults over a neighborhood of five cells, about 1803 million test vectors are required for testing a 1M-bit RAM array. This test time can be reduced by a factor of 512 by employing a new design-for-testability scheme discussed in Section 3. The scheme allows many memory cells to be tested in one memory cycle, and thereby it significantly reduces the test time. Moreover, since the faulty cells are identified by mutually comparing the contents of the memory cells that are always accessed simultaneously for read and write operations, the proposed design does away with the LFSR and signature storage register. Another potential advantage of the proposed design is that the memory need not be initialized in a deterministic manner. Conventional random testing requires that the memory be initialized before pseudorandom test patterns are applied, so that the resultant content of the LFSR can be compared correctly with that of the gold unit. In embedded applications, memory cells cannot be initialized externally.
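As a rough check of these figures, assuming one test vector per memory cycle (an assumption) and the 50 nsec cycle time quoted earlier, the arithmetic for a 1M-bit array ($n = 2^{20}$, $\sqrt{n} = 1024$) can be sketched as:

$$0.5\sqrt{n} = 0.5 \times 1024 = 512, \qquad \frac{1803 \times 10^{6}}{512} \approx 3.52 \times 10^{6}\ \text{vectors}, \qquad 3.52 \times 10^{6} \times 50\ \text{ns} \approx 0.18\ \text{s}.$$

This is consistent with the sub-second test time stated for the proposed design.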
A Comprehensive Study of Fault Coverage
In this section, a comprehensive study is made for estimating the test length coefficient (defined as the number of test vectors needed per memory cell) for 99.9% detection quality. Each type of functional fault is represented by an appropriate Markov chain, and the probability of fault detection has been enumerated by using numerical methods. The three most common types of functional faults are stuck-at, coupling, and pattern-sensitive faults [9].
Markov Model for Stuck-at Fault
Principally there are two types of operations on a memory cell: read and write. A fault-free read operation does not change the state of a cell, while a write operation may or may not change its state. If a write operation on a cell $x$ changes its current state $s(x)$ to $\bar{s}(x)$, then the operation is called a transition write. If $s(x)$ changes from 0 to 1, then the operation is denoted by $\uparrow(x)$. If $s(x)$ changes from 1 to 0, then the operation is denoted by $\downarrow(x)$. Faults in the memory cells are detected by transition write operations. If $\uparrow(x)$ is always faulty irrespective of the content of other cells, then the cell $x$ is said to be stuck-at 0. Similarly, if $\downarrow(x)$ is always faulty, the cell $x$ is stuck-at 1.
Figure 1 shows the Markov chain for a cell that is stuck-at 0. State $S_0$ represents that a memory cell contains 0, state $S_1$ represents that a memory cell contains 1, and state $S_2$ represents that the memory cell is read and the stuck-at 0 fault is detected. Let the probabilities of writing a 0, a 1, ... Initially, the memory cell is in state $S_0$ with probability $I_0$ ($= p_0(S_0)$) or in state $S_1$ with probability $1 - I_0$ ($= p_0(S_1)$). For a memory cell which is stuck-at 0, the fault is sensitized if a transition write $w_1$ is made. The cell cannot initially be in state $S_2$, which is an absorbing state, and, therefore, the probability $p_0(S_2)$ that the cell is in state $S_2$ before any test vector is applied is 0. In figure 1 and subsequently in all Markov chain diagrams, the initial states are shown by light circles, the states in which faults are sensitized are shown by double circles, and the detecting state is shown by a heavy circle. The detecting state is an absorbing state, and it is reached by a read operation from one of the sensitized states. The probability $p_L(S_2)$ that the fault is detected after $L$ test vectors are applied gives the confidence level and is known as the quality of detection (denoted by $Q_D$).
The probability that the fault is still not detected after $L$ test vectors are applied is known as the escape probability, $e = 1 - Q_D = p_L(S_0) + p_L(S_1)$, where $p_L(S_0)$ and $p_L(S_1)$ are the probabilities that, after $L$ test vectors are applied, the cell is in state $S_0$ and $S_1$, respectively. These probabilities can be computed from the following set of equations:
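One form these recurrences can take is sketched here under stated assumptions: each test vector addresses the cell with probability $p = 1/n$ and then performs a write-0, a write-1 or a read with probabilities $w_0$, $w_1$ and $r$, where $w_0 + w_1 + r = 1$. The symbols $p$, $w_0$, $w_1$ and $r$ are illustrative and are not necessarily the original notation.

$$p_{L+1}(S_0) = (1 - p\,w_1)\,p_L(S_0) + p\,w_0\,p_L(S_1)$$
$$p_{L+1}(S_1) = p\,w_1\,p_L(S_0) + \bigl(1 - p\,(w_0 + r)\bigr)\,p_L(S_1)$$
$$p_{L+1}(S_2) = p_L(S_2) + p\,r\,p_L(S_1)$$

The three right-hand sides sum to $p_L(S_0) + p_L(S_1) + p_L(S_2)$, so total probability is conserved at every step.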
The above equations represent linear recurrences and can be solved mathematically. In figure 2, the state probability diagram of the stuck-at 0 fault is shown, where the triplet $\pi_L = [p_L(S_0), p_L(S_1), p_L(S_2)]$ denotes the state probabilities after $L$ test vectors are applied to the initial condition (i.e., after $L$ iterations are made). It may be noted that at the 47th iteration the probability of detection, $p(S_2)$, reaches 0.999, starting from the initial value of 0, whether the cell starts in state $S_0$ or $S_1$. As can be seen in the subsequent fault models, the analytical solutions for these Markov chains with an absorbing state become quite cumbersome. Numerical techniques have, therefore, been applied in this article by running statistical packages on a Vax 11/780. Table 1 shows the quality of detection for the cell initialized either deterministically, with unity probability, to 0 (1), or randomly (i.e., initially in state $S_0$ or $S_1$ with probability 0.5). It can be seen that the probability of fault detection, $Q_D$, is a monotonically increasing function of the test length coefficient, and depends on the initial content of the cell. In figure 3, the quality of detection is plotted for various test length coefficients, and it can be seen that on average about 47 test vectors are required to detect the stuck-at faults for a quality of detection of 99.9%. Also plotted is the number of cells (samples) vs. the test length required to detect the fault. It can be seen that for a large number of samples the fault is detected by applying fewer than 10 vectors.
In this analysis, it is assumed that all the cells in the memory have a uniform access probability, $p = 1/n$, where $n$ is the number of cells. In practice, however, because combinational logic circuits are very frequently present on the address path, this access probability may not be uniform. If an address line is driven by a k-input AND or NOR gate, then the address line will contain 1 with a probability of $1/2^k$, and it will contain 0 with a probability of $1 - 1/2^k$.
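The iteration of the stuck-at 0 chain can be sketched numerically as follows. This is a minimal illustration, not the statistical package used in the study: the operation probabilities `w0`, `w1` and `r` (write-0, write-1, read) and their default values are assumptions, as is the function name. With these assumed values the resulting coefficient should land in the mid-40s, the same range as the figure of about 47 quoted above.

```python
def stuck_at_0_chain(n, w0=0.25, w1=0.25, r=0.5, i0=0.5, target=0.999):
    """Iterate the three-state stuck-at 0 chain until p(S2) >= target.

    n      : number of memory cells (uniform access probability 1/n)
    w0, w1 : assumed probabilities of a write-0 / write-1 on the accessed cell
    r      : assumed probability of a read on the accessed cell
    i0     : probability that the cell starts in state S0
    Returns (vectors, vectors / n), i.e. the test length and its coefficient.
    """
    pa = 1.0 / n                      # probability that a vector addresses this cell
    p0, p1, p2 = i0, 1.0 - i0, 0.0    # distribution over S0, S1, S2 (S2 absorbing)
    vectors = 0
    while p2 < target:
        new_p0 = (1.0 - pa * w1) * p0 + pa * w0 * p1          # stay in / return to S0
        new_p1 = pa * w1 * p0 + (1.0 - pa * (w0 + r)) * p1    # fault sensitized
        new_p2 = p2 + pa * r * p1                             # read from S1 detects
        p0, p1, p2 = new_p0, new_p1, new_p2
        vectors += 1
    return vectors, vectors / n


if __name__ == "__main__":
    for start in (1.0, 0.0, 0.5):     # initialized to 0, to 1, or randomly
        print(start, stuck_at_0_chain(1024, i0=start))
```

The loop tracks only one cell, so the memory size enters solely through the access probability; this is why the result is naturally expressed per cell as a test length coefficient.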
These address line probabilities will be exactly opposite if the line is driven by a k-input OR or NAND gate. Thus, the presence of combinational logic on the path of the address lines considerably modifies the access probability. Figure 4 illustrates how the test length coefficient changes with the signal probability of the address and data lines, where the signal probability has been varied over a wide range (from 0.1 to 0.9 in the case of the data line and from 0.25 to 0.75 in the case of the address lines, beyond which the test length tends to increase rapidly).
Markov Model for Coupling Fault
Two cells, say $x$ and $y$, are said to be 2-coupled if a transition write on $x$ changes the state of $y$ whenever $y$ is in a preassigned state, $a_y \in \{0,1\}$. Such a fault is denoted by the doublet $\langle \psi(x), a_y \rangle$, where $\psi$ is a transition write operation. Principally, there are four variants of this doublet, namely, $\langle \uparrow, 0 \rangle$, $\langle \uparrow, 1 \rangle$, $\langle \downarrow, 0 \rangle$, and $\langle \downarrow, 1 \rangle$. If only one of the above 2-coupling faults exists between two arbitrary cells $x$ and $y$, they are said to be 1-way coupled. The cells are said to be 2-way coupled if a combination of two or more of the above faults coexist. These faults can be denoted by $\langle \uparrow, X \rangle$, $\langle \downarrow, X \rangle$, $\langle \uparrow\downarrow, 0 \rangle$, $\langle \uparrow\downarrow, 1 \rangle$, and $\langle \uparrow\downarrow, X \rangle$, where $X$ is a don't care state.
For each type of coupling fault described above, a separate Markov chain can be constructed. In this section only one type of 1-way coupling fault is represented, corresponding to a doublet of the form $\langle \psi, 0 \rangle$: when the transition write $\psi$ is made on the coupling cell $x$, the coupled cell $y$ changes state only if it is in state 0. Figure 5 represents the Markov chain for the corresponding coupling fault. States $S_4$ and $S_5$ represent the fault being sensitized and correspond to <01> and <11>, respectively. If the cell $y$ is read in either of these two states, the fault is detected, and the corresponding absorbing state is $S_6$. The Markov transition matrix is constructed from figure 5 and, using numerical techniques, the quality of detection is enumerated (in figure 6) as a function of the test length coefficient. For a detection quality of 99.9%, a test length coefficient of about 220 (228) is required on average if the memory is deterministically initialized to 0 (1), and of about 225 if the memory is randomly initialized. The Markov chain analysis has also been made for the other types of two-cell coupling faults, and the test length coefficients for the different faults are shown in figure 6 and table 2.
In embedded DRAMs with destructive read operation, the cell which is read is rewritten with the original data during the restoration phase. Thus, the read operation is potentially vulnerable in the sense that, like transition write operations, it may introduce coupling faults. In figure 7, the Markov chain for the coupling fault $\uparrow x \Rightarrow \uparrow y$ with destructive read operation is shown. It may be noted that in the initial state, if $a_y = 0$ and the cell $x$ is read, the coupling fault might occur, which is shown by the heavy arc labeled $r_x$. To test DRAMs with destructive read operation, only about 87 test vectors are needed to detect the coupling faults with 99.9% confidence. In Appendix 1, the Markov chains for the different coupling faults are shown.
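A Monte Carlo cross-check of a coupling-fault chain can be sketched as below. It is an illustration only: the fault modeled is an up-transition on cell `x` flipping cell `y` from 0 to 1 (one assumed variant), the operation mix (read 0.5, write-0 0.25, write-1 0.25) is an assumption, and detection is modeled by comparing each read against a fault-free reference copy rather than by the circuit proposed later. All function names are hypothetical.

```python
import random


def vectors_to_detect(n, x, y, rng, max_vectors=1_000_000, faulty=True):
    """Apply random vectors until a read disagrees with the fault-free reference.

    Returns the number of vectors applied at detection, or None if the
    fault is not detected within max_vectors.
    """
    mem = [0] * n      # memory under test, deterministically initialized to 0
    ref = [0] * n      # fault-free reference copy
    for count in range(1, max_vectors + 1):
        addr = rng.randrange(n)                       # uniform cell access
        op = rng.choice(("r", "r", "w0", "w1"))       # assumed operation mix
        if op == "r":
            if mem[addr] != ref[addr]:
                return count                          # wrong data observed
        else:
            value = 1 if op == "w1" else 0
            rising = mem[addr] == 0 and value == 1    # 0 -> 1 transition write
            mem[addr] = value
            ref[addr] = value
            if faulty and addr == x and rising and mem[y] == 0:
                mem[y] = 1                            # coupled cell is corrupted
    return None


def coefficient_at_quality(n, trials, quality=0.999, seed=0):
    """Empirical test length coefficient at the given detection quality."""
    rng = random.Random(seed)
    lengths = [vectors_to_detect(n, 0, 1, rng) for _ in range(trials)]
    lengths = sorted(v for v in lengths if v is not None)
    index = min(len(lengths) - 1, int(quality * len(lengths)))
    return lengths[index] / n
```

A large number of trials is needed before the empirical 99.9% point becomes stable, which is one reason the chain-based numerical evaluation is the more economical route.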
Markov Model for Pattern-Sensitive Faults
A pattern-sensitive fault occurs in a neighborhood of 3 or more cells. A pattern sensitivity is called static (SPSF) if a transition write cannot be made in the presence of specific patterns in the neighborhood. Like the coupling faults, an SPSF is denoted by a doublet $\langle \psi, S \rangle$, where $S$ is the set of specific patterns in which the transition write operation $\psi$ on the base cell is faulty. A pattern sensitivity is called dynamic (DPSF) if a transition write operation $\psi$ on a cell in the neighborhood changes the state of the base cell, provided it is in the preassigned state. A DPSF is denoted by a triplet $\langle a_x, \psi(y), S \rangle$, which indicates that if the base cell $x$ is in state $a_x$ and a transition write $\psi$ is made on cell $y$ in the neighborhood, then the pattern-sensitive fault occurs if and only if the other cells contain the specific pattern $S$. Both the SPSFs and DPSFs can be analyzed using Markov chains.
Figure 8 represents an SPSF, $\langle \uparrow, S \rangle$, where the base cell $x$ cannot make an operation $\uparrow$ in the presence of a specific set of patterns $S$ in the neighborhood. Let the doublet <y,S> denote the different states in the Markov model for the fault. Let $S = 1$ denote the presence of the specific patterns in the neighborhood and $S = 0$ denote their absence. Initially, the cells may be in one of the four states $S_0$, $S_1$, $S_2$ and $S_3$, corresponding to the doublets <00>, <01>, <10> and <11>, respectively. States $S_4$ and $S_5$ denote that the fault is being sensitized, when y changes from 0 to 1 in the presence of $S = 1$. State $S_6$ detects the fault and is the absorbing state. As with the other faults, we computed the test length coefficients for different qualities of detection and for different neighborhood sizes. This is shown in figure 9. Table 3 shows the test length coefficient for SPSFs with different neighborhood sizes.
Fig. 7. Coupling fault in DRAM with destructive read.
There are four states, indicated by double circles, which sensitize the faults, and it can be seen that only two of these states can be entered from the initial states when $S = 1$. After the fault is sensitized, it is detected when the base cell is read, which leads to the absorbing state represented by the heavy circle. Figure 11 shows the quality of detection as a function of the test length coefficient. Table 4 shows the test length coefficient for DPSFs with different neighborhood sizes.
A New Testable RAM Design
From the above analysis it can be seen that, to achieve a detection quality of 99.9%, about 1800 million random patterns must be applied to test all functional faults in a 1M-bit memory chip organized into a 1K x 1K array. Moreover, as discussed in the earlier sections, the conventional approaches using an LFSR and comparator suffer from several limitations, such as aliasing, relatively large area overhead and initialization problems. In this section, we demonstrate how these problems associated with conventional random testing can be circumvented by introducing a new design-for-testability approach. In the proposed architecture, the memory subarrays are reorganized as shown in figure 12. An $n/d \times d$ memory is organized into $d$ subarrays of size $n/d \times 1$. Each subarray utilizes the organization of parallel testing discussed in [14], [15]. By utilizing this testable design, not only is the overall test length reduced by a factor of $\sqrt{n/d}$, but the LFSR and storage register are also eliminated in random testing. In order to access multiple cells, the bit-line decoder has been modified such that all the even lines or all the odd lines are accessed simultaneously. This is achieved by adding two extra transistors, Q8 and Q9 (in figure 13), to the decoder, which select the odd/even bit lines together through SELECT, independent of the address bit pattern. If the write operation is selected, all the even (odd) bit cells in a word line are simultaneously written with 0 or 1, based on the data in the Data-In buffer. During a read operation, the contents of these cells are compared together by a parallel comparator, which detects an error if the even (odd) cells in a word line do not all match. A mismatch is indicated in the error latch by setting the ERROR line high. Figure 14 shows the parallel comparator and error detector. The p-channel transistors $T_1, \ldots, T_{m-1}$ are connected in parallel and detect a concurrent occurrence of 1's in the bit lines.
The n-channel transistors $P_1$, ... The output of the Ex-Or gate is connected to an error latch consisting of transistors $V_0, \ldots, V_3$. The error latch output is ERROR=0 when the selected bit lines are identical. If the bit lines are not identical, then both $S_1$ and $S_2$ remain cut off and the detector output is 1. This triggers the error latch, setting its output to ERROR=1. During the write phase and the normal mode of operation, the error latch is clamped to zero by $V_4$. The error detector is inhibited by the discharge transistor $P_m$ at the start of the read phase, when the sense amplifier outputs are not identical because of sluggish changes in some of the sense amplifiers.
The conventional RAM testing algorithms can be easily applied to this new architecture, and a test speedup of 1000 can be achieved for a 4M-bit RAM. New parallel algorithms for testing pattern-sensitive faults have been proposed in [6], and a built-in self-test circuit has been designed for the proposed test algorithms. The proposed BIST design does not require much extra circuitry and does not degrade the memory cycle time in normal applications. It can be applied in embedded applications where the memory is organized into a single large array inside the ULSI/WSI chip. The intent of this article is to demonstrate how this testable design can be applied for random testing in embedded applications where the memory is distributed in multiple arrays inside a ULSI/WSI chip. Because of the routing complexity associated with conventional self-testing techniques, random pattern testing is likely to be more desirable in many applications where embedded registers and small SRAMs will be scattered all over a chip.
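The even/odd parallel write and the mutual comparison on read described above can be modeled behaviorally as in the following sketch. It abstracts away the transistor-level comparator and latch entirely; the function names and the list-of-bits representation of a word line are assumptions made for illustration.

```python
def parallel_write(word_line, parity, bit):
    """Write `bit` into every even (parity=0) or odd (parity=1) cell of a word line."""
    for col in range(parity, len(word_line), 2):
        word_line[col] = bit


def parallel_read(word_line, parity):
    """Read all even (or odd) cells at once and compare them with each other.

    Returns (value, error): error is 1 when the selected cells are neither
    all 1's nor all 0's, mirroring the role of the ERROR line.
    """
    selected = word_line[parity::2]
    all_ones = all(b == 1 for b in selected)
    all_zeros = all(b == 0 for b in selected)
    error = 0 if (all_ones or all_zeros) else 1
    return selected[0], error


if __name__ == "__main__":
    line = [0] * 8
    parallel_write(line, 0, 1)        # even cells written together
    print(parallel_read(line, 0))     # matching cells, no error
    line[4] = 0                       # inject a faulty even cell
    print(parallel_read(line, 0))     # mismatch raises the error flag
```

Because the cells are compared only among themselves, no reference signature is involved; the limitation of this abstraction is that a fault affecting every selected cell identically would go unnoticed by the comparison alone.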
The shortcomings of conventional random testing can be circumvented if the proposed testable architecture is employed for random testing. Since the cells written together are compared simultaneously, the problem of observability does not exist, and neither an LFSR nor any predetermined pseudorandom sequence is necessary to test the memory. In order to ensure that the technique does not need any deterministic initialization, a latch is added to the word-line decoder and is set whenever a write operation is done utilizing that word line. While the word line is read, the result of the comparison activates the Error Detector only if the latch is set; otherwise, the Error Detector is disabled, and thereby incorrect comparisons due to arbitrary initial data patterns in an uninitialized memory plane are not allowed to corrupt the testing scheme. The modified circuit with write enable latches for the word lines is shown in figure 15. For each word line, there are two latches. Table 5 compares the proposed random testing with the conventional random testing schemes in which an LFSR is used and the cells are tested in a sequential fashion.
The proposed design tests the memory by accessing multiple cells in a word line during a single memory cycle, and thereby the overall time to test the embedded RAM is considerably reduced in comparison to many conventional approaches that use sequential cell testing. The article has made detailed empirical studies of the well-known functional faults, namely stuck-at, coupling and pattern-sensitive faults, using the Markov chain model for the various types of faults. From the analysis, it is observed that, in order to test a 16M-bit (1M-bit) RAM for these commonly occurring functional faults, the proposed technique needs only 4 (1) seconds, as opposed to about 16 (4) hours in conventional random testing where memory cells are tested sequentially (see figure 17).
The proposed testable design uses about $2\sqrt{n}$ extra transistors to build a multibit 0/1 detector, and evidently has low silicon overhead. The modified memory architecture can also be used for external testing and built-in self-testing by deterministic test algorithms [6], [16].
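The role of the word-line write latches in avoiding deterministic initialization can be illustrated with a small behavioral sketch. It assumes that the two latches of a word line correspond to the even and odd cell groups, which is an interpretation rather than something stated explicitly; the class and method names are hypothetical.

```python
class FlaggedWordLine:
    """Word line whose error detector is enabled only after a parallel write."""

    def __init__(self, cells):
        self.cells = list(cells)         # arbitrary power-up contents
        self.written = [False, False]    # assumed: one write latch per group (even, odd)

    def write(self, parity, bit):
        """Write `bit` to all even (parity=0) or odd (parity=1) cells and set the latch."""
        for col in range(parity, len(self.cells), 2):
            self.cells[col] = bit
        self.written[parity] = True

    def read_error(self, parity):
        """Return 1 only if the group was written before and its cells now disagree."""
        selected = self.cells[parity::2]
        mismatch = len(set(selected)) > 1
        return 1 if (mismatch and self.written[parity]) else 0


if __name__ == "__main__":
    line = FlaggedWordLine([1, 0, 0, 1, 0, 1, 1, 0])   # uninitialized garbage
    print(line.read_error(0))    # mismatch ignored: latch not yet set
    line.write(0, 1)
    line.cells[2] = 0            # a cell that lost its data after the write
    print(line.read_error(0))    # mismatch now reported
```

Reads of a group that has never been written are simply ignored, so random vectors can be applied to a memory with unknown initial contents without producing false errors.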
