ABSTRACT
INTRODUCTION
Dual-port random access memories (DPRAMs) allow simultaneous access to stored data from two ports, as compared to access from only one port in conventional single-port RAMs (SPRAMs). Whereas this may be used to speed up testing of DPRAMs, it complicates both the array fault model and the decoder fault model, resulting in more complex test algorithms. Fault modeling for SPRAMs has been thoroughly investigated [1]. However, in spite of the growing use of DPRAMs, limited work on the testability issues of these devices has been reported. An ad-hoc test technique with no specific fault model was described in [2]. Serial test algorithms for an embedded DPRAM were reported in [3], covering mostly single-port faults. To account for the dual-port nature of the memory, a special shadow write operation was designed to test for bridging faults between bit lines and between word lines of opposite ports. In [4], test algorithms which cover stuck-at faults, bridging faults, and transistor stuck-open/ON faults were reported. The above approaches do not take into account complex pattern sensitive failures, which become more common as transistor and memory cell sizes get smaller [5]. The notion of complex coupling faults for DPRAMs was introduced in [6], where O(n²) tests were developed. By imposing topological restrictions on the relative locations of the coupling and the coupled cells, the test length was reduced to O(n) [7]. Even though this approach accounts for the special features of DPRAMs, it still uses the single-port decoder fault model, overlooking complexities introduced by the second port.
To properly address the testability problem of DPRAMs, new fault models covering complex pattern sensitive faults should be adopted, the effect of the added complexity of the second port should be considered, and efficient test algorithms should be developed. An efficient test algorithm should have a low complexity to maintain reasonable test time for large memories, and it should also be simple enough to implement as on-chip BIST logic without incurring unacceptable chip area overhead or degrading performance. This paper achieves these goals by introducing new fault models and appropriate circuit modifications. The proposed array fault model covers a new class of pattern sensitive faults, Duplex Dynamic Neighborhood Pattern Sensitive Faults (DDNPSF), which account for failure modes expected in this type of device.
An O(√n) array test algorithm covers, in addition to DDNPSF faults, all stuck-at faults, all static neighborhood pattern sensitive faults for a neighborhood of size 5, and a restricted class of complex coupling faults [6]. An O(√n) test algorithm verifies a modified decoder fault model which accounts for the interaction between the address decoders of both ports. The proposed test algorithms are simple enough to implement as BIST logic. The low [...] to the O(√n) reported in [8]. Such simplicity and area efficiency are essential for efficient BIST implementation. The number of cells selected in parallel is independent of the address, as opposed to the scheme reported in [9].
The adopted memory array tiling [10] partitions the memory cells into 8 distinct groups, leading to a test algorithm complexity with a higher constant multiplier compared to the 5-group tiling proposed in [SI. Whereas the 5-group tiling fits the DFT approach adopted in [SI, its use would result in a complex BIST implementation, since the power-of-two nature of practical memory array sizes does not allow simple division by 5. The 8-group tiling is adopted since it leads to a simpler test algorithm and BIST implementation. In addition to parallel access of row line data, parallel comparison of data is performed by two multiple-input comparators which flag an error signal whenever non-identical data are detected.
The DPRAM model consists of p sub-arrays, each of r x q dual-port memory cells, for a memory size of n = pqr.
The BIST logic is designed to test the p sub-arrays in parallel, thus cutting the test time by a factor of p. In the following analysis, only one such sub-array is considered. Each sub-array has r rows and q columns of dual-port memory cells. Each cell has two identical access ports, a and b. Each access port is associated with one row/word line and one set of bit/data line(s). Both row lines are identified by the same row address, and both sets of bit lines are identified by the same column address. Thus, each access port has its own row decoder, column decoder, sense amplifiers, and input/output buffers. Control, timing and arbitration circuitry is common to both ports. The model assumes the use of one sense amplifier per bit line.
TESTABILITY ADDED FEATURES
A number of design modifications are proposed to simplify the BIST logic and allow efficient O(√n) test algorithms. In the array test mode, the modifications provide parallel access of multiple cells on the two addressed rows. Each row in the memory array is partitioned into exactly four sectors, with each sector having q/4 bits. Each port may write data into one full sector on any given row, and therefore the column decoder should allow accessing a total of q/4 bits in parallel. To achieve this, two circuit modifications are necessary. First, the write amplifier should be made powerful enough to drive the accessed bit lines. This, however, should not result in unacceptably high current spikes. If necessary, the write cycle during the test mode can be extended to avoid such spikes. Second, the column decoders of both ports should allow selection of multiple bit lines in the array test mode. This is accomplished by allowing only two column addresses (Ay0 and Ay1) to assume arbitrary values while the true and complement outputs of the other column address buffers are forced to a logic 1 state (by setting control signal C2 to logic 0 as shown in Figure 1).
The array test algorithm includes verification read steps which verify the integrity of stored data in the array. These steps verify that half the array cells (the base cells) contain certain identical background data: either all 0's or all 1's. This can be accomplished using rq/2 single read operations or rq/4 double read operations (using both ports). Instead, two coincidence comparators are used to simultaneously verify two sets of data. Each set consists of data stored in half the cells on one of the two accessed row lines (q/2 cells). The cells of one set (accessed by one port) are the odd-numbered cells on one row, while the cells of the other (simultaneously accessed by the other port) are the even-numbered ones on another row.
Thus, verifying half the array cells (rq/2 cells) needs only r/2 such parallel verification steps. According to the memory tiling used, exactly two sectors per row per port will be involved in such a verifying read operation. Thus, in one verification read step both ports are used to access four sectors on two different rows in parallel. Data verification is accomplished by adding one parallel coincidence comparator circuit for each port. The inputs to the comparator of port a (port b) are the outputs of the sense amplifiers of half the array bit lines, corresponding to the even-numbered (odd-numbered) columns of the port.
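The parallel selection and verification described above can be sketched behaviourally. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that Ay1 and Ay0 are the two least-significant column address bits, which the text does not state. Under that assumption each sector is the set of columns sharing the same two low address bits, and the even-numbered columns are exactly two of the four sectors.

```python
def selected_columns(q, ay1, ay0):
    """Columns selected in array test mode for a q-column sub-array.

    Assumption (not stated in the text): Ay1 and Ay0 are the two
    least-significant column address bits. All other address bits are
    forced so that both their true and complement lines are 1, so they
    no longer restrict the selection.
    """
    sector = (ay1 << 1) | ay0
    return [c for c in range(q) if (c & 0b11) == sector]


def coincidence_error(bits):
    """Behavioural coincidence comparator: 1 if the inputs are not all identical."""
    return int(len(set(bits)) > 1)


def verification_steps(r):
    """Parallel verification steps needed to check rq/2 base cells.

    Each step checks q/2 cells through port a and q/2 cells through
    port b, i.e. q cells per step, so rq/2 cells need r/2 steps.
    """
    return r // 2
```

With q = 16, each (Ay1, Ay0) combination selects q/4 = 4 columns, and a comparator fed with an all-0 or all-1 background reports no error.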
Figure 2 shows the logic diagram of the comparator circuitry with inputs from the even-numbered bit lines. It consists of two NOR gates N1 and N2, the outputs of which (α and β) are inputs to a third NOR gate N3. The output of N3 is the "Error" signal. Inputs to N1 are the true outputs (S0, S2, ..., Sq-2) of the even-numbered sense [...]
It should be noted that the double simultaneous transitions of the two deleted neighborhood cells can be either in the same direction or in opposite directions. If the transitions are in the same direction, they can be either positive (where the two cells change states from 00 to 11), or negative (where the two cells change states from 11 to 00). If the transitions are in opposite directions, i.e. mixed, one cell changes state from 0 to 1 while the other changes from 1 to 0. Therefore, DDNPSFs can be further classified as either positive, negative, or mixed. While the effect of a single transition in a deleted neighborhood cell may not be strong enough to show as a fault in the base cell, the effect of two simultaneous such transitions is stronger and is more likely to show as a DDNPSF if the effects of both transitions are additive. Such an additive effect will occur when the double transitions are in the same direction. It is also reasonable to assume that double transitions in the same direction will sensitize most faults caused by single transitions in either of the two deleted neighborhood cells undergoing the transition. In other words, most DNPSFs are detectable by DDNPSF tests. Therefore, we will limit our discussion to positive and negative DDNPSFs. To test for positive (negative) DDNPSFs, each base cell must be read in state 0 and in state 1, for all possible positive (negative) duplex changes in the deleted neighborhood patterns. To minimize the total number of double write operations required to step through all such changes in the deleted neighborhood patterns, we extend the notion of Eulerian sequence [S, 13 to handle DPRAMs.
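As a rough check on the size of the problem, the positive and negative duplex transitions can be enumerated as arcs of a graph whose vertices are the deleted neighborhood patterns. The sketch below is an illustration under stated assumptions, not the authors' construction: the function names are hypothetical, and patterns are encoded as k-bit integers. Because a same-direction double transition preserves the parity of the number of 1's, the graph splits into two components, which the sketch counts separately.

```python
from itertools import combinations


def duplex_arcs(k):
    """Directed arcs (u, v) between k-bit deleted neighborhood patterns
    that differ in exactly two bits changing in the same direction."""
    arcs = []
    for i, j in combinations(range(k), 2):
        mask = (1 << i) | (1 << j)
        for u in range(1 << k):
            if u & mask == 0:          # both target cells currently 0
                v = u | mask
                arcs.append((u, v))    # positive duplex transition 00 -> 11
                arcs.append((v, u))    # negative duplex transition 11 -> 00
    return arcs


def arcs_by_parity(k):
    """Number of directed arcs in the even-weight and odd-weight components."""
    counts = {0: 0, 1: 0}
    for u, _ in duplex_arcs(k):
        counts[bin(u).count("1") % 2] += 1
    return counts
```

For k = 3 each component has 6 directed arcs, and for k = 4 each has 24, so an Eulerian subtour of a component needs that many double write operations.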
Table 1. Each of subtour1 and subtour2 has a length of 6. In addition, a link of two double write operations is needed to move from the end of subtour1 (000) to the beginning of subtour2 (111). Thus, such a test sequence has a length of 14. In addition to testing for DDNPSFs, the EES covers SNPSFs as well, since all of the graph vertices are visited.
Subtour1 is an optimal Eulerian subtour of length 24 and is obtained by traversing each path between 0000 and 1111 in both directions. One such subtour is shown in Table 2. An optimal Eulerian subtour2 of length 24 for the second subgraph of Figure 6(b) is shown in Table 3. Therefore, an optimal extended Eulerian sequence for k = 4 will have a length of 49. It consists of subtour1 (length 24), a link (length 1) from the end of subtour1 to the beginning of subtour2, and subtour2 (length 24).
[...] i.e. traversing arc ag in subgraph 1 of the DDNPSF graph (Figure 6(a)). A series of (00 to 11) double write operations to sectors S and W on rows (0, 1), (2, 3), (4, 5), ..., (r-2, r-1) will test base cells e and s for the positive duplex transition 0000 to 0011, while base cells n and w are only tested for two successive single transitions from 0000 to 0010 then to 0011. Therefore, to test all base cells for all possible duplex transitions, a proper arc traversal procedure is followed. In this procedure, a given bi-directional arc is traversed in both directions (e.g., 00 → 11 → 00) in a repeated manner such that all base cells are covered. The procedure also includes verification cycles in between global write operations. As an example, consider the case of traversing the bi-directional arc ag (0000 ↔ 0011) in subgraph 1 (Figure 6(a)). A series of double write operations to all cells designated W and S in the memory array needs to be performed.
For all the n, e, w, s base cells to be exposed to the required deleted neighborhood duplex transitions (0000 to 0011 and 0011 to 0000), such a series of writes is performed in two phases with two passes per phase. Depending on the row addresses of the W and S sectors selected to be written simultaneously, either cells e and s (phase 1) or cells n and w (phase 2) will be subjected to the required positive and negative duplex transitions (0000 ↔ 0011) in their deleted neighborhoods. In the first pass of the first phase, a sequence of double write operations (0000 → 0011) to sectors S and W on rows (0, 1), (2, 3), (4, 5), ..., (r-2, r-1) is performed in r/2 double write cycles (see Figure 7). This pass will test base cells e and s for the positive duplex transition 0000 to 0011, while base cells n and w are only tested for two successive single transitions from 0000 to 0010 then to 0011. This is followed by a sequence of verification read cycles (r/2 cycles) to verify the data in the base cells. In the second pass, a sequence of (0011 to 0000) double write operations is applied to the same set of W and S sectors (r/2 cycles). This will test base cells e and s for the negative duplex transition 0011 to 0000, while base cells n and w are only tested for two successive single transitions from 0011 to 0001 then to 0000. This is also followed by another sequence of verification read cycles (r/2 cycles). Thus, in the first phase, base cells e and s are tested for the positive and negative duplex transitions (0000 ↔ 0011) in their deleted neighborhood. This is accomplished in r double write cycles and r verification read cycles for a total of 2r cycles. The second phase is similar to the first, with the exception that instead of cells e and s, base cells n and w are tested for the positive and negative duplex transitions (0000 ↔ 0011) in their deleted neighborhoods.
This is accomplished by a different grouping of the two sectors being written simultaneously (W and S), where the double write operations are performed on sectors W and S on rows (1, 2), (3, 4), (5, 6), ... . Traversing arcs which require duplex transitions in the (S, E), the (E, N), or the (N, W) sectors will be handled similarly, and each would require 2r double write cycles and 2r verification read cycles for a total of 4r cycles. However, traversing arcs which require duplex transitions in the (S, N) or the (E, W) sectors needs three phases and therefore requires 3r double write cycles and 3r verification read cycles for a total of 6r cycles. This is due to the fact that sectors of these groups fall on the same row and column, which requires slightly different handling. Thus, for duplex transitions applied to the N, S (or E, W) deleted neighborhood cells, i.e. traversing the arc 0XX0 ↔ 1XX1 (X00X ↔ X11X), one phase is required to sensitize DDNPSF faults for the n, s (e, w) base cells, a second phase for the w (n) base cells, and a third phase for the e (s) base cells.
A finite state machine (FSM) is designed to step through the EES and determine the next arc traversal operation to be carried out, based on the current content of the deleted neighborhood. For a minor increase in the length of the test sequence, the complexity of the FSM can be reduced. To achieve this, the optimal Eulerian subtour2 shown in Figure 6(b) is replaced with subtour2* shown in Figure 8 and Table 4. Subtour2* starts at vertex A by traversing the three bi-directional arcs emanating from A, i.e., AH, AG, and AF. Moving to vertex B (using the added link shown in Figure 8 as a dotted line AB), the bi-directional arcs emanating from B, i.e., BH, BG, and BE, are traversed. Then move to C and so on. The extra links added to subtour2* (AB, BC, and CD) would slightly increase the test sequence length but will lead to a simpler implementation of the FSM.
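The per-arc cycle counts given above for the arc traversal procedure can be tabulated with a small helper. This is a minimal sketch with hypothetical names; it only encodes the counts stated in the text (two phases for the (W, S), (S, E), (E, N) and (N, W) sector pairs, three phases for (S, N) and (E, W), with each phase costing r double write cycles and r verification read cycles).

```python
# Sector pairs that need two phases per bi-directional arc.
TWO_PHASE_PAIRS = {frozenset(p) for p in [("W", "S"), ("S", "E"), ("E", "N"), ("N", "W")]}
# Sector pairs that need three phases per bi-directional arc.
THREE_PHASE_PAIRS = {frozenset(p) for p in [("S", "N"), ("E", "W")]}


def arc_traversal_cycles(sector_pair, r):
    """Return (double_write_cycles, verification_read_cycles, total)
    for traversing one bi-directional arc in both directions."""
    pair = frozenset(sector_pair)
    if pair in TWO_PHASE_PAIRS:
        phases = 2
    elif pair in THREE_PHASE_PAIRS:
        phases = 3
    else:
        raise ValueError(f"not a pair of distinct sectors: {sector_pair!r}")
    # Each phase has two passes of r/2 writes and r/2 reads each.
    writes = phases * r
    reads = phases * r
    return writes, reads, writes + reads
```

For a sub-array with r = 64 rows, an arc on the (W, S) sectors costs 4r = 256 cycles, while an arc on the (S, N) sectors costs 6r = 384 cycles.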
To point at the target sectors x, y ∈ {N, E, W, S} for the next double write operation, a 4-bit register, NEWS-REG, is used. The bits of NEWS-REG have a one-to-one correspondence with the N, E, W, S sectors such that a given bit is 1 if and only if the corresponding sector is to undergo a transition write operation. Thus, NEWS-REG = 1001 specifies double transition write operations into sectors N, S. As illustrated by the following example, the contents of NEWS-REG are simply determined using two additional 4-bit registers, R1 and R2. The added cost for this simplicity is the slight increase in the length of subtour2* as compared to subtour2.
Example: At vertex A (Figure 8), (N, E, W, S) = (1110).
To reach vertex H, (N, E, W, S) = (1000), negative double transition writes to sectors E, W are performed (i.e., NEWS-REG should be 0110). This is achieved by the following simple procedure. The deleted neighborhood pattern corresponding to the starting point A is complemented (0001) and loaded into R1 and into R2, which is used as a mask. The following two steps produce the required value of NEWS-REG:
a) step 1: Cyclic Shift Right R1
b) step 2: NEWS-REG = R1 NOR R2
At this point NEWS-REG contains the correct value, 0110, indicating that E, W are the target sectors for the double transition write operations required to traverse the bi-directional arc AH. Repeating steps 1 and 2, NEWS-REG will contain 1010, providing the locations for the transition writes required for the next bi-directional arc AG. This is true for all bi-directional arcs emanating from A, and is also true for each of B, C, and D provided that the proper value of the starting point is used (i.e., 0010 for B, 0100 for C, and 1000 for D). As shown in Figure 9, after traversing all bi-directional arcs emanating from A, R1 automatically contains the starting point for vertex B (and so on). Moreover, the value of NEWS-REG to effect an internal link, e.g. AB, is obtained by simple ORing of the contents of R1 and R2. Figure 10 shows pseudo-code for the partial testing procedure handling subtour2*.
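The register manipulation in this example can be traced with a short sketch. It is not the pseudo-code of Figure 10; the function names are hypothetical, and the registers are modelled as 4-bit integers with bit order (N, E, W, S) from most to least significant bit.

```python
MASK4 = 0b1111


def rotate_right4(x):
    """Cyclic shift right of a 4-bit value."""
    return ((x >> 1) | ((x & 1) << 3)) & MASK4


def news_reg_sequence(vertex_pattern):
    """Trace NEWS-REG for the three bi-directional arcs leaving a vertex.

    Returns (arc_targets, link_target, r1_after), where arc_targets are
    the successive NEWS-REG values (R1 NOR R2), link_target is R1 OR R2
    after the last arc, and r1_after is the final content of R1.
    """
    r1 = r2 = ~vertex_pattern & MASK4       # complemented pattern; R2 is the mask
    arc_targets = []
    for _ in range(3):
        r1 = rotate_right4(r1)              # step 1: cyclic shift right R1
        arc_targets.append(~(r1 | r2) & MASK4)  # step 2: R1 NOR R2
    link_target = (r1 | r2) & MASK4         # internal link, e.g. AB
    return arc_targets, link_target, r1


targets, link, r1_after = news_reg_sequence(0b1110)  # vertex A
print([format(t, "04b") for t in targets], format(link, "04b"), format(r1_after, "04b"))
```

Starting from vertex A (1110), the first two NEWS-REG values are 0110 and 1010 as stated in the example, and R1 ends at 0010, the starting point quoted for vertex B.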
The same hardware can be used in a similar manner to step through subtour1. Using the arc traversal procedure to traverse arcs of subtour1, however, requires an additional link between vertices a (0000) and h (1111). The DDNPSF graph for the modified subtour1, designated subtour1*, is shown in Figure 11 and the corresponding pseudo-code is shown in Figure 12.
Data written into the two ports in the array test mode are specified using a two-bit register DATA-REG (1 bit per port). It is noted that data written while traversing the bi-directional arcs within subtour1* and subtour2* are identical for both ports. This is because the double [...] inputs of the comparators using one verification read cycle. This is followed by a sequence of q+1 double write and verification read cycles to fully test both comparators.
The pseudo-code for the complete memory array test procedure is shown in Figure 13 . Each initialization step (e.g., step 1 or step 5) is performed in 2r cycles, since two sectors will be written by both ports per cycle. The link operation of step 3 is performed in r/2 double write operations to maintain simple BIST logic. Thus, a total of 
ADDRESS DECODER FAULTS
According to the DPRAM model, each port has its own row decoder and column decoder. For fault-free decoder operation, each address in the address space should access one and only one memory cell which is not accessed by any other address. In addition, each cell should be [...] This implies that each port accesses a total of r distinct rows. In addition, one and only one row line is accessible by each valid address, and conversely, each address accesses exactly one row line. Moreover, the above mapping functions guarantee that each row of array memory cells is accessible from both ports by the same unique address, whether the accessing port is a or b. Similar definitions and mapping functions apply for the column address decoder as well.
There are six types of expected DPRAM decoder faults. These can be broadly classified into two main categories: single-port decoder faults (SPDF) and cross port decoder faults (CPDF).
a) Single-port Decoder Faults (SPDF): these include four types of faults typically found in SPRAMs:
1. A row (column) line is not accessed by any address.
2. A row (column) address that accesses no row (bit) line.
3. One address which accesses more than one row (column) line (one to many).
4. A row (column) line accessed by more than one address (many to one).
b) Cross Port Decoder Faults (CPDF): these result from interaction between the two access ports:
5. An address of one port accesses a row (column) line belonging to the other port.
6. Address mismatch faults, where the same address to both ports selects non-matching row (column) lines.
None of the single-port decoder faults can stand alone [1]; rather, a combination of such faults will exist together. For cross port decoder faults, only the last fault (address mismatch) can stand alone. Possible fault combinations are shown in Figure 14. Following the same notation used in [1], fault type A combines faults 1 & 2, [...]
Fault 5 maps a row address of one port into the row line belonging to the other port. This, however, does not mean that cells on the erroneously selected row are accessible from the first port. This is due to the fact that to access a cell, both its row and column lines should be selected by the same accessing port. In the case of fault 5, only the row line is erroneously accessed by the first port, while the column line is accessed by the second port. [...] but also into cell (i,k) accessed for reading by port a. Data read by port a will depend on the memory type. In the case of DRAMs, the data read will be some stuck-at value. For SRAMs, the data read will be noise and layout dependent. A similar argument holds for fault 5 on column decoders.
Figure 15 shows the pseudo-code for this test procedure. The notation Read-x-α(i,j) [Write-x-α(i,j)] is used to indicate a read (write) operation with expected (input) data x (x ∈ {0,1}) through port α (α ∈ {a,b}) from (to) the memory cell whose row and column addresses are i and j, respectively.
Decoder Test Procedures
The procedure initializes the memory cells of two distinct columns by the two ports to some background data (all 0's or all 1's). After initialization, the memory cells of both columns are scanned simultaneously by both ports in opposite row address directions, where the background data are verified and the complement data are written. This amounts to performing two independent march tests of a total length 5r on these two columns simultaneously. Thus, the test procedure fully detects fault types A, B, C, and D [1]. This procedure also detects fault types E, F, G, H and I, except for the case where fault 5 corresponds to some address x of one port mapping into row r-x-1 (or column q-x-1) on the other port. In addition, address mismatch faults are not detected by this procedure. Such faults will be detected by the second test procedure.
4.2.2 Row Decoder Test Procedure 2
In this procedure, the test is performed only on a single column which is accessed by both ports. This column is first initialized to some background data (all 0's). The column cells are then scanned in ascending order of row addresses, verifying their contents through read operations by both ports. Then, one port is used to write complementary data, which is then also verified by a read operation from both ports.
The test procedure verifies that data written to some row address by any port is also readable by both ports using the same address. For test regularity, this test procedure uses the double write operation used in the decoder-1 procedure. Thus, in addition to the test column (l), some other dummy column (m) is written into. This regularity leads to a simpler BIST logic implementation. The pseudo-code of this procedure is shown in Figure 16. In addition to detecting address mismatch faults, this procedure detects faults of type E, F, G, H and I which escape detection by the previous procedure.
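The way this second procedure exposes an address mismatch fault can be illustrated with a small behavioural simulation. This is a simplified sketch, not the pseudo-code of Figure 16: the function name is hypothetical, each row decoder is modelled as a list mapping an address to the physical row it selects, port a is assumed to be the writing port, and the dummy-column double write and the exact cycle count are omitted.

```python
def procedure2_detects_fault(map_a, map_b):
    """Simulate the single-column scan; return True if a miscompare occurs.

    map_a[i] / map_b[i] is the physical row selected by address i on
    port a / port b (a fault-free decoder is the identity mapping).
    """
    r = len(map_a)
    column = [0] * r                      # background data: all 0's
    for addr in range(r):                 # ascending row addresses
        # Both ports read and verify the background value.
        if column[map_a[addr]] != 0 or column[map_b[addr]] != 0:
            return True
        # Assumption: port a writes the complementary data.
        column[map_a[addr]] = 1
        # Both ports read back and verify the complement.
        if column[map_a[addr]] != 1 or column[map_b[addr]] != 1:
            return True
    return False


fault_free = [0, 1, 2, 3]
mismatch_b = [0, 1, 3, 2]   # port b selects non-matching rows for addresses 2 and 3
print(procedure2_detects_fault(fault_free, fault_free))
print(procedure2_detects_fault(fault_free, mismatch_b))
```

In the mismatch case, the data written through port a at address 2 is not seen by port b at the same address, so the read-back comparison fails.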
For the row decoder, the first procedure takes 5r cycles while the second procedure takes 6r cycles, for a total of 11r cycles. Similarly, a column decoder test of length 11q will be required, for a total of 11(r+q) cycles for both decoders, which is of order O(√n).
Since the array test algorithm requires (468r + q + 2) read and write cycles, the total number of read and write operations required for both the array and decoder tests is (479r + 12q + 2). For p = 1 and r = q = √n, this amounts to (491√n + 2) read/write cycles.
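The totals above can be checked with a few lines of arithmetic. The sketch below uses hypothetical function names and takes the array test coefficient to be 468, the value consistent with the stated totals of 479r + 12q + 2 and 491√n + 2; the sample size n = 2^20 is an illustrative assumption.

```python
import math


def array_test_cycles(r, q):
    """Array test length in read/write cycles (coefficient 468 as read from the totals)."""
    return 468 * r + q + 2


def decoder_test_cycles(r, q):
    """Row and column decoder tests: 5r + 6r for rows, and likewise 11q for columns."""
    return 11 * (r + q)


def total_test_cycles(r, q):
    """Combined array and decoder test length for one sub-array."""
    return array_test_cycles(r, q) + decoder_test_cycles(r, q)


n = 1 << 20                 # illustrative 1M-cell sub-array, p = 1
side = math.isqrt(n)        # r = q = sqrt(n)
print(total_test_cycles(side, side), 491 * side + 2)
```

For a square sub-array the two printed values agree, confirming that the total grows as O(√n).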
