Abstract-This paper presents a design strategy for efficient and comprehensive testing of embedded dynamic content addressable memory (DCAM) where the address, data, and hit lines are not externally controllable or observable. Based on the design for testability approach, three algorithms are developed for testing common functional faults in CAM's. The first algorithm provides a new method of detecting pattern-sensitive faults over a neighborhood size of nine and thereby tests a w-word CAM in 33w + 2b + 64 operations, where b is the number of bits in a word. The proposed algorithm is significantly more efficient than other embedded procedures for testing pattern-sensitive faults. The embedded CAM can be tested for pattern-sensitive faults by using an extra 2w + 25 transistors without the need for a signature analyzer. Two additional simple algorithms are given in the Appendix for testing embedded CAM's for stuck-at and adjacent-cell coupling faults.
I. INTRODUCTION
An increasing number of functional devices within VLSI chips are becoming inaccessible to external testers. As a result, new testable design strategies are necessary to grapple with the testing difficulties in embedded applications. Memory devices are extensively used in VLSI applications, and testing of embedded random access memory (RAM) devices has been studied by several researchers [1]-[4]. However, as is described in this paper, these proposals for testing embedded RAM's are not appropriate for content addressable memory (CAM). This paper examines the problem of testing CAM's in an embedded environment, where the address and control lines of the CAM are not directly controllable and the data and hit lines are not externally observable.
The paper demonstrates that a CAM can be tested in a far simpler manner than a RAM. The reading operations in a CAM can be replaced by one content addressable search operation, and instead of using a signature analyzer, as is typically proposed for built-in self-test of RAM's [1], [3], simple hardware can be used to monitor the match lines in a CAM. In contrast to the signature analyzer approach, simple testable designs are proposed in this paper which require at most 2w + 25 transistors in a w-word CAM of any arbitrary word size. The paper proposes three methodologies to test for stuck-at faults, adjacent cell coupling faults, and pattern-sensitive faults by using simple testable hardware. The contributions of the paper include: i) a simple testable design for CAM's, ii) algorithms for testing for three types of functional faults, and iii) an approach to testing CAM's utilizing a single associative search operation as opposed to the conventional w read or associative search operations [5], [6]. Section II of the paper introduces the dynamic CAM (DCAM) design, illustrating its different modes of operation. This is followed by the design for testability technique which can be used to test the embedded CAM's by three simple test procedures. The new algorithm for testing embedded CAM's is presented in Section III, followed by a built-in self-test (BIST) implementation of the algorithm in Section IV.
II. TESTABLE CAM DESIGN
A. CAM Overview
A typical organization for content addressable memory is shown in Fig. 1. A b-bit, w-word CAM is organized as an n = b × w two-dimensional array where all word lines (rows) and bit lines (columns) can be accessed randomly and in parallel. Each memory cell is connected to one pair of horizontal lines (comprising a row) and one pair of vertical lines (comprising a column). A memory cell is denoted by $C_{i,j}$ if it is located at the cross point of the ith bit line and the jth word line and is connected to the horizontal lines $W_j$, $V_j$ and to the vertical lines $D_i$, $E_i$. Lines $D_i$ and $E_i$ are the ith BIT and $\overline{\mathrm{BIT}}$ lines driven by the sense amplifiers. Line $W_j$ is the jth word line and is selected by the word line decoder, which decodes the content of the $\log_2 w$-bit memory address register (MAR). Line $V_j$ is derived by the control cell ($CC_j$), and its value depends on the logical value of $W_j$ and of the R/W line, called the read/write line.
The basic memory cell is shown in Fig. 2. In addition to the above modes, an internal masking of cells occurs if both the cells contain a 0 during a read operation [7]. Here, no current will flow through $D_i$ and $E_i$, and the reading will be masked.
The operation of writing a 0 or a 1 on a memory cell will be denoted by W0 and W1, respectively. The reading operation will be denoted by R. If the state of the memory cell does not change, the write operation is called a nontransition write. If instead the state of the cell changes due to writing a 1 (0), it is called a transition write 1 (0) and is denoted by ↑ (↓). In the search operation, all the cells in the selected word lines can be compared with a bit pattern in the search register. The content of the search register is denoted by a regular expression; e.g., if the content of the search register is (010101 ...), it is denoted by SR = (01)*.
B. Design for Testability
The testable design proposed here exploits the fact that test procedures often generate a test pattern which repeats after every p word lines, so that the memory can be checked by mutual comparison of every pth word line. Thus, the proposed technique replaces the w read/associative search operations by a single associative search. Giles and Hunter [5] have demonstrated how to modify the HIT LINE detector to augment the testability of stuck-at faults in a CAM. In this paper a similar approach has been adopted to test a large class of functional faults, and the odd word lines and the even word lines have been grouped into two different classes (i.e., p = 2). The HIT LINE detector in Fig. ... $T_0, \ldots, T_7$ are used for precharging by the precharge clock $\phi_P$.
The remainder of the paper demonstrates how, by using the testable hardware, pattern-sensitive faults in a CAM can be detected. Test procedures for simple stuck-at and bridging faults are given in the Appendix. The procedures in the Appendix are modifications of those proposed by Jain and Stroud [1] for testing embedded RAM's. In the following section a new algorithm is proposed for built-in self-test of the pattern-sensitive faults in DCAM's.
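The idea of replacing w reads by one associative search with a mutual comparison of word lines in the same class can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the names `search` and `error_flag` are hypothetical, `*` is taken to mask a bit position in the search register, and the ERROR signal is modeled as "match lines within one class (j mod p) disagree", which is one plausible reading of the comparison described above, not the circuit of the paper.

```python
from typing import List


def search(words: List[str], sr: str) -> List[bool]:
    """Associative search: return one match flag per word line.

    Assumption of this sketch: a '*' in the search register masks
    that bit position, so it matches either stored value.
    """
    return [all(s in ("*", c) for s, c in zip(sr, word)) for word in words]


def error_flag(match: List[bool], p: int = 2) -> bool:
    """Model of the mutual comparison over every p-th word line.

    Word lines with the same value of (j mod p) are expected to hold
    the same pattern, so their match lines should agree. ERROR is
    raised if any class contains both matching and mismatching lines.
    """
    for cls in range(p):
        group = match[cls::p]
        if any(group) and not all(group):
            return True
    return False


# Fault-free memory: the test pattern repeats every p = 2 word lines.
good = ["0101", "1010", "0101", "1010"]
# Faulty memory: one bit of word line 2 has flipped.
bad = ["0101", "1010", "0111", "1010"]

print(error_flag(search(good, "0101")))  # even lines all match, odd lines all mismatch
print(error_flag(search(bad, "0101")))   # even lines disagree
```

A single call to `search` followed by `error_flag` stands in for the w individual reads; note that in this model a fault that corrupts every word line of a class identically would go unnoticed by the comparison alone.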

III. PATTERN-SENSITIVE FAULTS IN EMBEDDED DYNAMIC CAM
A pattern-sensitive fault models the adjacency effect between a memory cell, called the base cell, and its physically neighboring cells. It is typically due to leakage effects between a memory cell and its adjoining cells in the presence of a particular data pattern in the neighboring cells [8], [9]. If a write operation is faulty due to the presence of a particular bit pattern in the neighboring cells, then the fault is called a static pattern-sensitive fault (SPSF). On the other hand, if the content of the base cell changes due to a WRITE operation in one of the neighboring cells while the other neighboring cells hold a fixed pattern of 0's and 1's, then the fault is called a dynamic pattern-sensitive fault (DPSF). Since any memory cell can be arbitrarily masked in a CAM, both SPSF's and DPSF's may occur. The problem of detecting SPSF's and DPSF's over a neighborhood of nine cells is addressed in this paper. Such a neighborhood is shown in Fig. 4 and is commonly called a 9-neighborhood. $C_{i,j}$ is the base cell. Cells $C_{i\pm 1,j}$ share the same word line (line j in Fig. 1) with the base cell and are called its word line neighbors. Cells $C_{i,j\pm 1}$ share the same bit line (line i in Fig. 1) with the base cell and are called its bit line neighbors.
In order to detect all the static pattern-sensitive faults in the memory, every cell in the memory should make both ↑ and ↓ transitions in the presence of all $2^8 = 256$ patterns in its neighborhood. Thus overall 512 transition writes are necessary for each cell in the memory to detect the SPSF's. In order to detect all the dynamic pattern-sensitive faults, each base cell should be tested by an ordinary read or associative read operation whenever a transition write is made over a cell in the neighborhood of the base cell.
Since each cell can make two types of transition writes for all the possible binary patterns in the other eight cells, and there are altogether eight neighbors of a base cell, there are altogether $2 \times 2^8 \times 8 = 4096$ transition writes in the neighborhood of a cell to test all the DPSF's. This requires a large amount of test time.
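The two counts quoted above can be checked by brute-force enumeration. The sketch below is illustrative only; the names `spsf_writes` and `dpsf_writes` are hypothetical, and each tuple simply labels one transition write by its direction and the pattern held in the other eight cells of the 9-neighborhood.

```python
from itertools import product

NEIGHBORS = 8  # cells surrounding the base cell in a 9-neighborhood
DIRECTIONS = ("up", "down")  # the two kinds of transition write

# SPSF: the base cell makes each transition under every neighbor pattern.
spsf_writes = [
    (direction, pattern)
    for direction in DIRECTIONS
    for pattern in product((0, 1), repeat=NEIGHBORS)
]

# DPSF: one of the eight neighbors makes each transition while the
# other eight cells (seven neighbors plus the base cell) hold a pattern.
dpsf_writes = [
    (neighbor, direction, pattern)
    for neighbor in range(NEIGHBORS)
    for direction in DIRECTIONS
    for pattern in product((0, 1), repeat=NEIGHBORS)
]

print(len(spsf_writes))  # 2 * 2**8
print(len(dpsf_writes))  # 2 * 2**8 * 8
```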
In this paper, an approach has been adopted in which the cells in a 9-neighborhood are categorized into four logical groups, viz., the base cell, bit line neighbors, word line neighbors, and diagonal neighbors. In practice the faults occur due to leakage between the cells within the neighborhood, and it has been found that the leakage is maximum when the symmetrically located cells contain the same bit patterns [8]. By the above classification, many unnecessary binary combinations are avoided. For example, suppose a read operation is made to verify the transition write ↑ when the bit line neighbors $C_{i,j+1}$ and $C_{i,j-1}$ contain 0. At first, bit line i will be precharged to some high potential. If any of the access transistors of the bit line neighbors is weak, then in the presence of 0's in the bit line neighborhood the precharge level on the bit line will be degraded, and the sense amplifier on the ith bit line will fail to detect a 1 in the base cell. Similarly, if there is a weak transistor in cell $C_{i+1,j}$ which does not allow the base cell to make a ↑ transition because $C_{i+1,j}$ is at state 0, then it is enough to test the fault when the symmetrically located cell $C_{i-1,j}$ is also at state 0, since then the leakage effect will be predominant. Thus if a fault does not occur when both cells $C_{i\pm 1,j}$ are at state 0, then it will not occur even when $C_{i+1,j}$ is at 0 (1) and $C_{i-1,j}$ is at 1 (0).
Considering the SPSF's and DPSF's associated with each cell in the memory, it can be easily seen that at most $4 \times 2^4 = 64$ transition writes on each cell will be necessary to sensitize all the SPSF's and DPSF's in a CAM. In this section, it will be shown that by suitably combining the transition write sequences over the neighborhood, the total number of transition writes on each cell can be reduced to 16.
In order to accomplish the minimal number of transition writes per memory cell, each cell $C_{i,j}$ in the memory is first assigned a number $k \in \{0, 1, 2, 3\}$ such that $k = 2(i \bmod 2) + (j \bmod 2)$. Thus, the memory cells are divided into four types of cells, 0, 1, 2, and 3, as shown in Fig. 5. It will be shown later that a transition write on a cell followed by a suitable sequence of read operations on the adjoining cells will sensitize four pattern-sensitive faults; thereby the number of transition writes on each cell can be reduced by a factor of 4.
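The numbering rule can be checked directly: under $k = 2(i \bmod 2) + (j \bmod 2)$, the two word line neighbors of any cell share one number, the two bit line neighbors share another, the four diagonal neighbors share a third, and none of them equals the number of the base cell. The following sketch verifies this; the function names are hypothetical and the grid orientation is not claimed to reproduce Fig. 5.

```python
def cell_type(i: int, j: int) -> int:
    """Number k in {0, 1, 2, 3} assigned to cell C(i, j)."""
    return 2 * (i % 2) + (j % 2)


def neighbor_types(i: int, j: int) -> dict:
    """Cell numbers found in each logical group of the 9-neighborhood."""
    return {
        "base": {cell_type(i, j)},
        # Same word line j, adjacent bit lines.
        "word_line": {cell_type(i - 1, j), cell_type(i + 1, j)},
        # Same bit line i, adjacent word lines.
        "bit_line": {cell_type(i, j - 1), cell_type(i, j + 1)},
        "diagonal": {
            cell_type(i + di, j + dj) for di in (-1, 1) for dj in (-1, 1)
        },
    }


# Print the numbering of a small 4 x 4 corner of the array.
for j in range(4):
    print(" ".join(str(cell_type(i, j)) for i in range(4)))

print(neighbor_types(2, 3))
```

Because each group carries exactly one number, a write applied to all cells numbered k acts on one logical group of every neighborhood at once, which is what allows the nine physical cells to be treated as a 4-tuple.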
In order to obtain a test procedure which needs only 16 transition write sequences per cell, a graph theoretic approach will be used. The 4-tuple in Table III describes a state space of 16 nodes (states) where each node represents the binary pattern in the 9-neighborhood. These nodes are numbered 0 to 15, depending on the binary pattern in the 4-tuple. The base cell state is the MSB in the 4-tuple. By complementing the pth bit in the 4-tuple, the memory state of the neighborhood will change from k to $k + 2^p$ if the pth bit changes from 0 to 1, and from k to $k - 2^p$ if the pth bit changes from 1 to 0. All these transitions from one state to another represent edges in the state space graph, and the graph describes a four-dimensional cube as shown in Fig. 6. Each undirected edge in Fig. 6 represents two antiparallel directed edges, and each directed edge corresponds to a transition write over all the cells having the same number, $k \in \{0, 1, 2, 3\}$, in a neighborhood. The set of thick edges corresponds to changing the state of the base cell and pertains to sensitizing the SPSF's. The other edges pertain to sensitizing the DPSF's. A minimal sequence of writes which traverses all the directed edges in the 4-cube constitutes an Eulerian tour. An Eulerian tour over an n-cube can be derived in several ways, such as by the recursive technique [10], [11] or from the reflected Gray code [12]. In fact, the technique of obtaining an Eulerian tour from the reflected Gray code can be extended to any arbitrary Gray code to generate any arbitrary Eulerian path. It can be easily seen that the sequence of operations shown in Table III represents an Eulerian tour over the 4-cube. Algorithm 1 uses these write sequences to test the SPSF's and DPSF's for every cell in the memory over its 9-neighborhood.
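As an illustration of the graph argument, the sketch below builds one Eulerian tour over the directed 4-cube with Hierholzer's algorithm. This is a generic construction chosen for brevity, not the recursive or Gray code technique cited above, and the tour it produces is not claimed to match the sequence of Table III; the names `eulerian_tour` and `writes_on_bit` are hypothetical.

```python
def eulerian_tour(n: int = 4, start: int = 0) -> list:
    """Return an Eulerian tour of the directed n-cube as a node list.

    Every node has n outgoing and n incoming edges (one per bit flip),
    so a closed tour through all n * 2**n directed edges exists.
    Hierholzer's algorithm: walk unused edges until stuck, then back up.
    """
    unused = {v: [v ^ (1 << p) for p in range(n)] for v in range(1 << n)}
    stack, tour = [start], []
    while stack:
        v = stack[-1]
        if unused[v]:
            stack.append(unused[v].pop())
        else:
            tour.append(stack.pop())
    return tour[::-1]


def writes_on_bit(tour: list, p: int):
    """Patterns of the other bits seen at each 0->1 and 1->0 flip of bit p."""
    ups, downs = set(), set()
    for a, b in zip(tour, tour[1:]):
        if a ^ b == 1 << p:
            others = a & ~(1 << p)
            (ups if b > a else downs).add(others)
    return ups, downs


tour = eulerian_tour()
edges = list(zip(tour, tour[1:]))
print(len(edges), "transition writes, returning to state", tour[-1])
ups, downs = writes_on_bit(tour, 3)  # bit 3 is the MSB, i.e., the base cell
print(len(ups), "upward and", len(downs), "downward base-cell transitions")
```

Each group of cells therefore makes 8 upward and 8 downward transitions along the tour, one for every pattern in the other three groups, which is the count of 16 transition writes per cell stated above.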
Theorem 1: Algorithm 1 detects all modeled static pattern-sensitive faults (SPSF) and dynamic pattern-sensitive faults (DPSF) in a memory array. It also detects all stuck-at faults in the comparator and error detector logic.
Proof: From the cell assignments, the neighborhood relationships between cells having different numbers are shown in Table IV. In Algorithm 1, all cells numbered $k \in \{0, 1, 2, 3\}$ make an upward (↑) and a downward (↓) transition write in the presence of all binary patterns in all other cells whose numbers are not k. In Table III there are altogether eight upward transitions in cells numbered k for eight distinct binary patterns. Also, there are eight downward transitions in cells k for all eight distinct binary patterns. Hence all 16 operations to sensitize SPSF's are performed in Algorithm 1. Since an associative search is done after each transition write, if any SPSF occurs, current will flow through the WORD (MATCH) line, which in turn will be detected by the parallel comparator and error latch. Also, because of the neighborhood relationship in Table IV, every transition write on cells numbered k will also sensitize the DPSF's for the other three cells for which the number is not k. For example, in operation #5 of Table III, the state of the cells numbered 0 changes from 0 to 1 while the contents of the cells numbered 1, 2, and 3 remain at 1, 1, and 0, respectively. The succeeding associative search operation in line (8) of Algorithm 1 detects an SPSF in all neighborhoods for which the base cell is 0, and DPSF's in all other neighborhoods for which the base cell is not 0 and the fault occurs due to the transition in the cell numbered 0. The effect of operation #5 on different neighborhoods is shown in Fig. 7. Thus Algorithm 1, which writes all the patterns in Table III over the entire memory, will sensitize both the SPSF's and DPSF's for every cell in the memory.
Since at lines (8) and (10) of Algorithm 1 the memory cells are tested by observing and comparing the MATCH lines of different memory cells, the comparator logic of each memory cell will be tested simultaneously. Since Algorithm 1 tests all the patterns of Table III, the comparator logic will be tested correctly for matching operations if there is no SPSF or DPSF in any memory cell. But a mismatch in associative search operations cannot be detected by lines (8) and (10). Since the MATCH line is a wired OR of different memory cells on a single word line, each memory cell in a word line will be tested independently. In lines (2) through (5), the memory is tested with a marching pattern of 1's when all the cells of the memory are initialized to 0. Since the pattern is different in only one bit position for every word in the memory, it will sensitize faults in the comparator logic for mismatch operations, and in every word line there will be a mismatch. Similarly, lines (13) through (16) test the mismatch in an associative search for each memory cell.
Proof: w operations are needed to initialize the memory in line (1). Lines (2) through (5) and (13) through (16) test in 2b operations the mismatch in an associative search for each memory cell. There are altogether 64 patterns in Table III, and in each pattern there is only one transition write on a cell. In lines (7) and (9), w/2 word lines are written with each pattern in Table III. In lines (8) and (10), a single associative search is made to test the contents of the CAM after each pattern in Table III is written on the relevant word lines. Thus altogether 32w + 64 operations are made to write and check all the 64 patterns in Table III.
Hence, the overall complexity of Algorithm 1 is 33w + 2b + 64. □
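The operation count in the proof can be tallied term by term. The sketch below is a plain restatement of that arithmetic; the function name `algorithm1_operations` and the example sizes (w = 256, b = 32) are assumptions chosen only for illustration, and w is taken to be even so that w/2 is exact.

```python
def algorithm1_operations(w: int, b: int, patterns: int = 64) -> dict:
    """Break down the operation count of Algorithm 1 as in the proof.

    w: number of word lines (assumed even), b: bits per word,
    patterns: number of transition-write patterns (64 in Table III).
    """
    counts = {
        "initialize": w,                    # line (1)
        "mismatch_tests": 2 * b,            # lines (2)-(5) and (13)-(16)
        "pattern_writes": patterns * (w // 2),  # lines (7) and (9)
        "searches": patterns,               # lines (8) and (10)
    }
    counts["total"] = sum(counts.values())
    return counts


ops = algorithm1_operations(w=256, b=32)
for name, value in ops.items():
    print(f"{name:>15}: {value}")
```

For any even w the total equals 33w + 2b + 64; with the example sizes the write term (32w) dominates the count.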
But for relatively large size CAM's, this will require an increased write cycle time and will increase power consumption.
It may be noted that the testable hardware in Fig. 3 is only partially tested by Algorithm 1. When all the even and odd lines match, the n-type transistors $Q_0, \ldots, Q_{2m}$ will be tested for stuck-at 1 faults. When all the even and odd lines mismatch, all the p-type transistors $P_0, \ldots, P_{2m}$ will be tested for stuck-at 0 faults. The rest of the stuck-at faults in the testable hardware can be tested by the following procedure.
Test Procedure for Additional Hardware
(1) Initialize the memory with all 0's.
(2) Set (SR) = (00)* and check if ERROR = 0.
In this section only one algorithm is presented to detect the symmetric pattern-sensitive faults. The rationale for symmetric pattern-sensitive faults is to utilize the knowledge of the leakage currents in the CAM cell and to transform the neighborhood consisting of nine physically adjacent cells into a logical neighborhood of size 4. However, these leakage currents are unlikely to occur in a static CAM design and, therefore, a slightly different algorithm can be used to test the pattern-sensitive faults in a static CAM. In [13] an alternative testing algorithm has been designed to test the PSF's in a static CAM. The PSF's in [13] are different from those discussed in this paper in the sense that instead of testing the transition writes in the presence of all possible patterns in the neighborhood, they test whether a 0 and a 1 can be stored and held in the base cell in the presence of all valid patterns in the neighborhood. The algorithm tests a w-word CAM in 33w + 2b + 160 steps and detects PSF's over a physical neighborhood size of 25 cells [14], [15]. This algorithm can be readily used to test the DCAM discussed in this paper. By modifying the address decoder such that all word lines having the same value of j mod 5 (where j is the address of a word line) are written in parallel, the complexity can be improved to 325 + 2b operations, where a constant 325 operations are needed to test the pattern-sensitive faults and 2b operations are used to test the comparator logic in the b-bit-wide CAM design.
In addition, two conventional commercial test procedures for reduced fault models can be adapted (see the Appendix) to the testable CAM without the need for signature analyzers, which are commonly used for embedded applications. The overhead (i.e., the additional test logic with the associated routing) involved in the conventional techniques utilizing a signature analyzer in a dynamic RAM has been found to be between 3 percent and 5 percent of the total chip area [1]. The proposed technique needs only 2w + 25 extra transistors in a w-word CAM; thereby the overhead is typically less than 1 percent for a 32-bit-wide w-word CAM. It may be noted that the presence of testable hardware obviates the need for a separate HIT LINE detector, and thereby the overhead is considerably reduced.
IV. BIST IMPLEMENTATION OF ALGORITHM 1
The operations in Table III can be generated within the chip by using the circuit shown in Fig. 8. Four flip-flops in the circuit generate part of the operations in Table III. The remainder of the operations in Table III are obtained from these four flip-flops by selecting the suitable lines (complemented or noncomplemented outputs of the flip-flops) and by reordering them. This is achieved by the set of multiplexers shown in Fig. 8. Nontransition writes are disallowed by disabling the corresponding sense amplifiers, and the associated bit lines are set to high impedance. In order to write on the even or odd word lines, a ($\log_2 w - 1$)-bit synchronous counter is used, and it is connected to the address lines of the word line decoder. The LSB of the address line is connected to 0 if the even word lines are to be accessed, or to 1 if the odd word lines are to be selected.
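The even/odd addressing scheme can be modeled behaviorally: the counter supplies the upper address bits and the LSB is tied to a constant. The sketch below is an illustrative software model only, not the circuit of Fig. 8; the function name `word_line_addresses` is hypothetical and w is assumed to be a power of two.

```python
from typing import Iterator


def word_line_addresses(w: int, parity: int) -> Iterator[int]:
    """Yield the word line addresses visited for one parity class.

    A (log2(w) - 1)-bit counter drives the upper address bits, and the
    address LSB is tied to `parity` (0 for even lines, 1 for odd lines).
    Assumes w is a power of two and at least 2.
    """
    if w < 2 or w & (w - 1):
        raise ValueError("w must be a power of two and at least 2")
    if parity not in (0, 1):
        raise ValueError("parity must be 0 or 1")
    counter_bits = w.bit_length() - 2  # log2(w) - 1
    for count in range(1 << counter_bits):
        yield (count << 1) | parity


print(list(word_line_addresses(8, 0)))  # even word lines
print(list(word_line_addresses(8, 1)))  # odd word lines
```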
The BIST generator circuit can be further simplified by modifying Algorithm 1 slightly as described here. In Algorithm 1, the sequence of transition writes was generated by describing an Eulerian tour over the symmetric 4-cube. The hardware overhead in Fig. 8 can be reduced by more than a factor of 2 by decomposing the Eulerian tour into eight disjoint Hamiltonian cyclic tours: (0, 1, 9, 13, 15, 14, 6, 2, 0), (0, 2, 6, 14, 15, 13, 9, 1, 0), (0, 8, 9, 11, 15, 7, 6, 4, 0), (0, 4, 6, 7, 15, 11, 9, 8, 0), (12, 4, 5, 7, 3, 11, 10, 8, 12), (12, 8, 10, 11, 3, 7, 5, 4, 12), (12, 13, 5, 1, 3, 2, 10, 14, 12), and (12, 14, 10, 2, 3, 1, 5, 13, 12). The Hamiltonian cycles were described such that all the directed edges of the 4-cube are traversed, and in this way all the 16 SPSF's and 48 DPSF's are sensitized. Initially, the memory is initialized to zero and the first four Hamiltonian cycles are performed as indicated above. Then the memory is reinitialized such that it contains a column bar pattern of 0's and 1's. This needs an additional w write operations, which for a 1K-byte memory having a memory cycle time of 50 ns will take less than an extra 50 µs. But the economy in hardware is considerable [14] and is shown in Table V.
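The claim that the eight cycles jointly traverse the 4-cube can be checked mechanically. The sketch below takes the cycles as listed above (reading the garbled "4, 6 7" in the fourth cycle as 4, 6, 7, which is an assumption) and verifies that every step flips exactly one bit and that no directed edge is repeated; the variable names are hypothetical.

```python
CYCLES = [
    (0, 1, 9, 13, 15, 14, 6, 2, 0),
    (0, 2, 6, 14, 15, 13, 9, 1, 0),
    (0, 8, 9, 11, 15, 7, 6, 4, 0),
    (0, 4, 6, 7, 15, 11, 9, 8, 0),
    (12, 4, 5, 7, 3, 11, 10, 8, 12),
    (12, 8, 10, 11, 3, 7, 5, 4, 12),
    (12, 13, 5, 1, 3, 2, 10, 14, 12),
    (12, 14, 10, 2, 3, 1, 5, 13, 12),
]

# Collect every directed edge (state before write, state after write).
edges = [(a, b) for cycle in CYCLES for a, b in zip(cycle, cycle[1:])]

# Each step must complement exactly one bit of the 4-tuple.
single_bit = all(bin(a ^ b).count("1") == 1 for a, b in edges)

# The base cell is the MSB (weight 8): its flips sensitize SPSF's,
# flips of the other three bits sensitize DPSF's.
spsf_edges = [e for e in edges if e[0] ^ e[1] == 8]
dpsf_edges = [e for e in edges if e[0] ^ e[1] != 8]

print("single-bit steps only:", single_bit)
print("distinct directed edges:", len(set(edges)), "of", len(edges))
print("SPSF edges:", len(spsf_edges), "DPSF edges:", len(dpsf_edges))
```

The first four cycles start and end at state 0 and the last four at state 12, which is consistent with initializing the memory to all 0's first and reinitializing it before the remaining cycles.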
V. FINAL REMARKS
This paper proposes a testing strategy for embedded DCAM's. The strategy uses minimal extra hardware and can fit within the intercell pitch width of a DCAM. Two conventional commercial test procedures are easily adapted (Appendix) to the testable CAM without the need for the signature analyzers which are commonly used for embedded applications. Another advantage of the proposed strategy is that all the memory cells are associatively compared and tested by a single operation. This is in contrast to the w read operations typically required to test the content of a w-word CAM. Algorithm 1 is a test procedure which detects SPSF's and DPSF's over a 9-neighborhood. On the basis of physical reasoning and practical observations, the nine cells in a neighborhood have been grouped into four classes, and Algorithm 1 tests the SPSF's and DPSF's in only 33w + 2b + 64 operations, or in 97 + 2b operations if even and odd word lines are simultaneously accessed. Algorithm 3 takes 4w + 2b + 4 operations to test the stuck-at faults and the adjacent-cell bridging faults.
