One-transistor-cell 4-valued universal-literal CAM for cellular logic image processing
Author: Takahiro Hanyu (羽生 貴弘)
Journal or publication title: Proceedings of the 27th International Symposium on Multiple-Valued Logic, 1997
Page range: 175-180
Year: 1997
URL: http://hdl.handle.net/10097/46903
doi: 10.1109/ISMVL.1997.601393
One-Transistor-Cell 4-Valued Universal-Literal CAM 
for Cellular Logic Image Processing 
Takahiro Hanyu, Manabu Arakaki and Michitaka Kameyama 
Department of Computer and Mathematical Sciences 
Graduate School of Information Sciences
Tohoku University
Aoba, Aramaki, Aoba-ku, Sendai 980-77, Japan
hanyu@kameyama.ecei.tohoku.ac.jp 
Abstract 
A non-volatile 4-valued content-addressable memory (CAM) is proposed
for fully parallel template-matching operations in real-time cellular
logic image processing with fixed templates. A universal literal in
each CAM cell is used to compare a 4-valued input pixel with a 4-valued
template pattern. Every CAM cell function is performed by a pair of a
simple threshold operation and a logic-value conversion which is shared
by the CAM cells in the same column of a CAM cellular array. Moreover,
the use of a single floating-gate MOS transistor makes it possible to
implement a universal-literal circuit together with a 4-valued storage
element. As a result, a high-density 4-valued universal-literal CAM
with a single-transistor cell is designed by using a multi-layer
interconnection technology. Its performance is much superior to that of
conventional CAM-based implementations under the same dynamic power
dissipation.
1. Introduction 
In intelligent robot systems and real-time instrumentation and control
systems, it is important to perform real-time cellular logic image
processing, which requires highly parallel template-matching operations
with many templates [1],[2]. Some hardware accelerators have already
been developed for highly parallel cellular logic image processing [3].
However, the variety of operations in cellular logic image processing
makes their circuits complicated, which limits their application to
images of small size.
On the other hand, it is well known that CAMs are suitable as hardware
accelerators for highly parallel processing with single-instruction
multiple-data streams [4]. From this point of view, multiple-valued
(MV) universal-literal CAMs have been proposed for highly parallel
template-matching operations [5],[6]. These MVCAMs are useful as
hardware accelerators in high-speed non-numeric processing systems,
such as a cellular logic image processing system with many templates,
because of their compact implementation based on multiple-valued logic.
However, for next-generation real-world applications such as vision
systems, it is important to develop a much higher-density
universal-literal MVCAM.
This paper presents a new non-volatile 4-valued CAM with a
single-transistor CAM cell for fully parallel cellular logic image
processing. Since gray-level or colored pixel values are directly
represented by MV data in cellular logic image processing, a
template-matching operation with an N x N window is performed by
multiple-valued logic operations, that is, 'universal literals' [7],[8].
A universal literal is described by a combination of 2 window literals,
so it requires 4 threshold operations, which make a universal-literal
circuit complicated. In this paper, a universal literal is represented
by a down literal with a permutation of the 4-valued input signal,
called a 'logic-value conversion' (LVC). Since a down literal is
performed by a threshold operation with a single threshold, the
proposed CAM cell function becomes simple.
Moreover, a threshold-operation circuit is easily designed with only a
floating-gate MOS transistor whose threshold voltage is programmable by
controlling the charge on the floating gate [9]. Consequently, the
resulting universal-literal CAM cell circuit can be designed with only
a single floating-gate MOS transistor whose threshold voltage level
corresponds to a stored value of a 4-valued template.
0-8186-7910-7/97 $10.00 © 1997 IEEE
Authorized licensed use limited to: TOHOKU UNIVERSITY. Downloaded on February 18,2010 at 20:26:50 EST from IEEE Xplore.  Restrictions apply. 
Figure 1. Structure of a CAM-based image processor.
In fact, the cell area and the access time of the proposed 4-valued
1-Mb universal-literal CAM are about 47 µm² and 11.8 ns/word,
respectively, with a power dissipation of 4.7 mW at a supply voltage of
5 V. This performance is much superior to that of other CAM-based
implementations.
2. Basic 4-valued universal-literal CAM organization
In this section, we give an overview of a high-speed cellular logic
image processor with fully parallel template-matching capability.
2.1. Overall universal-literal CAM structure
In cellular logic image processing, a digital image is uniformly
sampled and quantized to several levels. Let the set of discrete
quantities in 4-valued images be L = {0,1,2,3}. As shown in Figure 1,
the proposed cellular logic image processing system operates in a
pipelined manner, where a universal-literal 4-valued CAM is used as a
hardware accelerator for parallel template-matching operations.
2-dimensional input image data are transformed into serial data
according to line scanning. The 9 pixels corresponding to a 3 x 3
window are picked up simultaneously from the line-scanned image data
and are entered into the CAM cellular array for parallel
template-matching operations.
Figure 2. Realization of a universal literal.
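
The window extraction described above can be sketched in software as follows. This is only an illustrative model, not the hardware pipeline of Figure 1: the function name `windows_3x3`, the list-of-rows image representation, and the choice to skip border pixels (no padding) are assumptions made for the example.

```python
from typing import Iterator, List, Tuple

Pixel = int  # a 4-valued pixel, one of 0, 1, 2, 3


def windows_3x3(image: List[List[Pixel]]) -> Iterator[Tuple[Pixel, ...]]:
    """Yield the 9 pixels of every 3 x 3 window in line-scan order.

    Border pixels without a full neighbourhood are skipped (an assumption).
    """
    rows, cols = len(image), len(image[0])
    for r in range(rows - 2):
        for c in range(cols - 2):
            # Flatten the window row by row into a 9-pixel vector.
            yield tuple(image[r + dr][c + dc]
                        for dr in range(3) for dc in range(3))


image = [
    [0, 1, 2, 3],
    [3, 2, 1, 0],
    [0, 0, 3, 3],
]
windows = list(windows_3x3(image))
```

Each yielded 9-tuple plays the role of the pixel vector that is presented to every word of the CAM cellular array at once.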
A simple near-neighbor operation in cellular logic image processing is
generalized by a template-matching operation between a window of an
input image and templates with the same window size. Since an enormous
number of templates is required in some image processing tasks, several
templates are compressed by using a simplification technique for MV
logic functions [10]. Using these compressed templates, the number of
template-matching operations can be greatly reduced. Universal literals
are used to perform template-matching operations with MV compressed
templates.
In the following discussion, we describe the formulation of 4-valued
template-matching operations based on universal literals.
2.2. 4-valued universal literal for MV pattern matching
A universal literal is one of the basic components for MV
pattern-matching operations with compressed templates and is defined as
$$X^{a} = \begin{cases} 3 & \text{if } X \in a \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$
where $a \subseteq L$.
A universal literal can be expressed by a combination of 2 window
functions. For example, Figure 2 shows a 4-valued universal literal
$X^{\{1,3\}}$. If the input stream $X = (0,1,2,3)$ of $X^{\{1,3\}}$ is
permuted by $(1,3,0,2)$, the resulting logic function can be
represented by a simple threshold function $D_1(X)$, where a threshold
function $D_a(X)$ is defined as
$$D_a(X) = \begin{cases} 3 & \text{if } X \le a \\ 0 & \text{otherwise} \end{cases} \qquad (2)$$
where $a \in L$. The permutation of an input stream is called
'logic-value conversion' (LVC).
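
As a worked check of this example (a sketch that assumes the threshold function outputs 3 when its argument is less than or equal to the threshold, and 0 otherwise), reading the outputs of the literal for the value set {1,3} in the permuted input order (1,3,0,2) gives the same output sequence that the threshold function with threshold 1 produces on (0,1,2,3):

```latex
\begin{array}{l|cccc}
X \text{ in permuted order} & 1 & 3 & 0 & 2 \\
X^{\{1,3\}}                 & 3 & 3 & 0 & 0 \\ \hline
Y                           & 0 & 1 & 2 & 3 \\
D_1(Y)                      & 3 & 3 & 0 & 0
\end{array}
```

Equivalently, each input value is relabelled by its position in the permuted order (0 to 2, 1 to 0, 2 to 3, 3 to 1) before the single threshold is applied.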
Table 1. Synthesis of 16 logic functions for a universal literal.
Threshold functions | LVC | Threshold functions | LVC
Let $(0,1,2,3)$ be an input string of a 4-valued LVC $f$, and let
$(p_0,p_1,p_2,p_3)$ be the output string of $f$ corresponding to that
input string. The LVC $f$ is defined as
$$f = \langle p_0, p_1, p_2, p_3 \rangle \qquad (3)$$
where $p_i \in L$ $(0 \le i \le 3)$. To realize a 4-valued universal
literal, we must prepare 6 LVCs, $f_1$, $f_2$, $f_3$, $f_4$, $f_5$ and
$f_6$, which are defined as
$$\begin{aligned}
f_1 &= \langle 3,2,1,0 \rangle, \\
f_2 &= \langle 0,3,2,1 \rangle, \\
f_3 &= \langle 1,0,3,2 \rangle, \\
f_4 &= \langle 2,1,0,3 \rangle, \\
f_5 &= \langle 2,0,3,1 \rangle, \\
f_6 &= \langle 1,3,0,2 \rangle.
\end{aligned} \qquad (4)$$
Table 1 shows 16 pairs of an LVC and a threshold function, which
correspond to the 16 kinds of functions generated by a 4-valued
universal literal. As shown in this table, it is clear that a 4-valued
universal literal is realized by a pair of an LVC and a threshold
function.
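
The pairing of an LVC with a threshold function can be explored with a short enumeration. The sketch below is not the authors' synthesis procedure: it assumes the threshold function outputs 3 when its argument is at most the threshold, uses the six LVCs as listed in Eq. (4), and the helper names (`down_literal`, `realized_set`, `synthesize`) are made up for illustration. It covers only the non-empty value sets, since a threshold taken from L always accepts at least one input.

```python
from itertools import product
from typing import Dict, FrozenSet, Tuple

L = (0, 1, 2, 3)

# The 6 LVCs of Eq. (4); LVCS[name][x] is the converted value of input x.
LVCS: Dict[str, Tuple[int, int, int, int]] = {
    "f1": (3, 2, 1, 0),
    "f2": (0, 3, 2, 1),
    "f3": (1, 0, 3, 2),
    "f4": (2, 1, 0, 3),
    "f5": (2, 0, 3, 1),
    "f6": (1, 3, 0, 2),
}


def down_literal(a: int, x: int) -> int:
    """Threshold function D_a(x): 3 if x <= a, else 0 (assumed reading)."""
    return 3 if x <= a else 0


def realized_set(lvc: str, a: int) -> FrozenSet[int]:
    """Set of inputs x for which D_a(f(x)) equals 3."""
    return frozenset(x for x in L if down_literal(a, LVCS[lvc][x]) == 3)


def synthesize() -> Dict[FrozenSet[int], Tuple[str, int]]:
    """Map each reachable value set to the first (LVC, threshold) pair found."""
    table: Dict[FrozenSet[int], Tuple[str, int]] = {}
    for a, lvc in product(L, LVCS):
        table.setdefault(realized_set(lvc, a), (lvc, a))
    return table


table = synthesize()
```

Looking up `table[frozenset({1, 3})]` returns the pair that realizes the example literal of Figure 2 under these assumptions. Several pairs can realize the same value set; the sketch simply keeps the first one it meets.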
3. Universal-Literal CAM with One-Transistor-Cell Structure
In this section, we discuss the design of a simple MVCAM cell circuit
using a floating-gate MOS transistor, and the overall structure based
on the proposed MVCAM cellular array.
3.1. One-transistor CAM cell circuit 
According to Section 2, a universal literal can be represented by a
combination of a single threshold operation and an LVC. Figure 3 shows
the circuit design and layout of a 4-valued universal-literal CAM cell
circuit with only a single floating-gate MOS transistor. One of the 6
LVC output signals generated from an input signal is selected by mask
programming in each CAM cell. A universal literal can be realized by a
threshold operation on the selected LVC output signal in each CAM cell.
Since the threshold voltage of a floating-gate MOS transistor can be
programmed by controlling the charge on its floating gate, a threshold
operation and one-digit 4-valued storage can be performed
simultaneously by a single floating-gate MOS transistor.
Figure 3. CAM cell circuit.
Tables 2 and 3 show the relationship among the 4-valued input logical
values, the 4-valued threshold values and their corresponding voltage
levels. Since the 6 LVCs are shared by every CAM cell in the same
column of a CAM cellular array, each CAM cell only needs to perform a
single threshold operation with mask programming of the LVCs. As a
result, the resulting CAM cell can be designed with only a single
floating-gate MOS transistor.
For example, Figure 3 shows the design of a universal literal
$X^{\{1,3\}}$, which is constructed from the LVC $f_5$ and the
threshold function $D_1(X)$ as shown in Figure 2.
Table 2. Relationship between logical values and voltage levels.
Input logical values | 0 | 1 | 2 | 3
Voltage levels (V) | 0.0 | 1.0 | 2.0 | 3.0
Table 3. Relationship between threshold values and voltage levels.
Threshold values | 0 | 1 | 2 | 3
Voltage levels (V) | 0.5 | 1.5 | 2.5 | 3.5
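
The voltage levels of Tables 2 and 3 suggest a simple behavioural model of one cell. The sketch below assumes an idealised transistor that turns on exactly when its gate voltage exceeds the programmed threshold voltage; the function names are illustrative and not taken from the paper.

```python
INPUT_VOLTAGE = {0: 0.0, 1: 1.0, 2: 2.0, 3: 3.0}      # Table 2
THRESHOLD_VOLTAGE = {0: 0.5, 1: 1.5, 2: 2.5, 3: 3.5}  # Table 3


def cell_conducts(lvc_output: int, stored_value: int) -> bool:
    """True if the floating-gate transistor turns on (idealised switch)."""
    gate_v = INPUT_VOLTAGE[lvc_output]
    threshold_v = THRESHOLD_VOLTAGE[stored_value]
    return gate_v > threshold_v


def cell_matches(lvc_output: int, stored_value: int) -> bool:
    """The cell reports a match when the transistor stays off."""
    return not cell_conducts(lvc_output, stored_value)
```

Because each threshold voltage sits half a volt above the corresponding input level, the cell stays off exactly when the LVC output value is less than or equal to the stored threshold value, which is the single-threshold comparison the cell is meant to perform.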
3.2. One-word CAM structure 
The use of the wired-AND technique makes it possible to generate the
product of the 9 universal literals corresponding to a 3 x 3 template
without additional transistors. Figure 4 shows the design of a one-word
CAM circuit based on 9 floating-gate MOS transistors. First, the match
line of the one-word CAM circuit is precharged to a high level
($V_{DD}$) through an NMOS pass transistor. If the input pixel pattern
matches the template pattern, all 9 floating-gate MOS transistors are
turned off. As a result, the match line remains at a high level.
Otherwise, at least one floating-gate MOS transistor is turned on,
pulling the match line down to a low level.
Consequently, the voltage level $V_{ML}$ of the match line represents
the relationship between a 9-pixel input vector
$X = (x_1, x_2, \ldots, x_9)$ and a template-pattern vector
$P = (p_1, p_2, \ldots, p_9)$, which is written as
$$X = P \quad \text{if } V_{ML} \text{ is high.} \qquad (5)$$
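
The wired-AND behaviour of one word can be modelled in a few lines. This is a behavioural sketch only: it assumes that a cell's transistor turns on when the LVC output value exceeds the stored threshold value, and the template, the `DONT_CARE` cell and the function name `match_line_high` are hypothetical.

```python
from typing import Dict, Sequence, Tuple

# LVCs as listed in Eq. (4); LVCS[name][x] is the converted value of input x.
LVCS: Dict[str, Tuple[int, int, int, int]] = {
    "f1": (3, 2, 1, 0), "f2": (0, 3, 2, 1), "f3": (1, 0, 3, 2),
    "f4": (2, 1, 0, 3), "f5": (2, 0, 3, 1), "f6": (1, 3, 0, 2),
}

Cell = Tuple[str, int]  # (mask-programmed LVC name, stored threshold value)


def match_line_high(pixels: Sequence[int], word: Sequence[Cell]) -> bool:
    """Model the precharged match line of one 9-cell word (wired AND)."""
    for x, (lvc, a) in zip(pixels, word):
        if LVCS[lvc][x] > a:   # this transistor turns on ...
            return False       # ... and pulls the match line low
    return True                # all transistors off: the line stays high


# Hypothetical template: the centre pixel must be 1 or 3, the rest may be anything.
DONT_CARE: Cell = ("f1", 3)  # threshold 3 keeps the transistor off for every input
word = [DONT_CARE] * 4 + [("f5", 1)] + [DONT_CARE] * 4
```

A single conducting cell is enough to discharge the line, so the match line computes the product of the 9 per-cell results without any extra logic.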
Figure 4. One-word CAM circuit.
3.3. LVC circuit
Figure 5 shows a circuit diagram that produces the 6 LVC output
signals. The NMOS transistors M1-M8 in the decoder are used as the
components of binary inverters and NOR circuits with different
threshold voltages realized by multiple ion implants, where the
threshold voltages of M1, M3, M4, M6 and M7 are 0.0 V, 1.5 V, 1.5 V,
3.5 V and 3.5 V, respectively, and the other transistors have the same
threshold voltage of 1.0 V as a conventional enhancement-mode
transistor. The transistors M9-M14 are depletion-mode NMOS transistors.
Figure 5. Logic-value conversion circuit.
A pair of 4-rail one-hot binary signals corresponding to the 4-valued
input signal is generated by the decoder. These control signals drive
the gates of pass transistors whose inputs are directly connected to
one of four different supply voltages. The 6 kinds of LVCs are realized
by programming the input signals to the pass transistors.
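
The decode-then-select structure described here can be illustrated with a small behavioural model. It is a sketch under assumptions: the input is given as a logic value rather than a voltage, the four supply voltages are taken to be the levels of Table 2 (the text does not state them), and the function names are invented for the example.

```python
from typing import Tuple

# Assumed supply voltages for logic levels 0..3 (taken from Table 2).
SUPPLY_VOLTAGE = (0.0, 1.0, 2.0, 3.0)


def decode_one_hot(x: int) -> Tuple[int, ...]:
    """Decoder model: rail i is 1 exactly when the input value equals i."""
    return tuple(1 if x == i else 0 for i in range(4))


def lvc_output_voltage(p: Tuple[int, int, int, int], x: int) -> float:
    """Pass-transistor model of one LVC written as <p0, p1, p2, p3>.

    Rail i gates the supply for level p[i]; only the active rail conducts.
    """
    one_hot = decode_one_hot(x)
    return sum(rail * SUPPLY_VOLTAGE[p[i]] for i, rail in enumerate(one_hot))
```

In this model, programming a different LVC only changes which supply each pass transistor is wired to, while the decoder stays the same for all six outputs.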
3.4. 1-Mb 4-valued universal-literal CAM design
Figure 6 shows the layout and floor plan of the proposed 1-Mb 4-valued
universal-literal CAM. The performance of the proposed CAM is
summarized in Table 4. Since an LVC circuit is shared by 100 cells in
the same column of the MVCAM cellular array, the effective chip area of
the LVC circuits is limited to about 30% of the total chip area.
Figure 6. 1M-bit CAM: (a) layout (dimension label: 8.2 mm); (b) floor plan.
4. Evaluation
To evaluate the performance of the proposed MVCAM, we discuss
alternative approaches based on a binary CAM and on previously proposed
MVCAMs.
4.1. Universal-literal circuit using conventional binary CAM cells
Highly parallel template-matching operations for cellular logic image
processing can also be performed by hardware based on other CAM
structures. Figure 7 shows the design of a 4-valued universal-literal
CAM cell using binary dynamic CAM cells. Since 1 of 4 levels is
programmed by a single binary CAM cell, the number of CAM cells needed
to perform a 4-valued universal literal becomes 4. Moreover, a binary
dynamic CAM cell is designed with 5 transistors and 2 capacitors [11].
Figure 7. 4-valued universal-literal circuit using binary CAM cells.
4.2. Comparison of performances
Table 4 summarizes the comparison of the template-matching circuits
using CAM-based architectures with a 4-valued 3 x 3 template under a
0.8-µm standard EEPROM technology. Although the window size is 3 x 3 in
this evaluation, it can easily be extended to any size.
Due to the poor functionality of binary CAM cells, binary-CAM-based
hardware requires more than 8 times the chip area of the proposed
4-valued universal-literal CAM. In proportion to the number of cell
transistors, the cell areas of the conventional universal-literal CAMs
are also larger than that of the proposed hardware.
The access time and the power dissipation of CAMs depend on the total
capacitance of the match line. Since the proposed CAM cell is designed
with fewer transistors than those of the other CAMs, its match line is
the shortest and its line capacitance is the lowest of all the
implementations. In fact, under PSPICE simulation, the access speed of
the proposed CAM is about 2, 1.8 and 1.3 times faster than that of the
binary CAM, the 4-transistor-cell MVCAM and the 2-transistor-cell
MVCAM, respectively. Moreover, the power dissipation of the proposed
hardware is evaluated to be reduced to about 56 percent, 59 percent and
79 percent of that of the binary implementation and the 2 conventional
4-valued implementations, respectively.
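
The ratios quoted above can be recomputed from the Table 4 entries. The sketch below only redoes that arithmetic; the dictionary keys and the `compare` helper are labels chosen for this example. Because the table values are rounded, the results come out close to, though not always identical to, the figures quoted in the text.

```python
# Table 4 entries (layout area, power per word, access time).
TABLE_4 = {
    "binary":   {"area_um2": 377.0, "power_mw": 8.4, "access_ns": 23.6},
    "4tr_cell": {"area_um2": 125.0, "power_mw": 8.0, "access_ns": 21.0},
    "2tr_cell": {"area_um2": 107.0, "power_mw": 6.0, "access_ns": 15.7},
    "proposed": {"area_um2": 47.0,  "power_mw": 4.7, "access_ns": 11.8},
}


def compare(other: str) -> dict:
    """Ratios of another implementation relative to the proposed CAM."""
    ref, new = TABLE_4[other], TABLE_4["proposed"]
    return {
        "area_ratio": ref["area_um2"] / new["area_um2"],
        "speedup": ref["access_ns"] / new["access_ns"],
        "power_percent": 100.0 * new["power_mw"] / ref["power_mw"],
    }


for name in ("binary", "4tr_cell", "2tr_cell"):
    print(name, {k: round(v, 2) for k, v in compare(name).items()})
```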
From the above performance evaluation, it is clear that the proposed
one-transistor-cell 4-valued CAM outperforms all the other CAMs for
highly parallel cellular logic image processing.
Table 4. Comparison of CAM-based template-matching circuits.
 | Binary CAM-based implementation | Conventional 4-valued CAM (4-Tr. cell) (Ref. [5]) | Conventional 4-valued CAM (2-Tr. cell) (Ref. [6]) | Proposed CAM (1-Tr. cell)
Layout area | 377 µm² | 125 µm² | 107 µm² | 47 µm²
Power dissipation (1 word) | 8.4 mW | 8.0 mW | 6.0 mW | 4.7 mW
Access time | 23.6 ns | 21 ns | 15.7 ns | 11.8 ns
5. Conclusion
In this paper, a new 4-valued universal-literal CAM with a
single-transistor cell has been proposed for fully parallel
template-matching operations in cellular logic image processing. A
universal literal in each CAM cell is realized by decomposing it into
an LVC and a threshold function, which makes the CAM cell function
simple. Moreover, the use of a floating-gate MOS transistor results in
a one-transistor CAM cell design whose performance is evaluated to be
higher than that of all the other CAM-based implementations.
As a future problem, it is also important to reduce the dynamic power
dissipation of the proposed universal-literal CAM. Most of the dynamic
power of the CAM is consumed by discharging the output capacitive load
of the match line when the corresponding word is in a 'mismatch' state,
while no dynamic power is consumed by the match line when the
corresponding word is in a 'match' state. Since almost all the
template-matching results in the CAM are 'mismatch', most of the
match-line output capacitive loads must be discharged. If a NAND-type
structure is used for the one-word CAM circuit instead of the NOR-type
structure of the proposed CAM, the output capacitive load of the match
line is discharged only in match-state words. As a result, this circuit
technique is expected to reduce the power dissipation greatly.
References 
[1] M. M. Trivedi, C. Chen and S. B. Marapane, "A Vision System for
Robotic Inspection and Manipulation," IEEE Trans. Commun., COM-22,
No. 6, pp. 91-97, Jun. 1989.
[2] T. Hanyu, M. Kuwahara and T. Higuchi, "Low-Power 8-Valued Cellular
Array VLSI for High-Speed Image Processing," IEICE Trans. Electron.,
vol. E77-C, No. 7, July 1994.
[3] M. Kameyama, T. Hanyu and T. Higuchi, "Design and Implementation of
Quaternary NMOS Integrated Circuits for Pipelined Image Processing,"
IEEE J. Solid-State Circuits, vol. SC-22, No. 1, pp. 20-27, Feb.
[4] K. E. Grosspietsch, "Associative Processors and Memories: A
Survey," IEEE Micro, vol. 12, No. 3, pp. 12-19, June 1992.
[5] T. Hanyu, M. Arakaki and M. Kameyama, "Quaternary Universal-Literal
CAM for Cellular Logic Image Processing," Proc. of 1996 Int. Symp. on
MVL, pp. 224-229, May 1996.
[6] T. Hanyu, M. Arakaki and M. Kameyama, "2-Transistor-Cell 4-Valued
Universal-Literal CAM for a Cellular Logic Image Processor," IEEE Int.
Solid-State Circuits Conf. (to be published in Feb. 1997).
[7] K. C. Smith, "The Prospects for Multiple-Valued Logic: A Technology
and Applications View," IEEE Trans. Comput., vol. C-30, pp. 619-634,
Sept. 1981.
[8] T. Higuchi and M. Kameyama, "Multiple-Valued Digital Processing
System," Shokodo Co. Ltd., Tokyo, 1989.
[9] T. Blyth, S. Khan and R. Simko, "A Non-Volatile Analog Storage
Device Using EEPROM Technology," in Dig. IEEE Int. Solid-State Circuits
Conf., TPM11.7, pp. 192-193, Feb. 1991.
[10] M. Kameyama, K. Suzuki and T. Higuchi, "Image Processing
Algorithms for a Multiple-Valued Array Processor," Proc. of the 1983
Int. Symp. on Multiple-Valued Logic, pp. 236-241, May 1983.
[11] T. Yamagata, et al., "288-kb Fully Parallel Content Addressable
Memory Using a Stacked-Capacitor Cell Structure," IEEE J. Solid-State
Circuits, vol. 27, No. 12, pp. 1927-1933, Dec. 1992.
