Current-Mode Techniques for the Implementation of Continuous- and Discrete-Time Cellular Neural Networks by Rodríguez Vázquez, Ángel Benito et al.
132 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-II: ANALOG AND DIGITAL SIGNAL PROCESSING, VOL. 40, NO. 3, MARCH 1993
Current-Mode Techniques for the 
Implementation of Continuous- and 
Discrete-Time Cellular Neural Networks 
Ángel Rodríguez-Vázquez, Member, IEEE, Servando Espejo, Rafael Domínguez-Castro,
José L. Huertas, Senior Member, IEEE, and E. Sánchez-Sinencio, Fellow, IEEE
Abstract- This paper presents a unified, comprehensive approach
to the design of continuous-time (CT) and discrete-time
(DT) cellular neural networks (CNN) using CMOS current-mode 
analog techniques. The net input signals are currents instead 
of voltages as presented in previous approaches, thus avoiding 
the need for current-to-voltage dedicated interfaces in image 
processing tasks with photosensor devices. Outputs may be either 
currents or voltages. Cell design relies on exploitation of current 
mirror properties for the efficient implementation of both linear 
and nonlinear analog operators. These cells are simpler and 
easier to design than those found in previously reported CT 
and DT-CNN devices. Basic design issues are covered, together 
with discussions on the influence of nonidealities and advanced 
circuit design issues as well as design for manufacturability 
considerations associated with statistical analysis. Three prototypes
have been designed for 1.6-µm n-well CMOS technologies.
One is discrete-time and can be reconfigured via local logic for 
noise removal, feature extraction (borders and edges), shadow 
detection, hole filling, and connected component detection (CCD) 
on a rectangular grid with unity neighborhood radius. The other 
two prototypes are continuous-time and fixed template: one for
CCD and the other for noise removal. Experimental results are given
illustrating the performance of these prototypes.
I. INTRODUCTION 
CELLULAR neural networks (CNN's) [1] consist of arrays of elementary processing units (cells), each one
connected only to a set of adjacent cells (neighbors), and
exhibiting potential application in different image processing
tasks [2], pattern recognition [3], motion detection [4],
etc. The local connectivity property makes CNN's routing
easy, allowing increased cell density per silicon area and 
making these computation paradigms very suitable for VLSI 
implementation. This is more pertinent for the important 
class of translationally invariant CNN’s, where all inner cells 
are identical, and layout is very regular. Also, since the 
number of different weights is very small for this CNN class, 
programmability issues can be easily incorporated without 
significant extra routing cost, by just adding several global 
control lines, one per weight. 
Manuscript received July 28, 1992; revised November 11, 1992.
A. Rodríguez-Vázquez, S. Espejo, R. Domínguez-Castro, and J. L. Huertas
are with the Spanish Microelectronics Center of the University of Seville,
41012-Seville, Spain.
E. Sánchez-Sinencio is with the Department of Electrical Engineering,
Texas A&M University, College Station, TX 77843.
IEEE Log Number 9208099.
CNN properties and applications have been covered in
different papers [1], [2], [4]-[9]. This paper focuses on CNN
VLSI implementation, on which little literature is available
[10]-[12]. For implementation purposes, analog CNN's can
be classified into continuous-time (CT) [2] and discrete-time
(DT) [12], [13] models. Each model type is described by a set
of nonlinear dynamic equations, one per cell, whose associated 
equilibrium state distribution determines the network computational
properties. Previously reported CT-CNN IC design
approaches focused on the implementation of Chua-Yang's
CNN cell circuit model [1], requiring capacitors, resistors, independent
sources, and linear and nonlinear voltage controlled
current sources. These implementations [10], [11] use $g_m$-C
techniques [14] and differential-input CMOS transconductors,
in a way compatible with standard CMOS technologies. Proposed
discrete-time realizations focused on a very similar cell
circuit model [12], where analog switches and corresponding
clock controlling signals are required in addition to the previous
circuit elements. MOSFET-C techniques [15], [16] and
fully differential high-output-impedance op-amps have been
considered to implement this model [12].
There are some drawbacks to previously reported CNN
implementation techniques. On one hand, input signals are
voltages in all cases, while internal signals can be either
voltages or currents. Since the primary output of image sensor
devices (phototransistors [17]) is current, the need arises
to convert these outputs to voltage, thus complicating CNN
interface design for image processing tasks. On the other
hand, electrical cell design is not easy because different 
variation ranges for the internal voltages and currents must 
be considered to guarantee a reduced influence of the MOS 
transistor nonlinearities. Finally, operation speed is not opti- 
mum because the combination of internal voltage and current 
signals results in internal high-impedance nodes, and hence, 
large time constants. 
In this paper, a unified approach is presented to implement 
both CT and DT CNN’s using current-mode techniques, where 
all variables are in the form of currents. First, a new class of 
CNN cell models (full range models) is presented that allows 
reduced area and power consumption in VLSI implementation; 
then, the implementation of CNN’s in current domain is 
discussed. Resulting cell complexity is much smaller than 
for previous approaches, for both the CT and the DT cases. 
Also, design is very simple and the speed/power figures are
very good due to the nonexistence of internal high-impedance
nodes. The full range models are covered in Section II,
together with a brief summary of the CNN terminology, 
intended to make the paper somewhat self-contained. Section 
III presents the basic building blocks and the cell schematics
for current mode CNN's; abstract circuit elements are used 
to achieve a technology independent presentation. In Section 
IV CMOS design issues are covered, including discussions on 
programmability, as well as on systematic and random error 
sources. Experimental results are given in Section V. 
II. CNN MATHEMATICAL MODELS
CNN's are arrays of locally interconnected processing units 
(cells), arranged in the more general case on three-dimensional 
grids of arbitrary shape. Each generic CNN cell has three 
associated variables, namely: 
Cell state: $x^c(t)$, which conveys cell energy information
as a function of time.
Cell output: $y^c(t)$, obtained from the cell state via a
soft-limiter nonlinear device. In the ideal model case [1] the
nonlinearity is piecewise linear,
$$y^c(t) = f[x^c(t)] = \tfrac{1}{2}\left(|x^c(t) + 1| - |x^c(t) - 1|\right) \qquad (1)$$
though smoother sigmoidal approximations with unity slope
at the origin are also valid.
Cell input: $u^c$, representing external excitation.
Each cell in a CNN is excited by the outputs of a set of
nearby cells located within a distance $r$ in the grid: the cell
$r$-neighborhood, $N_r(c)$, which includes cell $c$ itself. Throughout
the paper we will only consider uniform CNN's, where all
inner cells are identical.
CNN's are signal processing units. Input signal information
is conveyed by the initial state vector $x(0) = \{x^c(0), \forall c \in GD\}$
and the vector $u = \{u^c, \forall c \in GD\}$, where $GD$ denotes
the grid domain. Output signal information is conveyed by
the output vector $y = \{y^c, \forall c \in GD\}$. Input-output mapping
is determined by the net convergence to constant equilibrium
states following the transient initialized by $x(0)$ and driven
by $u$. In the rest of the paper, we will assume that internal
net times are much smaller than the lowest input signal time 
constant, and, hence, will only consider the time variable 
in connection to the study of the internal network transient 
evolution. This evolution, as well as the equilibria encoding 
performed by the net, is related to the cell dynamics, given 
either in the form of a differential equation, for CT CNN's, or 
a finite difference equation, for DT CNN's. 
Let us focus first on CT-CNN's. This paper proposes a
model that differs from the Chua-Yang original one [1]. The
new model is given by
$$\tau \frac{dx^c}{dt} = g[x^c(t)] + D^c + \sum_{d \in N_r(c)} \{A^c_d y^d(t) + B^c_d u^d\}, \quad \forall c \in GD \qquad (2)$$
where summations extend over the cell neighborhood, $N_r(c)$,
and $g(\cdot)$ is defined as follows:
$$g(x^c) = \lim_{m \to \infty} \begin{cases} -m(x^c + 1) + 1, & x^c < -1 \\ -x^c, & \text{otherwise} \\ -m(x^c - 1) - 1, & x^c > 1. \end{cases} \qquad (3)$$
Input ($B^c_d$) and output ($A^c_d$) weights in (2) are called control
and feedback parameters, respectively, and $D^c$ is the offset
parameter. These parameters determine the input-output mapping
performed by the net. Control and feedback parameters
are commonly arranged into matrices (templates), which, for
uniform CNN's, are the same for all cells. Templates for
different processing tasks can be found elsewhere [2], [7]-[9].
The Appendix shows, in a table, the templates considered in
this paper.
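As an illustration of (2) and (3), the following Python sketch integrates a single isolated cell. The finite slope `m`, the forward-Euler step, and the numerical values chosen for the self-feedback weight and the offset are assumptions made only for this example; a finite `m` merely approximates the limit in (3).

```python
def f_sat(x):
    """Piecewise-linear output nonlinearity: unity slope, saturation at +/-1."""
    return max(-1.0, min(1.0, x))


def g_full_range(x, m):
    """Loss term g(.) of (3), with a finite slope m standing in for the limit."""
    if x < -1.0:
        return -m * (x + 1.0) + 1.0
    if x > 1.0:
        return -m * (x - 1.0) - 1.0
    return -x


def simulate_cell(x0, a_self, offset, m, tau=1.0, dt=0.005, steps=4000):
    """Forward-Euler integration of tau*dx/dt = g(x) + offset + a_self*f(x)."""
    x = x0
    for _ in range(steps):
        dx = g_full_range(x, m) + offset + a_self * f_sat(x)
        x += (dt / tau) * dx
    return x


# m = 1 gives the loss term -x of the Chua-Yang model;
# a large m approximates the full range model.
x_chua_yang = simulate_cell(x0=0.1, a_self=2.0, offset=0.5, m=1.0)
x_full_range = simulate_cell(x0=0.1, a_self=2.0, offset=0.5, m=100.0)
print(x_chua_yang, f_sat(x_chua_yang))
print(x_full_range, f_sat(x_full_range))
```

With these illustrative values both runs end with the same saturated output, but the state settles near 2.5 for `m = 1` and stays close to 1 (about 1.015) for `m = 100`, which is the range reduction the full range model is meant to provide.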
By making $m = 1$ in (3), (2) reduces to the Chua-Yang
model [1]. A significant difference between both models
concerns the variation range for the cell variables. For both
models, output variables change inside the $[-1, 1]$ real axis
interval. Value 1 corresponds to black in the pixel associated with
the cell, while -1 corresponds to white.¹ On the other hand,
the net input signals must fulfill, also for both models, the
following normalized constraints:
$$|x^c(0)| \le 1, \ \forall c \in GD; \qquad |u^c| \le 1, \ \forall c \in GD \qquad (4)$$
meaning that they have the same variation range as the outputs.
Models differ in the range interval for the state variables.
In the Chua-Yang model this range is larger, bounded by
$$S = 1 + |D^c| + \sum_{d \in N_r(c)} \{|A^c_d| + |B^c_d|\} \qquad (5)$$
which, for typical templates, lies between 5 and
10, much larger than 1. On the contrary, in the proposed
model, which we call the full range model, state variables have
the same variation range as inputs and outputs, $[-1, 1]$. This
is very convenient to simplify the design process and to reduce
area and power consumption of CNN IC's.
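The bound in (5) is simple to evaluate for a given template set. The sketch below does so in Python; the 3x3 templates and the offset are hypothetical values chosen only to exercise the formula and are not taken from the Appendix.

```python
def state_bound(a_template, b_template, offset):
    """Chua-Yang state bound S = 1 + |D| + sum(|A| + |B|), as in (5)."""
    total = 1.0 + abs(offset)
    for a_row, b_row in zip(a_template, b_template):
        for a, b in zip(a_row, b_row):
            total += abs(a) + abs(b)
    return total


# Hypothetical templates for unity neighborhood radius (illustration only).
A = [[0.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 0.0]]
B = [[0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0]]
D = 0.0

S = state_bound(A, B, D)
print(S)  # the Chua-Yang state range is [-S, S]
```

For these values the bound evaluates to S = 8, so a Chua-Yang implementation would need a state range eight times wider than the output range, whereas the full range model keeps the state inside $[-1, 1]$.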
As for the Chua-Yang model [1], the computational properties
of the full range model rely on its ability to yield,
for $A^c_c > 1$, two stable equilibrium points separated by an
instability region, and on the possibility of modifying the
attraction regions of the stable points by changing the cell
state-variable independent term in (2),
$$I = D^c + B^c_c u^c + \sum_{d \in N_r(c),\, d \ne c} \{A^c_d y^d + B^c_d u^d\}. \qquad (6)$$
Cell convergence is illustrated in Fig. 1, representing the
$\tau(dx^c/dt)$ versus $x^c$ characteristics, according to (2). Outer
pieces have slope $-m \to -\infty$, while the slope of the inner piece is
$A^c_c - 1$. Parameter $I$ is given by (6) and assumed constant
for each trace in Fig. 1. Five different qualitative cases arise
depending on the $I$ values. As illustrated in Fig. 1, dynamic
routes converge for all cases to the outer equilibrium states,
where $|y^c| = 1$. A detailed discussion on the proposed model
stability properties is beyond the scope of this paper. However, the
model has given correct results, qualitatively similar to those
of the Chua-Yang model, for all the templates covered in this
paper.
¹ These are normalized values which in actual circuits correspond to biasing
signals.
Fig. 1. Dynamic routes for the full range CT CNN model: $I_1 < I_2 < 0 < I_3 < I_4$.
Fig. 2. Dynamic routes for the DT CNN model: $I_1 < I_2 < 0 < I_3 < I_4$.
For sampled-data implementations, (2) must be emulated by
some explicit integration algorithm. This involves a trade-off
in choosing $T/\tau$ (where $T$ denotes the sampling period): to
increase speed, $T/\tau$ must be as large as possible; for stability,
$T/\tau$ should be less than 2 [13]. We will use the heuristic guess
of $T = \tau$, which produced correct steady states with all the
templates we tried. Also, this choice allows writing the cell
dynamic equation as follows:
$$y^c(n+1) = f\left[D^c + \sum_{d \in N_r(c)} \{A^c_d y^d(n) + B^c_d u^d\}\right], \quad \forall c \in GD \qquad (7)$$
where, to ensure convergence, we require that $f(\cdot)$ be
sigmoidal-like and that $A^c_c f'(0)$ be larger than 1. This can
be achieved by using either a soft nonlinearity such as that
in (1) and $A^c_c > 1$, or a hard nonlinearity (central
slope infinitely large) and $A^c_c = 1$, as proposed in [12]. As
for the full range CT model, all variables involved in the
implementation of (7) have the same variation range, $[-1, 1]$.
Fig. 2 illustrates convergence towards binary outputs for
the DT-CNN model. The same qualitative cases as for Fig. 1 are
considered, corresponding to different values of $I$ in (6).
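The DT update (7) can be sketched in a few lines of Python for a rectangular grid with unity neighborhood radius. The zero contribution assumed for cells outside the grid, the stopping rule, and the uncoupled templates used in the example are assumptions of this sketch, not choices documented in this paper.

```python
def f_hard(x):
    """Hard nonlinearity (infinite central slope): binary output."""
    return 1.0 if x >= 0.0 else -1.0


def f_soft(x):
    """Soft piecewise-linear nonlinearity: unity slope, saturation at +/-1."""
    return max(-1.0, min(1.0, x))


def dt_cnn_step(y, u, A, B, D, f):
    """One synchronous update of (7); cells outside the grid contribute zero."""
    rows, cols = len(y), len(y[0])
    y_next = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = D
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < rows and 0 <= nj < cols:
                        acc += A[di + 1][dj + 1] * y[ni][nj]
                        acc += B[di + 1][dj + 1] * u[ni][nj]
            y_next[i][j] = f(acc)
    return y_next


def run_dt_cnn(y0, u, A, B, D, f, max_steps=50):
    """Iterate (7) until the output stops changing or max_steps is reached."""
    y = y0
    for _ in range(max_steps):
        y_next = dt_cnn_step(y, u, A, B, D, f)
        if y_next == y:
            break
        y = y_next
    return y


# Hypothetical uncoupled templates: self-feedback only, no control, no offset.
A_soft = [[0.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0]]
A_hard = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
B = [[0.0] * 3 for _ in range(3)]
y0 = [[0.3, -0.2], [-0.05, 0.9]]
u = [[0.0, 0.0], [0.0, 0.0]]

y_soft = run_dt_cnn(y0, u, A_soft, B, 0.0, f_soft)
y_hard = run_dt_cnn(y0, u, A_hard, B, 0.0, f_hard)
print(y_soft)
print(y_hard)
```

Both settings satisfy the convergence condition stated above (a soft nonlinearity with self-feedback larger than 1, or a hard nonlinearity with unity self-feedback), and in this example both drive every cell to the binary value given by the sign of its initial output.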
Fig. 3. (a) CT CNN conceptual cell diagram. (b) DT CNN conceptual cell diagram.
III. BASIC BUILDING BLOCKS FOR CURRENT-MODE CNN'S
Fig. 3(a) is an analog computer conceptual block diagram
for a CT-CNN cell where circles indicate summation, triangles
are used for signal scaling, and the other two blocks represent
integration and nonlinear transformation. This diagram corresponds
to the Chua-Yang model. For the full range model the
nonlinear block is eliminated and the function $g(\cdot)$ realized
exploiting the output saturation of the integrator block. Fig.
3(b) shows a conceptual analog computation cell diagram for
a full-range DT-CNN cell, which requires a delay instead of
an integrator. Note that no weighting is performed on the
neighbors' contributions at the cell input, as would correspond
to (2) and (7). Instead, each cell produces a different weighted
output for each neighbor. To handle this, implementation
templates must be obtained by interchanging entries along all
radial lines in the original template matrices, as illustrated
below,
$$\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \rightarrow \begin{bmatrix} i & h & g \\ f & e & d \\ c & b & a \end{bmatrix} \qquad (8)$$
Signal summation, scaled replication, integration, delay, and
nonlinear transformation are the analog operators required
to implement CNN's. Summations are very easily performed
in current-mode by routing currents to a common node.
Remaining operators are realized using a very simple analog
building block: the current mirror, which yields linear current
scaling via the functional cancellation of nonlinearities
between matched transconductors.
3.1. Background and Current-Mode CNN Static Operators
Current mirror concept and applications are discussed using
a generic three terminal transconductor, represented by Fig.
4(a), which we assume characterized as follows:
$$i_1 \equiv 0, \qquad i_2 = P\,\mu(v_1, v_2) \qquad (9)$$
where $\mu(\cdot)$ is assumed invertible, at least in $v_1$, and the
characteristic is parameterized by a designer-controlled scale
factor $P$.
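Regarding the template transformation in (8), interchanging entries along all radial lines amounts to a point reflection of the template through its center entry (a 180-degree rotation). A minimal Python sketch follows; the function name is hypothetical and the symbolic entries simply mirror those of (8).

```python
def implementation_template(template):
    """Interchange entries along all radial lines of a square template.

    This is a point reflection through the center entry, which is
    equivalent to reversing the row order and each row's column order.
    """
    return [list(reversed(row)) for row in reversed(template)]


original = [["a", "b", "c"],
            ["d", "e", "f"],
            ["g", "h", "i"]]
reflected = implementation_template(original)
for row in reflected:
    print(row)
```

The center entry is left in place and applying the function twice returns the original template, as expected from a reflection.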
Fig. 4. (a) Generic transconductor. (b) Single MOST implementation. (c) Single BJT implementation. (d) Cascode MOS implementation. (e) Tunable implementation.
The generic transconductor of Fig. 4(a) may consist of either
a single transistor or a more complex device containing several
transistors. Fig. 4(b) and (c) show the single MOS and BJT
transconductors. Assuming operation after pinch-off in strong
inversion for the MOS, and in the forward active region for the
BJT, (9) particularizes as
where the designer controls $\beta = (K/2)(W/L)$ and $I_S = I_{SQ} A_E$
by changing the channel width and length for the
MOS, or the emitter area for the BJT.² Fig. 4(d) and (e) show
other MOS transconductor alternatives. We find it convenient to
use this abstract transconductor to make the development of current mode analog
operators independent of both technology and
transconductor topology.
² $K \equiv \mu C_{ox}$ is the normalized MOS large signal transconductance and $V_T$
is the MOS threshold voltage; $U_t$ is the thermal voltage and $V_A$ denotes the
Early voltage. $V_A$ is controlled in MOS by the channel length [18], [19].
Fig. 5. Static current-mode analog operators. (a) Scaled replication. (b) Bilateral current mirror using bias shifting. (c) Bilateral current mirror using complementary devices. (d) Current reversing. (e) Nonlinear current transformation.
Fig. 5(a), where each transconductor has a different parameter
value, $P_i$ ($0 \le i < N$), illustrates the basic current mirror
function. The input device is in feedback, while output devices are in
open loop. Assuming that transconductors are matched, that
their outputs are equipotential, and that their input currents are
negligible (the last holds exactly at DC for MOST), it is seen
that output device nonlinearities cancel out (one by one) the
nonlinearity of the input device, yielding
where $P_k/P_0$ can be controlled by the designer, yielding the
scaled replication operation.
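The cancellation of nonlinearities behind scaled replication can be checked numerically. In the Python sketch below the input device is forced by feedback to carry the input current, and the output devices share the resulting control voltage. The square-law and exponential characteristics are the usual textbook forms and are assumptions of this sketch, as are all numerical values; the dependence on the output terminal voltage is ignored (equipotential outputs), and only current magnitudes are modeled, not the sign inversion of the mirror.

```python
import math


def mu_mos(v1, v_t=0.8):
    """Square-law characteristic (strong inversion), zero below threshold."""
    return (v1 - v_t) ** 2 if v1 > v_t else 0.0


def mu_bjt(v1, u_t=0.02585):
    """Exponential characteristic, normalized to the scale factor."""
    return math.exp(v1 / u_t)


def invert(mu, target, lo, hi, iterations=200):
    """Solve mu(v) = target by bisection; mu must be increasing on [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mu(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mirror_outputs(mu, i_in, p, lo, hi):
    """Mirror of Fig. 5(a): p[0] scales the input device, p[1:] the outputs."""
    v1 = invert(mu, i_in / p[0], lo, hi)    # feedback forces i_in = p[0]*mu(v1)
    return [p_k * mu(v1) for p_k in p[1:]]  # open-loop outputs share v1


i_in = 10e-6
mos = mirror_outputs(mu_mos, i_in, [20e-6, 40e-6, 10e-6], lo=0.8, hi=5.0)
bjt = mirror_outputs(mu_bjt, i_in, [1e-15, 2e-15, 0.5e-15], lo=0.0, hi=1.0)
print(mos)
print(bjt)
```

Although the two characteristics are very different, both mirrors return output currents scaled by the same ratios of scale factors (here 2 and 0.5 times the input), which is the point of the matched-transconductor argument.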
Most elementary transconductors enter a cut-off region, with
$i_2 = 0$ in Fig. 4(a), when input voltages are below a cut-in
value. This is the case for all devices in Fig. 4. It means that
only positive currents are possible at the outputs in Fig. 5(a),
and, hence, that the implemented scaled replication operation 
is unilateral. For bilateral operation, either current shifting 
biasing at the input and output nodes or complementary 
devices must be used, as shown in Fig. 5(b) and (c), where an 
arrow has been added to the device symbols to differentiate 
complementary devices. 
Circuits in Fig. 5(a)-(c) perform as inverting current amplifiers
(assuming that input currents are positive when entering a
node, while output currents are positive when leaving a node),
allowing only negative weights. Noninverting amplification
(that is, positive weights) is achieved by cascading two bilateral
mirrors, as illustrated in Fig. 5(d) for unity weight. This
circuit is also used to implement the saturation nonlinearity
required for CNN's. It is achieved by using transconductor cut-off:
in Fig. 5(d), for $i_{in} < -I_{B1}$,³ the first mirror cuts off and
current $i_{o1}$ is supplied only by the rail, making $i_{o1} = -I_{B1}$; in
a similar way, the cut-off of the second mirror makes $i_o = I_{B2}$
for $i_{in} > I_{B2}$. In this manner the saturation characteristic of
Fig. 5(e) is implemented.
3.2. Current Mode Dynamic Operators 
Dynamic operators required for CNN’s (both CT and DT) 
are also easily implemented by current mirrors. To understand 
these operators it is convenient to consider the simplified 
generic transconductor small-signal model shown in Fig. 6(a), 
where, commonly, $g_m \gg g_o \gg g_{in}$. On the other
hand, the dominant reactive transconductor parasitic is typically
capacitive and located at the input node, as assumed in the
model of Fig. 6(a). Thus, the current mirror dynamic behavior
is characterized by a finite time constant $\tau = 2C_{in}/g_m$, which is
used for the implementation of a current mode lossy integrator.
A schematic is shown in Fig. 6(b), where we have included 
the lossy integrator by itself (part enclosed in dashed lines) 




Fig. 6. Dynamic current mode CNN operators. (a) Small signal model of a generic transconductor. (b) Continuous time current mode lossy integrator. (c) Half-clock current mode delay. (d) Current mode full-clock delay.
A capacitor has been added at the integrator input for more
accurate control of the input time constant and to make it
dominant as compared to the parasitic time constant at the
output node. For analysis of Fig. 6(b) we assume a two-piece
PL transconductor model (one piece for the cut-off region and
the other for the conducting region), obtaining
$$\tau \frac{di_o}{dt} = i_{in} + I_B\, g\!\left(\frac{i_o}{I_B}\right) \qquad (12)$$
where g( . )  is the nonlinearity proposed in the full range 
model, given in (3). By comparing (12) to (2) it is seen that 
the current mode lossy integrator saturation can be exploited 
for the implementation of the proposed full range CT CNN 
model, thereby greatly simplifying design. Nonidealities of 
the current-mode lossy integrator and their influence on the 
net operation will be discussed in Section 4.2. 
Fig. 6(c) shows a circuit to achieve a delay in the propaga- 
tion of a current, using a mirror and a clocked analog switch 
[20]. The parasitic input capacitor of the output transconductor 
is charged while the clock signal is high. When the clock signal
becomes low, this charge is held at this capacitor and a half
clock period delay is implemented:
³ In actual circuits $i_{in}$ cannot be smaller than $-I_{B1}$, since it is limited by
the device input bias current.
This concept is very well suited for technologies contain- 
ing MOS transistors (NMOS, CMOS, BiCMOS), due to the 
availability of zero-offset low-leakage analog switches and 
high impedance transconductors. As for the lossy integrator, 
nonidealities of the delay block will be covered in Section 4.2. 
Fig. 6(c) is a half delay block: sampled output currents are
available during clock phases other than that during which
input currents must be sampled. A full delay is achieved by
cascading two identical half delay blocks and using nonover- 
lapping clock signals, as shown in Fig. 6(d). Taking into 
account transconductor cut-off, analysis of this circuit yields 
Thus, since delay and saturation are achieved in the same 
device, the nonlinear block in Fig. 3(b) can be eliminated, 
similar to what occurs for the full range CT CNN model. 
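The combined delay and saturation behavior can be illustrated with a purely behavioral Python sketch. It assumes that each half delay acts as an inverting mirror with unity gain whose output is limited to the bias current by cut-off, and that the two stages sample on alternate, nonoverlapping clock phases; these sign and gain conventions, the class name, and the sample values are assumptions of the sketch rather than results taken from the circuit analysis.

```python
def clip(i, i_bias):
    """Transconductor cut-off limits the current to [-i_bias, i_bias]."""
    return max(-i_bias, min(i_bias, i))


class HalfDelay:
    """Behavioral half-clock delay: inverting mirror with a held input."""

    def __init__(self, i_bias):
        self.i_bias = i_bias
        self.held = 0.0  # output current set by the charge on the capacitor

    def sample(self, i_in):
        """Clock high: the switch is closed and the stage tracks its input."""
        self.held = -clip(i_in, self.i_bias)


def full_delay(samples, i_bias):
    """Two half delays on nonoverlapping phases give a one-period delay."""
    first, second = HalfDelay(i_bias), HalfDelay(i_bias)
    outputs = []
    for i_in in samples:
        outputs.append(second.held)  # value held from the previous period
        first.sample(i_in)           # phase 1: first samples, second holds
        second.sample(first.held)    # phase 2: first holds, second samples
    return outputs


delayed = full_delay([0.2, -0.5, 1.7, 0.0], i_bias=1.0)
print(delayed)
```

Each output equals the previous input, limited to the bias current: the sample of 1.7 appears one period later as 1.0, so no separate nonlinear block is needed in this behavioral picture.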
3.3. Current Mode CNN Conceptual Cell Architecture 
The first step towards CNN IC design is to define the 
normalization factor for the output variables range. We assume 
the range is symmetric around the origin, and the rail is
$I_Q$, so that $y^c \in [-I_Q, I_Q]$. For CT Chua-Yang CNN's,
a different, wider range must be considered for the state
variables, $x^c \in [-S I_Q, S I_Q]$, where $S$ is given in (5).
Fig. 7(a) shows a generic current mode CNN cell which 
applies for all models in the paper. The dynamic part in 
this figure differs depending on the actual model considered 
(Fig. 7(b)-(d)). Fig. 7(a) shows only the part of the cell 
corresponding to the evaluation of the cell state (.F) and the 
generation of output variable replicas for neighbors (Afy'). 
Generation of scaled replicas of the input cell current (U ' )  
is straightforward using Fig. 5.  Only two outputs per cell are 
included in Fig. 7(a): for positive and negative weights, respec- 
tively. Additional positive (alternatively negative) weighted 
outputs are obtained by using the mirror replication concept 
from voltage $v_{RP}$ (alt. $v_{RN}$). Switches labelled $S_t$ and $\bar{S}_t$ are
used for cell initialization purposes.
Fig. 7(b) is a schematic of the cell's dynamic part for 
the Chua-Yang CT model. Corresponding schematic for the 
full range CT model is shown in Fig. 7(c). Switch R, in 
both schematics is used for initialization purposes. It is seen 
that the full range model gives simpler circuits than the 
Chua-Yang model. Fig. 7(d) shows the schematic of the cell's 
dynamic part for the DT CNN model. Bias currents in Figs. 
7(a)-(d) are obtained by replication of a master current using 
complementary devices, as shown in Fig. 7(e). 
In the simplest case, a generic transconductor contains only
one transistor. Besides, only one MOS transistor is required
to implement an analog switch with zero ON offset and very
low OFF leakage (about 10 pA). This implies that for this
simplest case, and, for instance, for a CCD, the current-mode
DT cell uses 20 transistors: this is an important advantage
when compared to previous approaches for DT CNN's [12],
where 106 transistors are required for a similar cell. For CT
Chua-Yang CNN's, complexity (measured in number of transistors)
of the current-mode circuit is similar to previous $g_m$-C
implementations [10]. However, this complexity decreases in
current mode implementations of the full range model (down
to 18 for CCD).
Fig. 7. (a) Generic current mode CNN cell. (b) Dynamic block for Chua-Yang CT CNN's. (c) Dynamic block for full range CT CNN's. (d) Dynamic block for DT CNN's. (e) Generating cell bias currents from a reference current.
IV. CMOS CURRENT MODE CNN DESIGN ISSUES
4.1. CMOS Mirror Schematics: Static Nonidealities and Sizing Equations
Current mode CNN operation is degraded by both random
and systematic sources of error. Random errors are due to
statistical variations of the technological parameters across
the die (mainly $K$ and $V_T$ in CMOS) and can be attenuated
using large devices [21], careful layout [19], and proper bias
generation and distribution. Systematic errors are, on the other
hand, corrected by proper transconductor choice and transistor
sizing.
Let us focus on systematic static current mirror errors. Fig.
8, consisting of a generic mirror and associated driving and
loading devices, is used to discuss these errors. We assume the
device is nominally intended to yield output to input current
scaling by a factor $P_o/P_{in}$. Two major error sources can be
identified: a) input-output voltage mismatching ($v_{in} \neq v_o$)
at the bias point, defined as the point where transconductors
sink only their bias currents ($i_{in} = 0$); b) finite $R_o/R_{in}$
ratios, where $R_o$ and $R_{in}$ represent the mirror output and input
resistances.
TABLE I
SIZING EQUATIONS FOR CMOS MIRRORS
Fig. 8. Generic mirror and boundary circuitry.
Voltage mismatching produces current offset ($i_o \neq 0$ for
$i_{in} = 0$), originated by the transconductor current dependence
on the output terminal voltage (see (9)). For the MOS transconductors
in the paper, this offset is eliminated by forcing the
same current density in the mirror input transconductor and
loading device, which in the case of Fig. 8 is achieved by
specifying
L T  
y=-=-. (15) 
PL En 
This relation must hold for all transconductors (both input and 
output devices) in the net. 
Finite $R_o/R_{in}$ ratio causes current gain error due to spurious
current division at the mirror input and output nodes. For simplicity,
we shall assume the same output and input resistances
for all mirrors in Fig. 8 (actually resistances depend on the
current level), obtaining
i o  Po 1 
where $N$ is the number of mirrors driving node $v_{in}$. It is seen
that the gain error is inversely proportional to $R_o/R_{in}$, and
increases proportionally with $N$. A more precise evaluation
taking into account $R_{in}$ and $R_o$ variations with current level
also yields the same proportionality. Since $N$ may be a
large number (up to 11 for the corners and borders detection
templates on a rectangular grid net with $r = 1$), the importance
of the current gain error should not be underestimated. The error
is especially significant if small dimension single-transistor
transconductors are used, due to the very low Early voltages
associated with short channel transistors. This can be corrected
by increasing device size (channel length), but this does not
yield optimum area and speed for CNN implementations.
For improved $R_o/R_{in}$ figures with short channel devices,
circuit strategies instead of transistor sizing should be used.
In particular, the cascode transconductor of Fig. 4(d) provides
optimum area and speed for given $R_o/R_{in}$.
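The trend described above can be reproduced with a first-order current-divider estimate. The Python sketch below assumes that the only loss is the division of the input current between the mirror input resistance and the output resistances of the N mirrors driving the same node; this is a simplified model written for illustration, not the gain expression derived in this paper, and the resistance ratios used are hypothetical.

```python
def mirror_gain(p_out, p_in, n_drivers, r_out, r_in):
    """First-order gain of a mirror whose input node is driven by n_drivers.

    The input current divides between the input resistance r_in and the
    n_drivers output resistances r_out connected to the same node.
    """
    divider = 1.0 / (1.0 + n_drivers * r_in / r_out)
    return (p_out / p_in) * divider


def relative_gain_error(n_drivers, r_out, r_in):
    """Relative deviation from the nominal gain p_out/p_in."""
    return 1.0 - 1.0 / (1.0 + n_drivers * r_in / r_out)


# Hypothetical resistance ratios: 100 (single transistor), 10000 (cascode).
err_single = relative_gain_error(n_drivers=11, r_out=100.0, r_in=1.0)
err_cascode = relative_gain_error(n_drivers=11, r_out=10000.0, r_in=1.0)
print(err_single, err_cascode)
```

For small errors the estimate reduces to roughly $N R_{in}/R_o$, which grows with $N$ and falls as $R_o/R_{in}$ increases, in line with the proportionality stated in the text.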
Fig. 9(a) and (b) show the two CMOS mirror structures considered
herein for CNN design, including the complementary
devices used for biasing (separated by dashed lines). Design
parameters are displayed in the figures: $W$, $L_n$, $L_p$ for the
simple mirror, and $W$, $L_n$, and $V_{CAS}$ for the cascode mirror.
Fig. 9(c) shows a circuit to provide the cascode voltage, $V_{CAS}$,
for the cascode mirror. We assume that Early voltages are
proportional to the channel length: $V_{An} = \alpha_n L_n$, $V_{Ap} = \alpha_p L_p$.
In Fig. 9(a) the length of the NMOS's ($L_n$) is different from that of
the PMOS's ($L_p$), which is intended to obtain equal nominal
Early voltages, and, hence, optimize $R_o/R_{in}$. In the case of
Fig. 9(b), $R_o/R_{in}$ is intrinsically much larger and all lengths
are made equal ($L_n = L_p$). For further simplicity, we have
also assumed that all transistors have the same channel width,
$W$.
Table I gives sizing equations for Fig. 9(a) and (b). These 
equations are intended to ensure that the mirrors handle the 
whole input current range with minimum distortion, and using 
the smallest possible devices. The W expressions given in 
the table correspond to a bias current IQ; W values for larger 
currents are calculated taking into account the requirement for 
equal current density in all transconductors, given in (15). 
Note that the sizing equations are parameterized by $L_n$,
which is chosen by the designer to control $R_o/R_{in}$ and the
channel area. For Fig. 9(a), evaluation of these figures results
in
while, for the cascode mirror of Fig. 9(b), the following is
obtained:
Fig. 10 shows the current gain error per cell versus the
total cell area for a full range CT connected component
detector CNN using single and cascode mirrors. Technology is
a standard digital n-well 1.6-µm CMOS. Scale is logarithmic.
Two families of curves are shown: the top family is for the
single transistor mirror, and the bottom family for the cascode.
Parameter for each family is the rail current $I_Q$, which varies
from 0.25 µA (bottom curve in each family) to 128 µA (top
curves). As can be seen, the simple current mirror requires large
area to achieve an acceptable error figure. On the other hand,
cascode mirrors allow the use of short channel devices, and
thus result in much higher area efficiency and speed.
Fig. 9. Mirrors and biasing devices. (a) Simple mirror and reference circuitry. (b) Cascode mirror and reference circuitry. (c) Cascode voltage generation.
4.2. CMOS Current Mode Dynamic Operators Nonidealities 
For illustration purposes, let us consider the current mode 
Chua-Yang model. Large signal analysis of Fig. 7(a) cell core 
(consisting of the dynamic block and self-feedback term) for 
MOS transconductors yields 
which is valid for -SI,  < xc < SI,, for the simple and 
cascode mirrors, and where p = K ( W / L ) ,  and I denotes the 
part of the total cell input current which is not contributed by 
ye (see (6)). It is seen that the time constant 
Fig. 10. Current gain relative error (per cell) versus cell area in a full range
current mode CT CCD CNN cell, for different values of the rail bias current,
IQ (μA) = 0.25, 0.5, 1, 2, 4, 8, 16, 32, 64, 128. Top family: Single mirror.
Bottom family: Cascode mirror. Horizontal axis: Log(Area (μm²)).
depends nonlinearly on the state variable value: the transient is
faster if the cell state is near the black pixel. However, the
equilibrium points are exactly the same as for the nominal
case, corresponding to the solution of the equation
$$h^c = -x^c + A_c^c\, I_Q\, f\!\left(\frac{x^c}{I_Q}\right) + I = 0. \qquad (21)$$
Also, the sign of the state variable derivative (equivalently, the
sign of h^c) for a given I has exactly the same dependence on
x^c as in the nominal case and, as a consequence, the dynamic
evolution can be expected to coincide with the nominal one.
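The argument above can be checked numerically on a normalized cell. The following is a minimal sketch, assuming the standard normalized Chua-Yang form of [1] (state and currents scaled to the saturation level, piecewise-linear output function); the names `h`, `equilibria`, `settle`, and the particular state-dependent rate function are illustrative assumptions and are not taken from the circuit of Fig. 7(a).

```python
def f(x):
    """Piecewise-linear output function of the normalized Chua-Yang model."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def h(x, a_self, i_in):
    """Normalized driving function: h = -x + a_self * f(x) + i_in."""
    return -x + a_self * f(x) + i_in


def equilibria(a_self, i_in):
    """Solve h(x) = 0 analytically in each linear region of f."""
    points = []
    # Positive saturation (f = +1): x = a_self + i_in, valid if x >= 1.
    x = a_self + i_in
    if x >= 1.0:
        points.append((x, "stable"))
    # Negative saturation (f = -1): x = -a_self + i_in, valid if x <= -1.
    x = -a_self + i_in
    if x <= -1.0:
        points.append((x, "stable"))
    # Linear region (f = x): (a_self - 1) * x + i_in = 0, valid if |x| < 1.
    if a_self != 1.0:
        x = -i_in / (a_self - 1.0)
        if abs(x) < 1.0:
            points.append((x, "unstable" if a_self > 1.0 else "stable"))
    return sorted(points)


def settle(x0, a_self, i_in, rate=lambda x: 1.0, dt=0.01, steps=5000):
    """Forward-Euler integration of dx/dt = rate(x) * h(x).

    rate(x) > 0 stands in for a state-dependent time constant
    (an assumed shape, used only to illustrate the argument).
    """
    x = x0
    for _ in range(steps):
        x += dt * rate(x) * h(x, a_self, i_in)
    return x


if __name__ == "__main__":
    print(equilibria(2.0, 0.0))
    print(settle(0.1, 2.0, 0.0))
    print(settle(0.1, 2.0, 0.0, rate=lambda x: 1.0 / (1.0 + 0.5 * abs(x))))
```

Because the rate factor is strictly positive, it changes neither the zeros of h nor its sign, so both runs settle at the same equilibrium; only the time needed to get there differs.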
Now let us consider nonidealities appearing in the CMOS
current mode delay block of Fig. 6(c). Assume the analog
switch consists of a single MOS transistor, as shown in Fig.
11(a), where the capacitor can consist only of parasitics.
Correct operation of this circuit relies on the ability of the capacitor
to hold the input when the analog switch turns off. Apart from
the unavoidable leakage current (negligible at usual
clock frequencies), error arises mainly from the need to
evacuate the MOS channel charge during the switch turn-off
process, giving
where Δq represents the part of the evacuated charge delivered
to the capacitor node. This error, generically called
feedthrough error [22], produces a large current error (up
to 20% of the bias current, or even more) if small-geometry
transconductors are used. This effect can be attenuated with
several techniques. A simple choice is the inclusion of an
additional capacitor at node v2 in Fig. 11(a). Since neither
linearity nor accuracy in the capacitance is required for this
purpose, a shorted transistor can be used, represented by Q1
in Fig. 11(b). This is feasible in standard digital CMOS
technologies (having only one poly layer) and requires less area
Fig. 11. (a) Storing a voltage via a simple switch. (b) Circuit strategy for
feedthrough attenuation. (c) Feedthrough error versus real/dummy delay time.
than typical capacitors in two-poly technologies (Poly1-SiO2-Poly2),
since channel oxide is usually thinner than interpoly
oxide. The use of this technique can easily lower the error
by about one order of magnitude. Also, since only two of these
devices are required in each CNN cell, the area penalty is
not severe. Much lower feedthrough error is achieved by using
a dummy transistor, represented by Q2 in Fig. 11(b), and
automatically tuning the delay between the clock signal of the
switching device (transistor Q) and the clock signal of the dummy device
[23]. In this manner, errors as low as 0.3% have been measured
on silicon prototypes. Also, the extra monitoring and control
circuitry can be shared by all cells in the network, so that the area
penalty is not severe at the network level.
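The effect of the additional capacitor can be illustrated with a first-order estimate. The sketch below assumes that the injected charge produces a voltage step dV = dq/C on the storage node and that the memory transistor follows the strong-inversion square law; all numerical values (injected charge, capacitances, overdrive voltage) are hypothetical and chosen only for illustration, not taken from the prototypes.

```python
def gate_voltage_step(q_injected, c_hold):
    """Voltage error on the storage node: dV = dq / C."""
    return q_injected / c_hold


def relative_current_error(dv, v_overdrive):
    """Square-law device, I ~ (VGS - VT)^2, so I'/I - 1 = (1 + dV/Vov)^2 - 1."""
    return (1.0 + dv / v_overdrive) ** 2 - 1.0


# Hypothetical values, for illustration only.
DQ = 2e-15            # charge delivered to the storage node (C)
C_PARASITIC = 20e-15  # parasitic gate capacitance alone (F)
C_EXTRA = 200e-15     # added shorted-transistor capacitor (F)
V_OV = 0.5            # overdrive voltage of the memory transistor (V)

err_bare = relative_current_error(gate_voltage_step(DQ, C_PARASITIC), V_OV)
err_with_cap = relative_current_error(
    gate_voltage_step(DQ, C_PARASITIC + C_EXTRA), V_OV
)

if __name__ == "__main__":
    print(f"parasitic capacitance only: {100 * err_bare:.1f} %")
    print(f"with additional capacitor:  {100 * err_with_cap:.1f} %")
```

With these assumed numbers the added capacitance lowers the current error by slightly more than one order of magnitude, at the cost of the area of one extra device per storage node.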
Note that feedthrough is also important in the CNN initial- 
ization process. Thus, this effect must also be considered for 
the design of CT CNN models. 
4.3. Bias Current Selection: Area, Power and Reliability
A crucial issue that has not been discussed yet is the
selection of the bias current IQ. Since the transistor geometry
factors (see Table I), the static gain error due to finite Ro/Ri
values, and the power dissipation all increase with IQ, a bias
current as small as possible should be chosen. The issue is to
identify the minimum feasible rail current value. The lowest limit
is certainly established by leakage (about 10 pA in standard
CMOS). However, a more restrictive bound exists due to
MOS transistor mismatch [21], [24] and Early voltage (VA)
degradation with channel length.
Mismatch is mainly produced by variations of the threshold
voltage (VT) and the large signal transconductance factor (β =
K·W/L) of equally designed transistors in the same chip.
Standard deviations for these parameters have two major
components: one inversely proportional to the square root of
the channel area, and another proportional to the device distance.
Results in [21] demonstrate that the distance-dependent component
is negligible for devices with a channel area less than
about 100 μm². Since for bias currents below about 50 μA and
cascode mirrors, the device area calculated from Table I is well
below this bound, the distance-dependent component need not
be considered for current mode CNN's.
Another important consideration is that, for given σ(VT) and
σ(β)/β, the ratio σ(I)/I in MOS transistors is inversely
proportional to the gate-source voltage VGS (see (10)). This
means that, once W/L factors have been set to achieve
acceptable mismatch levels, the bias current cannot be decreased
too far below the upper bound given by the equations in Table
I, since this would produce a low VGS voltage at the bias
point, with the corresponding high σ(I)/I. Hence, mismatch
considerations establish bounds for both minimum area and
power trends.
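The trade-off described above can be made concrete with the first-order expressions commonly used in the matching literature (see [21]): an area law for σ(VT), with the distance term neglected, and a strong-inversion current mismatch that grows as the overdrive VGS - VT shrinks. The sketch below is illustrative only; the matching coefficient `A_VT` and the bias voltages are hypothetical values, not foundry data for the technology used here.

```python
import math


def sigma_vt(a_vt, w, l):
    """Area law for threshold mismatch: sigma(VT) = A_VT / sqrt(W * L)."""
    return a_vt / math.sqrt(w * l)


def current_mismatch(sigma_vt_v, sigma_beta_rel, v_gs, v_t):
    """First-order strong-inversion estimate of sigma(I)/I.

    sigma(I)/I = sqrt((sigma(beta)/beta)^2 + (2 sigma(VT) / (VGS - VT))^2)
    """
    v_ov = v_gs - v_t
    return math.sqrt(sigma_beta_rel ** 2 + (2.0 * sigma_vt_v / v_ov) ** 2)


# Hypothetical matching coefficient (V * um) and unit geometry (um).
A_VT = 0.03
W, L = 3.0, 3.2

if __name__ == "__main__":
    s_vt = sigma_vt(A_VT, W, L)
    for v_gs in (1.25, 1.5, 2.0):  # assumed bias points, with VT = 1.0 V
        rel = current_mismatch(s_vt, 0.01, v_gs, 1.0)
        print(f"VGS = {v_gs:.2f} V -> sigma(I)/I = {100 * rel:.2f} %")
```

For a fixed geometry, lowering the bias current lowers the overdrive and raises σ(I)/I, which is why the bias current cannot be reduced arbitrarily once W/L has been chosen.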
For example, we have obtained 100% success (out of 30
trials) in a Monte Carlo simulation of a connected component
detector (CCD) current mode full range CT CNN with 16 cells
in a row⁴ (the CCD template is one-dimensional, and hence there
is no need to simulate a two-dimensional system). Unitary
transistor geometries of W/L = 3 μm/3.2 μm for both n- and
p-channel devices, and a bias current IQ = 2 μA, were used.
Further geometry reduction has not been considered, since the
minimum contact size (4 μm with surrounding diffusion in the
1.6 μm n-well technology used) does not allow a significant
area reduction anyway.
Similar simulations performed on a Chua-Yang model counterpart
using devices with the same geometries resulted in
lower yield figures. This may be clarified by Fig. 12, where
the nonlinear characteristics seen by the integrating capacitor
are shown for different Monte Carlo trials. It is seen that the
normalized dispersion of the equilibrium points (the
normalization factor is the distance between nominal equilibrium
state positions) is much smaller for Fig. 12(a), corresponding
to the full range model, than for Fig. 12(b), corresponding
to the Chua-Yang model. Hence, larger geometries should be
used for increased yield with the latter model.
In the previous Monte Carlo simulations, global biasing 
voltages (bias stage is also simulated) are used for current 
reference generation. Dispersion due to mismatch among tran- 
sistors of different current sources did not produce critical 
results. Thus, global biasing is a fair approach. Nevertheless, 
when high noise levels are expected at bias voltages, as may 
be the case in DT implementations, an independent current 
reference [25], shown in Fig. 13, can be included in each cell. 
Monte Carlo simulations of an entire CCD system in which
⁴All simulation results referred to in the paper correspond to full device
level simulation on schematics extracted from the net layout and using level
2 transistor models with parameters provided by the foundry. The technology
used is a standard direct wafer writing 5 V 1.6 μm n-well 2-metal 1-poly
CMOS.
Fig. 12. Different Monte Carlo trials for the I-V characteristic at the 
capacitor of a CT current mode integrator in a CCD CNN cell. (a) Full range 
model. (b) Chua-Yang model. 
Fig. 13. Bias and threshold voltage independent current reference
[25].
cell current references have a standard deviation larger than 
5% have shown 100% success. 
4.4. Input-Output Strategy 
Efficient input-output (I/O) interfacing is the strongest obstacle
limiting usability and testability of CNN chips. To make
design feasible for medium complexity circuits, I/O strategies
other than parallel cell loading and/or downloading through
bonding pads must be considered.
When input data is in optical form, using CMOS compatible
photosensor devices [17] allows full parallel cell loading with
no pin-count cost. Output signals provided by photosensors are
in the form of currents, and are thus easily connected to the
proposed current-mode cells. Also, since in most application
cases either x^c(0) or u^c does not convey signal information, only
one photosensor per cell is required.
Regardless of whether the initialization process is parallel
or serial, some control circuitry must be included inside each
cell to isolate this process from the net computation process.
In the fully parallel input case this is handled very easily by
using one global signal, ST, to control the switches in Fig. 7.
The switch labelled Rc in the CT dynamic blocks (Fig. 7(b)
and (c)) is not required in this parallel case (the switch must be
shorted). For the DT cell (Fig. 7(c)), clock signals must be
disabled, and kept at a high state, while ST is high.
Serial cell loading requires more involved control circuitry:
local logic must be included in each cell, and additional control
signals must be employed. Local logic can be implemented by
serial/parallel switches, to avoid noise coupling from switching
digital gates. Serial loading also confronts the designer with
important electrical issues related to the need to maintain
each cell state and input while the remaining cells are initialized.
To reduce errors due to MOS channel charge injection, the
attenuation techniques discussed in Section 4.2 can be used.
Net downloading processes must be performed serially in 
the more general case, and hence, local logic and control 
signals are also required. However, this additional circuitry 
can be basically the same as that used for initialization. Since 
downloading can be performed while the network remains in 
operation (with the help of an additional output replication 
branch), leakage and charge injection errors are not of concern 
in this case. 
For instance, Fig. 14(a) and (b) show the cell schematic and
layout of a CT full range CCD cell for parallel loading and
downloading, including I/O circuitry. By way of example, Fig.
15 depicts a high level diagram of a CNN chip with a cell-by-cell
loading and downloading strategy. Only 8 external pins
are required (6 if digital and analog supplies share the same
pin).
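A cell-by-cell strategy of this kind can be sketched in software as a counter whose value is split into a row address and a column address, each expanded by a P-to-2^P decoder into one-hot select lines. The code below is only an illustration of that addressing idea, based on the counter and decoder blocks labelled in Fig. 15; the function names, the row-major visiting order, and the assignment of the upper bits to rows are assumptions and are not taken from the paper.

```python
def one_hot(value, width):
    """P-to-2**P decoder: exactly one of the 2**width lines is active."""
    return [1 if k == value else 0 for k in range(2 ** width)]


def select_lines(count, p):
    """Split a 2p-bit counter value into row and column select lines."""
    count %= 2 ** (2 * p)             # the counter wraps around
    row, col = divmod(count, 2 ** p)  # upper p bits: row, lower p bits: column
    return one_hot(row, p), one_hot(col, p)


def scan_order(p):
    """Cell visiting order over one full count cycle."""
    order = []
    for count in range(2 ** (2 * p)):
        rows, cols = select_lines(count, p)
        order.append((rows.index(1), cols.index(1)))
    return order


if __name__ == "__main__":
    # A 4 x 4 array (p = 2): each clock pulse selects the next cell.
    for cell in scan_order(2):
        print(cell)
```

Since the address is generated on chip from a clock and a reset signal, the number of external pins does not grow with the number of cells, which is consistent with the small pin count quoted above.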
If input signals are binary, current mode CNN chips can
be tested with digital equipment. For this purpose, input
and output signals must be voltages. Fig. 16(a) shows a
simple binary V-to-I converter, which is used to interface
test equipment with the network input. This circuit is also used
for offset terms and border cell contributions. Output I-to-V
transformation can be done using either a simple CMOS
inverter or the faster current comparator of Fig. 16(b) [26].
4.5. Area Evaluation for Current Mode CNN's
Table II gives the transistor count and total cell area for the
different templates in Appendix A, and for both CT models.
Cascode mirrors with W = 4 μm and L = 3.2 μm are used (bear
in mind that for these dimensions the yield of the implementations
based on the Chua-Yang model is lower than that for the
full range model). For DT implementations the area is slightly
larger than that for the full range CT model, due to the
switches. Network layout is simplified by including in each cell
the global lines for power supplies, biasing, control and data
path. These lines may use metal-2 over the cell area, which
allows a significant increase in pixel density. Connections with
neighbor cells can also be laid out in such a way that cells
interconnect by abutment. This eliminates tedious routing at
the network layout level.
Fig. 14. (a) Complete schematic of a CT FR CCD cell for parallel
loading/downloading. (b) Layout of the cell. Terminal labels: ST, IN, OP, ON,
VCAS, VPBL, VSSA, VPBH, VDDA.
Fig. 15. Architecture of a serial I/O CNN chip with required external
connections. Block labels: CK, RST, 2P modulo binary counter, P:2^P decoders,
VDDD, VSSD.
Although data in Table II do not include the area occupied
by the initialization circuitry, approximate pixel densities
ranging from 60 to more than 160 cells/mm² can be easily
achieved (depending on the particular template) if the full
range model is used.
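The quoted densities follow directly from the cell areas. As a short worked example, the sketch below converts cell area to pixel density and back; the two areas used (5916 μm² and 5460 μm²) are the full range entries for the C.C. detector and noise filtering rows of Table II, and the function names are illustrative. Initialization circuitry is not included, as stated above.

```python
UM2_PER_MM2 = 1_000_000  # 1 mm^2 = 10^6 um^2


def cells_per_mm2(cell_area_um2):
    """Pixel density for a given cell area, ignoring routing overhead."""
    return UM2_PER_MM2 / cell_area_um2


def max_cell_area_um2(density_cells_per_mm2):
    """Largest cell area compatible with a target pixel density."""
    return UM2_PER_MM2 / density_cells_per_mm2


if __name__ == "__main__":
    for name, area in (("C.C. detector", 5916), ("Noise filtering", 5460)):
        print(f"{name}: {cells_per_mm2(area):.0f} cells/mm^2")
    print(f"60 cells/mm^2 -> {max_cell_area_um2(60):.0f} um^2 per cell")
```

The smallest cells therefore exceed 160 cells/mm², while the lower figure of 60 cells/mm² corresponds to a cell area of roughly 16 700 μm².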
4.6. Programmability Issues for Current-Mode Blocks 
Although fixed weight CNN chips can be useful as stand-alone
units for dedicated image processing tasks, programmability
is an important system level feature for general purpose
CNN IC's. In the current-mode domain two different approaches
to programmability can be considered depending on the nature
of the associated controlling signals: discrete programmability,
Fig. 16. (a) One-bit voltage-to-current converter with ±IQ output (+IQ for
Qb = "1", -IQ for Qb = "0"). (b) High resolution current comparator [26].
TABLE II
CELL AREA AND TRANSISTOR COUNT FOR
DIFFERENT TEMPLATES AND CNN MODELS

Template          | Full Dynamic Range, Area (μm²) | Chua-Yang Model, Area (μm²) | Full Dynamic Range, Ttor. Count | Chua-Yang Model, Ttor. Count
C.C. Detec.       | 5916                           | 12691                       | 40                              | 56
Shadow Detec.     | 1136                           | 16533                       | 52                              | 68
Borders Extrac.   | 15471                          | 26291                       | 112                             | 128
Corners Extrac.   | 16381                          | 28212                       | 120                             | 136
Hole Filling      | 10921                          | 24714                       | 76                              | 92
Noise Filtering   | 5460                           | 14258                       | 40                              | 56
where controlling signals are digital, and continuous pro- 
grammability, where controlling signals are analog. Discrete 
programmability can be incorporated in a very simple way, 
by analog multiplexing of current contributions from different 
mirrors. These mirrors can either implement fixed templates 
(with application, for instance, in cases where well-defined 
tasks must be sequentially performed, as in [27] and [28]), 
or be binary-weighted (for more general application). Discrete 
programmability provides ease of controllability and accurate 
results, at the cost of strong area penalty. 
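Discrete programmability with binary-weighted mirrors can be summarized by a simple behavioral model: each digital control bit switches in the output of one mirror whose gain is a power of two times a unit weight, and the selected contributions add at a common node. The sketch below is a behavioral illustration only; the unit weight of 0.25, the sign control, and the function names are hypothetical and do not describe the prototype circuitry.

```python
def programmed_weight(bits, unit_weight=0.25):
    """Weight selected by switch states, least significant bit first."""
    return unit_weight * sum(b << k for k, b in enumerate(bits))


def output_current(i_in, bits, sign=+1, unit_weight=0.25):
    """Sum of the mirror outputs that are switched onto the output node.

    sign = +1 or -1 models the choice between a sourcing and a sinking
    output branch (an assumption made for this illustration).
    """
    return sign * programmed_weight(bits, unit_weight) * i_in


if __name__ == "__main__":
    i_in = 2e-6  # 2 uA input current (assumed)
    for bits in ([1, 0, 0], [0, 1, 0], [1, 0, 1], [1, 1, 1]):
        print(bits, output_current(i_in, bits))
```

Each additional bit of weight resolution adds a mirror branch whose size doubles, which is the source of the area penalty mentioned above.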
For reduced area and continuous weight adjustment, analog
programmability should be considered. A simple way to
achieve analog programmability is to use tunable
transconductors, such as the one shown in Fig. 4(e). Fig. 17 shows a
programmable current mirror using this transconductor. Two
different situations arise depending on whether the transistors
operate in weak or in strong inversion. Analysis for both
Fig. 17. Programmable current mirror using tunable transconductors. 
operating conditions shows the following: 
As can be seen, the dependence is linear for weak inversion;
hence, this case provides larger weight adjustment
ranges. This is illustrated in Fig. 18, which shows the current weight
as a function of IB2/IB for different values of IB1/IB,
where IB is a normalization factor of value 10 nA for weak
inversion (Fig. 18(a)) and 50 μA for strong inversion (Fig. 18(b)). Also,
nonlinearity cancellation is exact in weak inversion due to
the exponential nature of the current-to-voltage characteristics,
while it is only approximate for strong inversion: nonlinearity
in the weak inversion case is less than 1% up to io = IB2,
while the corresponding value for strong inversion is io =
0.13 IB2. Drawbacks of weak inversion are low accuracy, due
to mismatch, and reduced speed. These can be overcome by
using CMOS compatible lateral BJT's [29], which exhibit
exponential characteristics over larger current ranges, with excellent
matching properties [30]. Other alternatives for tunable CMOS
current mirrors are found elsewhere [31], [32].
V. PRACTICAL RESULTS 
Several prototypes have been designed in a 1.6 μm 2-metal
1-poly n-well CMOS technology (within the EUROCHIP
framework). One is a reconfigurable 9 × 9 DT
CNN prototype, conceived as a general purpose vehicle
for CNN chip demonstration and test. Reconfigurability is
achieved via local logic, allowing the circuit to implement the
templates for noise removal, feature extraction (borders and
edges), shadow detection, hole filling, and CCD, on a rectangular
grid with r = 1. The rail current used in the prototype is 10
μA. Switches are implemented using minimum dimension
n-channel transistors. Dummy transistors, also with minimum
dimension, are used for adaptive feedthrough cancellation
[23]. No feedthrough cancellation capacitor is used. The clock
frequency is 5 MHz. A serial loading/downloading process on
a cell-by-cell basis is used. Due to the incorporation of
programmability and reconfigurability features, as well as other
features related to manufacturability evaluation (for instance,
biasing currents can be generated either locally, using a bias
circuit per cell, or globally, using a master bias), the cell dimension
for this prototype is very large: 500 μm × 500 μm.
Fig. 19 shows a measurement of the cell nonlinearity for
this prototype. Inversion in the measured characteristics is due
to the sign convention for the measurement system (HP4145B
semiconductor parameter analyzer). Linearity is excellent (less
than 1% deviation) over the whole input current range, and
Fig. 18. Weight (io/ii) variation with IB2/IB for different values of
IB1/IB in Fig. 17. (a) Transconductors operating in weak inversion, IB = 10
nA. (b) Transconductors operating in strong inversion, IB = 50 μA.
Fig. 19. Cell nonlinearity measured from a silicon prototype.
nonlinearity is quite abrupt. The measured Ro/Ri is 44 × 10³.
On the other hand, measurements of the delay block show
that the proposed feedthrough cancellation technique yields
errors below 0.3% of the rail current, much better than needed
for CNN.
Fig. 20. Empirical results for the DT prototype configured for noise removal:
Four cases with different input images (a, b, c, d). In each case, the top row
corresponds to numerical simulation, and the bottom row to device level
electrical simulation (level-2 models).
Fig. 20 shows experimental results obtained when the DT
prototype is configured for noise removal. These are HSPICE
simulations of layout extracted netlists using level-2 MOST
models. Four different images are considered. For each image,
Fig. 20 shows the corresponding ideal and experimental result.
In this figure different gray levels are assigned to the different
current levels, according to the scale included on the right
in the figure. It is seen that the results provided by the circuit
coincide with those anticipated by the ideal model. Each column
in the figure corresponds to a discrete time instant, starting
from the instant n = 0, on the left, in which the net is
initialized. The noise-to-signal power ratio for all input images
was 1/3. Convergence is achieved in about four clock cycles,
which corresponds to about 1 μs computation time. Similar
results are obtained for the different templates provided by the
DT reconfigurable prototype.
The other two prototypes are continuous-time and fixed
template: one for CCD (1 × 16) and the other for noise removal
(9 × 9). The noise removal prototype uses the full range CT
model with IQ = 2 μA, cascode transistors, and L = 3.2 μm.
Experimental results look very similar to those in Fig. 20,
with a convergence time of about 0.25 μs. Here we include
results only for the CT CCD prototype. This actually consists
of two prototypes: one made according to the Chua-Yang
model and the other using the full range CT model. Parallel
loading/downloading processes through bonding pads are used.
Internal current replicators are used to achieve simultaneous
initialization of both prototypes from the same input bonding
pads. A control signal is used to connect the output bonding
pads to the outputs of either one prototype or the other.
The rail current is 2 μA in both cases, and S = 5 (see Fig.
7(b)) for the Chua-Yang model prototype. Although Monte
Carlo analysis gives lower yield for the Chua-Yang model
prototype when L = 3.2 μm is used, we decided to use this
geometry for both prototypes, to obtain experimental evidence
of the yield reduction. Minimum dimension n-channel switches
are used for cell initialization. Cell areas are close to those






Fig. 21. Monte Carlo results for the CT full range CCD prototype. 
given in Table II. No feedthrough cancellation technique is
required.
Fig. 21 illustrates the transient evolution of the full range
prototype for the input state shown on the left. The correct
final state, shown on the right, is obtained for 30 different
Monte Carlo trials corresponding to random variations of
the technological parameters in the full device level layout
simulations performed. The superposition of the signals obtained
at each cell output for the different trials is shown in the
middle figures. The last two signals in this set are the net
initialization signals. It is seen that convergence to the final
state is achieved in 1.5 μs for the worst case (the nominal value
is 1.1 μs).
VI. CONCLUSIONS 
Current mode provides powerful tools for the analog
implementation of CT and DT CNN's. Large cell densities with
good yield are achieved by using proper models and circuits:
• Full range CNN models, allowing all cell variables to
  have the same variation ranges and, hence, better area
  and power consumption figures.
• Intrinsic functional nonlinearity cancellation, giving
  technology independent circuits and architectures, and
  allowing the simplification of the IC design process.
• Cascode transistors, to reduce static error terms without
  area and speed penalization.
• Careful consideration of electrical issues related to net
  architecture and input/output interfacing.
Table III (templates; columns: Function, templates, D, Input, State, Border
Cells). Functions legible in the extracted text include Noise Removal and Hole
Filling. Legible input/state entries are x^c(0) = image with u^c = do not care;
x^c(0) = +1 with u^c = image; and x^c(0) = image with u^c = image. The
numerical template entries are not recoverable from the extracted text.
For CMOS technology, currents as low as 2 μA can be
used with reasonable yield, if strong inversion is used. Speed
for these low current levels is optimum (about 1 μs for a 16
cell CCD) due to the small device dimensions and the low
impedance of internal nodes. Electrical tunability can be easily
incorporated using electrically parameterized transconductors,
for instance, differential amplifiers. Digital tunability is obtained
directly by switching lines routed to common nodes. Delay templates
are in this way easily incorporated. In summary, the results in
this paper demonstrate the feasibility of high density, high-speed
CNN’s in standard digital technologies; also, the new
CNN mathematical model provided opens new vistas for the
development of CNN IC’s.
APPENDIX 
See Table 111. 
REFERENCES
[1] L. O. Chua and L. Yang, “Cellular neural networks: Theory,” IEEE
Trans. Circuits Syst., vol. CAS-35, pp. 1257-1272, 1988.
[2] L. O. Chua and L. Yang, “Cellular neural networks: Applications,” IEEE
Trans. Circuits Syst., vol. CAS-35, pp. 1273-1290, 1988.
[3] S. Matsui and T. Okumoto, “A two-dimensional segmentation-free
learning recognition system by a cellular automaton array using
eigenvectors of the second moment matrix,” IEICE Trans., vol. E-74, pp.
2432-2440, 1991.
[4] L. O. Chua and P. Thiran, “An analytical method for designing simple
cellular neural networks,” IEEE Trans. Circuits Syst., vol. CAS-38, pp.
1332-1341, 1991.
[5] L. O. Chua and T. Roska, “Stability of a class of nonreciprocal
cellular neural networks,” IEEE Trans. Circuits Syst., vol. CAS-37, pp.
1520-1527, 1990.
[6] J. A. Nossek et al., “Cellular neural networks: Theory and circuit
design,” Int. J. Circuit Theory Applications, 1992.
[7] T. Matsumoto et al., “CNN cloning template: Connected component
detector,” IEEE Trans. Circuits Syst., vol. CAS-37, pp. 633-635, 1990.
[8] T. Matsumoto et al., “CNN cloning template: Hole filler,” IEEE Trans.
Circuits Syst., vol. 37, pp. 635-638, 1990.
[9] T. Matsumoto et al., “CNN cloning template: Shadow detector,” IEEE
Trans. Circuits Syst., vol. 37, pp. 1070-1073, 1990.
[10] J. M. Cruz and L. O. Chua, “A CNN chip for connected component
detection,” IEEE Trans. Circuits Syst., vol. 38, pp. 812-817, 1991.
[11] K. Halonen et al., “VLSI implementation of a reconfigurable cellular
neural network containing local logic,” Int. J. Circuit Theory
Applications, 1992.
[12] H. Harrer et al., “An analog implementation of discrete-time cellular
neural networks,” IEEE Trans. Neural Networks, vol. 3, pp. 466-476,
1992.
[13] A. Rodríguez-Vázquez, R. Domínguez-Castro, and J. L. Huertas,
“Accurate design of analog CNN in CMOS digital technologies,” in Proc.
IEEE Int. Workshop on Cellular Neural Networks and Their Applications,
pp. 273-280, 1990.
[14] R. L. Geiger and E. Sánchez-Sinencio, “Active filter using operational
transconductance amplifiers: A tutorial,” IEEE Circuits and Devices
Mag., vol. 1, pp. 20-32, 1985.
[15] Y. Tsividis et al., “Continuous-time MOSFET-C filters in VLSI,” IEEE
J. Solid-State Circuits, vol. SC-21, pp. 15-30, 1986.
[16] M. Ismail, S. V. Smith, and R. G. Beale, “A new MOSFET-C universal
filter structure for VLSI,” IEEE J. Solid-State Circuits, vol. SC-23, pp.
183-194, 1989.
[17] A. H. Sayles and J. P. Uyemura, “An optoelectronic CMOS memory
circuit for parallel detection and storage of optical data,” IEEE J.
Solid-State Circuits, vol. SC-26, pp. 1110-1115, 1991.
[18] R. S. Muller and T. I. Kamins, Device Electronics for Integrated Circuits.
New York: Wiley, 1986.
[19] E. A. Vittoz, “The design of high performance analog circuits on digital
CMOS chips,” IEEE J. Solid-State Circuits, vol. SC-20, pp. 657-665,
1985.
[20] J. B. Hughes, “Switched currents - A new technique for analog
sampled-data signal processing,” in Proc. 1989 IEEE Int. Symp. Circuits
and Systems, pp. 1584-1587, 1989.
[21] M. J. M. Pelgrom et al., “Matching properties of MOS transistors,”
IEEE J. Solid-State Circuits, vol. SC-24, pp. 1433-1440, 1989.
[22] C. Eichenberger, Charge Injection in MOS-Integrated Sample-and-Hold
and Switched-Capacitor Circuits. Hartung-Gorre Series in
Microelectronics, vol. 3, 1989.
[23] S. Espejo, A. Rodríguez-Vázquez, R. Domínguez-Castro, and J. L.
Huertas, “An adaptive scheme for feedthrough cancellation in
switched-current techniques,” in Proc. IEEE 1992 Midwest Symp. Circuits
and Systems, 1992.
[24] C. Michael and M. Ismail, “Statistical modeling of device mismatch
for analog MOS integrated circuits,” IEEE J. Solid-State Circuits, vol.
SC-27, pp. 154-165, 1992.
[25] R. Gregorian and G. C. Temes, Analog MOS Integrated Circuits for
Signal Processing. New York: Wiley, 1986.
[26] R. Domínguez-Castro, A. Rodríguez-Vázquez, and J. L. Huertas, “High
resolution CMOS current comparators,” in Proc. 1992 European
Solid-State Circuits Conf., pp. 242-245, 1992.
[27] G. Eros et al., “Optical tracking system for automatic guided vehicles
using cellular neural networks,” in Proc. Second Int. Workshop on
Cellular Neural Networks and their Applications, pp. 216-221, 1992.
[28] L. O. Chua et al., “Some novel capabilities of CNN: Game of life and
examples of multipath algorithms,” in Proc. Second Int. Workshop on
Cellular Neural Networks and their Applications, pp. 276-281, 1992.
[29] E. A. Vittoz, “MOS transistors operated in the lateral bipolar mode and
their application in CMOS technology,” IEEE J. Solid-State Circuits,
vol. SC-18, pp. 273-279, 1983.
[30] T. Pan and A. A. Abidi, “A 50-dB variable gain amplifier using the
parasitic bipolar transistors in CMOS,” IEEE J. Solid-State Circuits,
vol. SC-24, pp. 951-961, 1989.
[31] K. Bult and H. Wallinga, “A class of analog CMOS circuits based on the
square-law characteristics of an MOS transistor in saturation,” IEEE J.
Solid-State Circuits, vol. SC-22, pp. 357-365, 1987.
[32] E. A. Klumperink and E. Seevinck, “MOS current gain cells with
electronically variable gain and constant bandwidth,” IEEE J. Solid-State
Circuits, vol. SC-24, pp. 1465-1467, 1989.
Angel Rodríguez-Vázquez (M’80) received the Licenciado en Física degree in
1977, and the Doctor en Ciencias Físicas degree in 1983, both from the
University of Seville, Spain.
Since 1978 he has been with the Department of Electronics and
Electromagnetism at the University of Seville, where he is currently an
Associate Professor. He is also with the Department of Analog Circuit Design
of the Spanish Microelectronics Center (Centro Nacional de
Microelectrónica). His research interests are in analog/digital integrated
circuit design, including neural, fuzzy and chaotic circuits, linear and nonlinear
signal processing VLSI circuits, and computer-aided design and modeling of
analog integrated circuits.

Servando Espejo received a five-year degree in electronic physics (Licenciado
en Física Electrónica) and the M.S. equivalent in microelectronics from the
University of Seville, Spain, in June 1987 and July 1989, respectively.
From 1987 to 1989 he was a research assistant at the Department of
Electronics and Electromagnetism of the University of Seville. From November
1989 to August 1991 he was an intern in AT&T Bell Laboratories at Murray
Hill, NJ, and an employee of AT&T Microelectronics of Spain. He is currently
a teaching assistant at the Department of Electronics and Electromagnetism
of the University of Seville, where he is working towards a Ph.D. degree in
the field of cellular neural networks VLSI implementation. His main areas of
interest are linear and nonlinear analog and mixed-signal integrated circuits,
including neural networks electronic realizations and theory, chaotic circuits,
and communication systems.

Rafael Domínguez-Castro received a five-year degree in electronic physics
(Licenciado en Física Electrónica) and the M.S. equivalent in
microelectronics from the University of Seville, Spain, in June
1987 and July 1989, respectively.
From 1987 to 1990 he was a research assistant at the Department of
Electronics and Electromagnetism of the University of Seville, where he is
currently a teaching assistant and working towards a Ph.D. degree.
His research interests are in the fields of analog/digital integrated circuit
design, analog integrated circuits.

Jose L. Huertas (M’7&SM’91) received the Licenciado en Física degree in 1969
and the Doctor en Ciencias Físicas degree in 1973, both from
the University of Seville, Spain. From 1970 to
1971 he was with the Philips International Institute,
Eindhoven, The Netherlands, as a postgraduate student.
Since 1971 he has been with the Department of
Electronics and Electromagnetism at the University
of Seville, where he is currently a Professor. He
is also the head of the Department of Analog
Circuit Design of the Spanish Microelectronics Center (Centro Nacional de
Microelectrónica). His research interests are in the fields of multivalued logic,
sequential machines, analog circuit design, and nonlinear network analysis
and synthesis.

Edgar Sánchez-Sinencio (S’72-M’74-SM’83-F’92): for a photograph and
biography please see page 155 in this issue.
