Smart-Pixel Cellular Neural Networks in Analog Current-Mode CMOS Technology by Espejo Meana, Servando Carlos et al.
IEEE JOURNAL OF SOLID-STATE CIRCUITS. VOL. 29, NO. 8, AUGUST 1994 895 
Smart-Pixel Cellular Neural Networks in 
Analog Current-Mode CMOS Technology 
S. Espejo, A. Rodríguez-Vázquez, Member, IEEE, R. Domínguez-Castro, J. L. Huertas,
and E. Sánchez-Sinencio
Abstract-This paper presents a systematic approach to design
CMOS chips with concurrent picture acquisition and processing
capabilities. These chips consist of regular arrangements of
elementary units, called smart pixels. Light detection is made
with vertical CMOS-BJT’s connected in a Darlington structure.
Pixel smartness is achieved by exploiting the Cellular Neural
Network paradigm [1], [2], incorporating at each pixel location
an analog computing cell which interacts with those of nearby
pixels. We propose a current-mode implementation technique
and give measurements from two 16 × 16 prototypes in a
single-poly double-metal CMOS n-well 1.6-μm technology. In addition
to the sensory and processing circuitry, both chips incorporate
light-adaptation circuitry for automatic contrast adjustment. They
obtain smart-pixel densities up to 89 units/mm², with a power
consumption down to 105 μW/unit and image processing times
below 2 μs.
I. INTRODUCTION 
COMMON architectures for image-processing systems use a front-end sensory plane with digital-encoding of the
pixel values, and serial transmission of these digital data for 
subsequent processing using either ASIC’s or general-purpose 
computers. Contrary to this approach, smart-pixel chips [3] 
incorporate an analog computing cell at each sensory point, 
achieving high speed and low area occupation in the combined 
sensory/processing functions by fully exploiting parallelism.
The combined spatial distribution of sensory and processing 
circuitry eliminates the time required for data transmission 
from the sensory to the processing plane during the image 
acquisition process. In addition, in some image-processing 
applications, the relevant information contained in the output 
image can be described by a reduced number of variables, 
allowing a fast downloading of the results for subsequent 
evaluation. 
CMOS technologies offer unique features for the design 
of smart-pixel chips. On one hand, MOS transistor operation 
under normal biasing in strong inversion is not drastically 
affected by incident light; on the other, photosensitive CMOS 
devices can be built by exploiting the many junction devices 
available in CMOS technologies [4]. However, previous approaches
to CMOS design of smart-pixel chips lack generality, as they rely
on implementation methods suitable for specific applications. In
some cases, the processing task performed at each pixel does not
imply collective computation [3], while most of the approaches
for "pixel-smartness" are based on active implementations of
resistive-grid networks [5], [6].
Manuscript received December 1993; revised April 6, 1994.
S. Espejo, A. Rodríguez-Vázquez, R. Domínguez-Castro, and J. L. Huertas
are with the Centro Nacional de Microelectrónica-Universidad de Sevilla,
Edificio CICA, C/Tarfia s/n, 41012-Sevilla, Spain.
E. Sánchez-Sinencio is with the Department of Electrical Engineering,
Texas A&M University, College Station, TX 77843 USA.
The paradigm of Cellular Neural Networks (CNN) [1], [2]
is a very suitable framework for systematic design of parallel
sensory-processing chips. On one hand, CNN’s consist of
regular arrangements of cells, topologically identical to
smart-pixel chips. On the other, their cells are only locally connected,
and thus, require simple routing. Also, the vast body of litera- 
ture on CNN theory and applications demonstrates outstanding 
features of this paradigm for array-processing [7]. In particular, 
resistive grids have recently been demonstrated as a particular 
CNN class [8]. 
No experimental smart-pixel CNN chips have been reported 
to date. This paper outlines a design approach using Darlington 
phototransistors and current-mode processing circuitry. It is 
based on a modified version of the original CNN model which
enables optimum speed/power and area occupation in VLSI
design [9], [10]. The sensors include an automatic adjustment
circuitry which ensures proper behavior under different
illumination conditions. Our proposals are demonstrated via two
working smart-pixel chips, in a single-poly, 1.6-μm, n-well
CMOS technology. In addition to their optical input, these
chips exhibit much better area and speed/power figures than
previous CNN implementations [11], [12].
Section II describes some general aspects of smart-pixel
chips, and Section III outlines the proposed computation
algorithm. Sections IV and V discuss the sensory and processing
circuitry, respectively, and the experimental prototypes are
described in Section VI.
II. SMART-PIXEL CHIPS
In this paper, pixel denotes the elementary sensory unit 
used to detect pointwise light signals. These sensory units are 
realized in CMOS technology using any compatible junction 
device to generate a current whose value is an increasing 
function of the light intensity [3], [13], [14]. The acquisition of
two-dimensional scenes requires pixels arranged onto regular
grids, as shown in Fig. 1. Each pixel in this sensory plane
generates a current $I_c$ which codifies a corresponding point of
the input image, where the index $c = (i, j)$ indicates the pixel
at the $i$th row and $j$th column on the grid and varies over the
whole grid domain $GD$ ($c \in GD$). Thus, the whole image is
captured into a matrix of currents $[I_c]$.
IEEE Log Number 4402 142.
0018-9200/94$04.00 © 1994 IEEE
Authorized licensed use limited to: Universidad de Sevilla. Downloaded on March 20,2020 at 15:33:44 UTC from IEEE Xplore.  Restrictions apply. 
Fig. 1. Illustrating the core architecture of smart-pixel chips. 
Fig. 1 illustrates the architecture of smart-pixel chips: each
unit (also called smart-pixel or cell) senses a point of the input
image and interacts with the other units in the arrangement to
perform parallel-processing tasks on the input current matrix
$[I_c]$.
Smart-pixel chips are of strong practical interest for pattern
recognition problems, to detect features of the input image. For
example, Fig. 2 illustrates the task of detection of connected
components (DCC), which consists of counting the number of
connected pieces encountered by scanning an input image in
a given direction [15]. Pattern recognition can be realized by
processing the data obtained after performing this task in the
directions shown in Fig. 2 [16], [17]. This data is contained in
a few rows and columns at the grid borders. In addition to their
usage for preprocessing tasks, smart-pixel chips are also useful
as stand-alone units for nonintensive computation tasks such
as halftoning [18], motion detection [19]-[21], range-finding
[3], etc.
Fig. 2. Connected component detection in four different directions.
III. THE CNN PARALLEL PROCESSING PARADIGM
As Fig. 1 illustrates, smart-pixel CNN chips consist of 
regular arrangements of identical units, each including a 
photosensor and an analog computing cell. Such an entity 
transforms the input image $[I_c]$ into an output matrix $[y_c]$
via a dynamic process of interactions among the computing
cells. The distinctive feature of the CNN paradigm is that these
interactions are local, limited for each cell to a reduced set of
neighbors, located within a distance $r$ in the grid. In particular,
there is a wide catalog of image processing tasks available
for networks where parameter $r$ (called neighborhood radius)
is unity, which is very appealing for VLSI implementations because
connection among units is made by abutment, requiring no
extra routing.
The dynamic computation process of CNN's, as proposed
in [1], involves three variables per cell: (a) cell state: $x_c(t)$,
which conveys cell energy information as a function of time;
(b) cell output: $y_c(t)$, obtained from the cell state via a
soft-limiter piecewise-linear transformation,
$$y_c = f(x_c) = \frac{1}{2}\left(|x_c + 1| - |x_c - 1|\right) \qquad (1)$$
drawn in Fig. 3(a); and (c) cell external-input: $u_c$. Processing
itself is governed by a set of coupled nonlinear differential
equations, one per cell. We use equations that differ from those
originally proposed by Chua-Yang [1], and which enable the
optimization of the speed/power ratio and area occupation of
VLSI CNN chips. The proposed equations are given by [9],
[10]:
$$\tau \frac{dx_c}{dt} = -g[x_c(t)] + D_c + \sum_{d \in N_r(c)} \{A_{cd}\, y_d(t) + B_{cd}\, u_d\}, \quad \forall c \in GD \qquad (2)$$
where $g(\cdot)$ is a nonlinear dissipative term defined as,
$$g(x_c) = \begin{cases} m(x_c + 1) - 1, & x_c < -1 \\ x_c, & \text{otherwise} \\ m(x_c - 1) + 1, & x_c > 1 \end{cases} \qquad (3)$$
where $m > 1$ is a parameter of the model. Function $g(\cdot)$ is
drawn in Fig. 3(b). Summations in (2) extend over the
neighborhood of the $c$th cell, denoted by $N_r(c)$, which contains
adjacent cells located within a distance $r$ in the grid, and
includes cell $c$ itself.
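As a minimal sketch, the two cell nonlinearities of (1) and (3) can be written as plain Python functions: the soft limiter $f$ saturates at ±1, and the dissipative term $g$ has unity slope inside [-1, 1] and slope $m$ outside. The function names and the sample value m = 10 are illustrative assumptions, not taken from the paper.

```python
def output_limiter(x):
    """Soft limiter of (1): y = 0.5 * (|x + 1| - |x - 1|)."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def dissipative_term(x, m):
    """Nonlinear loss of (3): slope 1 inside [-1, 1], slope m outside."""
    if x < -1.0:
        return m * (x + 1.0) - 1.0
    if x > 1.0:
        return m * (x - 1.0) + 1.0
    return x


if __name__ == "__main__":
    # m = 10 is an arbitrary illustrative value.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, output_limiter(x), dissipative_term(x, m=10.0))
```

A larger `m` makes excursions of the state beyond ±1 increasingly costly, which is the mechanism the model uses to confine the state.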
Processing tasks performed by CNN's are determined by
the convergence of (2) to binary ($y_c = \pm 1, \forall c \in GD$)
equilibrium states following the transient initialized by $[x_c(0)]$,
driven by $[u_c]$, and under the boundary conditions imposed
by cells at the net border. Depending on the application, the
current $I_c$ generated at each cell's photosensor is used as
initial value of the state variable $x_c(0)$ or as external input
$u_c$. In the latter case, the initial states are usually set to a
constant value. The outcome of the task depends on parameters
$B_{cd}$, $A_{cd}$, and $D_c$ of (2), called control, feedback, and offset
parameters, respectively, and on the boundary conditions.
Fig. 3. CNN cell nonlinearities. (a) Output nonlinearity; (b) dissipative term.
The control and feedback parameters can be arranged into 
matrices, which provide a pictorial view of the interactions 
within each cell’s neighborhood. For uniform networks these 
matrices are invariant throughout the grid domain-they are 
templates. The functionality of uniform CNN’s is determined 
by its control, B, and feedback, A, template matrices, and its 
offset parameter, D. For illustration purposes, Table I sum- 
marizes the templates used for some significant preprocessing 
tasks. 
TABLE I
SOME CNN TEMPLATES
(Columns: Application, A, B, D. The template entries are not legible in this copy; the applications listed are:)
Noise Filtering
Hole Filling [28]
Convex Corners Extraction [2]
Borders Extraction [2]
Connected Component Detection [15]
Shadow Creation [29]
To guarantee correct operation of smart-pixel CNN chips,
an important mathematical issue is to determine conditions of
the template parameters that yield convergence of the output
matrix $[y_c]$ to binary states for any input. Such a mathematical
analysis for the model proposed in this paper, given by (2)
and (3), is out of this paper’s scope and has been reported
elsewhere [9] for any $m \geq 1$. Our circuits use the particular
case $m \to \infty$, in which the nonlinear dissipative term forces
the state variable $x_c$ to remain within the interval $[-1, 1]$.
Consequently, $x_c(t) = y_c(t)$, and the implementation of the
nonlinear operator in (1) is not required.
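The limiting case $m \to \infty$ can be illustrated with a small forward-Euler sketch of the cell dynamics on a grid with unity neighborhood radius. It assumes that the infinite-slope loss can be modelled by clipping each state to [-1, 1] after every step, so that state and output coincide. The function name, step size, boundary value, and the self-feedback template in the demo are hypothetical choices made only so the result is easy to check; the demo template is not one of the entries of Table I.

```python
def simulate_full_range(x0, u, A, B, D, tau=1.0, dt=0.05, steps=400, boundary=0.0):
    """Forward-Euler integration of the cell dynamics with m -> infinity.

    x0, u : 2-D lists (rows x cols) of initial states and external inputs.
    A, B  : 3x3 feedback and control templates (neighborhood radius 1).
    D     : offset term, taken equal for all cells.
    The infinite-slope loss is modelled by clipping each state to [-1, 1].
    """
    rows, cols = len(x0), len(x0[0])
    x = [row[:] for row in x0]

    def at(grid, i, j):
        # Cells outside the array take a fixed boundary value (an assumption).
        if 0 <= i < rows and 0 <= j < cols:
            return grid[i][j]
        return boundary

    for _ in range(steps):
        new_x = [[0.0] * cols for _ in range(rows)]
        for i in range(rows):
            for j in range(cols):
                drive = D
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        drive += A[di + 1][dj + 1] * at(x, i + di, j + dj)
                        drive += B[di + 1][dj + 1] * at(u, i + di, j + dj)
                # Inside [-1, 1] the loss term reduces to g(x) = x.
                step = x[i][j] + dt / tau * (-x[i][j] + drive)
                new_x[i][j] = max(-1.0, min(1.0, step))
        x = new_x
    return x


if __name__ == "__main__":
    # Hypothetical template: self-feedback of 2, no coupling, no input.
    A = [[0, 0, 0], [0, 2, 0], [0, 0, 0]]
    B = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
    x0 = [[0.2, -0.3], [-0.1, 0.4]]
    u = [[0.0, 0.0], [0.0, 0.0]]
    print(simulate_full_range(x0, u, A, B, D=0.0))
```

With this demo template each cell obeys dx/dt = x inside the linear region, so every state grows away from zero and saturates at the binary value matching the sign of its initial condition.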
IV. SENSORY CIRCUITRY 
A. Photosensors 
The simplest photosensitive devices for CMOS n-well tech- 
nologies are reverse-biased photodiodes, formed either directly 
between n+-diffusion and substrate [3] or between well and 
substrate [13]. Current level for both devices is an increasing 
function of the junction area. In particular, we have measured 
currents up to 20 nA for well-substrate photodiodes with well
area of 100 × 100 μm², in a 1.6-μm single-poly technology,
under environmental laboratory lighting. This current level
increases significantly using a vertical CMOS-BJT as
photosensor. Fig. 4(a) shows a conceptual layout and cross-section
for this device, whose current is approximately proportional
to the area of the well/substrate junction, $A_w$ in the figure.
Current generated by this device is $\beta + 1$ times larger than
that of a photodiode with the same well area
$$I_T \simeq (\beta + 1) I_W \qquad (4)$$
where $I_T$ denotes the phototransistor current, $I_W$ is the
corresponding current for the well-diode, and $\beta$ is the transistor
current-gain; measured $\beta$ for this technology is 37.7 ± 0.8,
basically independent of transistor geometry [22].
We have measured currents up to 430 ± 30 nA (under normal
laboratory illumination) for phototransistors with passivated
well area of 60 × 60 μm². Consequently, and since current
and well area are approximately linearly related, this extrapolates
to current levels of about 20 nA for minimum area devices (13.6 ×
13.6 μm²), as needed for increased density smart-pixel chips.
However, for some critical tasks [7] these levels provided by
minimum photosensors may not be large enough to guarantee
the matching level required by the signal processing circuitry,
thus requiring some amplification. Simplest strategies use
either larger wells or cascaded current amplifiers, which are very
costly in terms of area occupation and, for the latter, inaccurate.
Instead, we use an additional vertical BJT to achieve Darlington
amplification by a factor of $\beta + 1$, with practically no area
overhead. Fig. 4(b) shows the conceptual layout and cross-section
for this Darlington phototransistor. Current for this
device is
$$I_c \simeq (\beta + 1) I_T \simeq (\beta + 1)^2 I_W \qquad (5)$$
while its area occupation is scarcely increased by that of
a minimum-size vertical BJT. Measurements with $A_w$ in
Fig. 4(b) equal to 60 × 60 μm² result in currents up to
18 ± 2 μA.
Fig. 4. (a) CMOS compatible vertical p-n-p transistor. (b) Darlington
configuration of two vertical p-n-p transistors.
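The photocurrent figures quoted in this subsection can be cross-checked with a few lines of arithmetic. The sketch below assumes, as the text states, that current scales linearly with well area and that the Darlington stage multiplies the single-transistor current by β + 1. The function and variable names are illustrative only.

```python
BETA = 37.7        # measured current gain quoted in the text
I_T_60 = 430e-9    # A, single phototransistor with a 60 x 60 um^2 well


def scale_by_area(current, side_from_um, side_to_um):
    """Scale a photocurrent assuming it is proportional to well area."""
    return current * (side_to_um / side_from_um) ** 2


def darlington_current(i_single, beta=BETA):
    """Darlington output estimated as (beta + 1) times the single-BJT current."""
    return (beta + 1.0) * i_single


# Single phototransistor shrunk to the minimum 13.6 x 13.6 um^2 well.
i_min = scale_by_area(I_T_60, 60.0, 13.6)
# Darlington pair built on the 60 x 60 um^2 well.
i_darlington = darlington_current(I_T_60)

if __name__ == "__main__":
    print(f"minimum-area phototransistor: {i_min * 1e9:.1f} nA")
    print(f"60 x 60 um^2 Darlington:      {i_darlington * 1e6:.1f} uA")
```

The first value comes out near 22 nA, consistent with the extrapolated level of about 20 nA, and the second falls inside the measured 18 ± 2 μA interval.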
Fig. 5(a) shows the output characteristic measured from a
Darlington phototransistor with $A_w$ = 60 × 60 μm² under
constant environmental illumination (bright-current), while
Fig. 5(b) shows the result obtained when the environment light
is gradually reduced to complete darkness. Dark-current was
215 ± 10 pA, which means that the bright-to-dark current-range
is close to 100 dB for environmental laboratory illumination.
The same range is observed for a single p-n-p device, while
simple photodiodes yield about 80 dB. Although these results
are optimistic in the sense that in real images there will be
no completely dark areas, the bright-to-dark current-ratios
measured provide a wide enough range for data acquisition.
The amplification of the Darlington structure provides
a sufficient current level even if device area is substantially
decreased.
Fig. 5. Measured output characteristics of a Darlington phototransistor with
$A_w$ = 60 × 60 μm²: (a) under constant environment illumination; (b) effect
of gradual reduction of illumination during the sweep of $V_{CE}$.
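The quoted bright-to-dark range can be reproduced from the two measured currents. The sketch assumes the 20·log10 convention for current ratios, which is the one that yields a value close to the 100 dB stated in the text; the function name is illustrative.

```python
import math


def current_ratio_db(i_bright, i_dark):
    """Bright-to-dark current ratio in dB, using the 20*log10 convention."""
    return 20.0 * math.log10(i_bright / i_dark)


# Measured values quoted in the text: 18 uA bright, 215 pA dark.
range_db = current_ratio_db(18e-6, 215e-12)

if __name__ == "__main__":
    print(f"bright-to-dark range: {range_db:.1f} dB")
```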
B. Autozero Strategy 
Although photosensors produce unidirectional current flow, 
double-rail signals are easily obtained by bias-shifting, as 
shown in Fig. 6(a). Current source ITH sets the zero-level of 
the double-rail signal. To guarantee good contrast, its value 
should be set somewhere between the maximum and the 
minimum light-induced currents among all photosensors. If 
lighting conditions for all possible input scenes are uniform 
and known a priori, ITH can be set to a fixed value. In a more 
general case where the chip must handle scenes with different 
lighting conditions, some kind of auto-zero strategy must be 
devised to generate ITH approximately equal to the average 
of the photosensor currents over the whole array. 
Fig. 6. Threshold circuitry for photosensors. (a) Bias-shifting for double-rail
current output; (b) auto-zero circuitry.
A simple, yet convenient, auto-zero strategy uses four 
extra transistors at each sensor. Fig. 6(b) shows the schematic 
of a sensor including the auto-zero circuitry. All p-channel 
transistors have equal size; the same applies to n-channel 
transistors. The low-impedance node labelled SUM is a global 
node, common to all pixels. Note that the current $I_c$ at the $c$th
photosensor is replicated twice. One of the replicas interfaces
the processing circuitry, while the other is routed to the global
node SUM, and aggregated to the remaining sensor currents.
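Thus, calculation of the current $I_{TH}$ through transistor $M_{TH}$
obtains the following
$$I_{TH} = \frac{g_{mn}}{g_{mn} + g_{op}} \cdot \frac{1}{N} \sum_{c \in GD} I_c \qquad (6)$$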
where $g_{op}$ is the output conductance of the p-channel transistor,
$g_{mn}$ is the transconductance of the n-channel transistor, and
$N$ the number of pixels. For simplicity (6) assumes equal
transconductances and conductances for all pixels. The first
factor in (6) reflects the current division performed at node
SUM, while the second corresponds to the gain of the mirror
formed by the parallel combination of the $M_{SUM}$ transistors
and $M_{TH}$. Assuming $g_{mn} \gg g_{op}$, (6) gives $I_{TH}$ equal to the
average of the photosensor currents, and the light-threshold is
automatically adjusted to the average illumination.
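The auto-zero behaviour described above can be sketched numerically. The code assumes the threshold is the product of a current-division factor g_mn / (g_mn + g_op) at node SUM and a mirror gain of 1/N, which is how the text describes the two factors; the function name and the sample conductances and currents are hypothetical.

```python
def auto_zero_threshold(sensor_currents, g_mn, g_op):
    """Threshold current as current division at SUM times a 1/N mirror gain.

    Assumes identical conductances in every pixel, as the text does.
    """
    n = len(sensor_currents)
    division = g_mn / (g_mn + g_op)   # loss at the global SUM node
    mirror_gain = 1.0 / n             # N parallel M_SUM devices driving M_TH
    return division * mirror_gain * sum(sensor_currents)


if __name__ == "__main__":
    currents = [1e-6, 3e-6]           # hypothetical sensor currents, A
    ideal = auto_zero_threshold(currents, g_mn=100e-6, g_op=0.0)
    real = auto_zero_threshold(currents, g_mn=100e-6, g_op=1e-6)
    print(ideal, real)
```

With a transconductance one hundred times the output conductance, the threshold sits about 1% below the true average, which illustrates why the approximation g_mn >> g_op is adequate for contrast adjustment.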
Fig. 7. Conceptual block diagram for the processing circuitry of a CNN
smart pixel.
V. PROCESSING CIRCUITRY 
A. Basic Circuit Building Blocks 
Fig. 7 is a block diagram for the processing circuitry of the
$c$th unit in a smart-pixel CNN chip, according to (2). This
figure shows a core integrator with nonlinear losses and an
output structure to generate weighted replicas of the $c$th input
$u_c$ and state $x_c$, for transmission to the neighbor smart-pixels.
The integrator is driven by weighted replicas of the input and
state signals of the smart pixels in the neighborhood $N_r(c)$,
plus an offset term, obtaining the following signal to drive the
core integrator
$$J_c(t) = D_c + \sum_{d \in N_r(c)} \{A_{cd}\, x_d(t) + B_{cd}\, u_d\}. \qquad (7)$$
Current-mode provides a convenient choice to realize the 
processing circuitry of smart-pixel CNN's. On one hand, it 
enables direct interface with the sensors, whose outputs are 
currents. On the other, current summation at the integrator 
input node is directly achieved by routing signals to a common 
node. Finally, analog operators involved in Fig. 7 (weighted- 
replication, integration, and limitation) are realized by simple 
current mirror circuits. 
Fig. 8(a) realizes the core integrator. Input current $J_c^*(t)$ is
an unnormalized version of $J_c(t)$ in (7), with normalization
factor $I_Q$: $J_c^*(t) = I_Q J_c(t)$. Output current $x_c^*(t)$ is the
corresponding unnormalized version of $x_c(t)$. The parallel
combination of the diode-connected input transistor $M_1$ and
capacitor $C$ yields a time constant $\tau = C/g_m$, where $g_m$ is
the transconductance parameter of $M_1$. On the other hand, note
that current $x_c^*$ cannot swing beyond the values of the current
sources which drive the common output node of transistors
$M_2$ and $M_3$, meaning that $|x_c^*| < I_Q$. Thus, analysis of this
circuit results in:
$$\tau \frac{dx_c}{dt} = -g[x_c(t)] + D_c + \sum_{d \in N_r(c)} \{A_{cd}\, x_d(t) + B_{cd}\, u_d\} \qquad (8)$$
as required to realize (2), and where $g(\cdot)$ is the function defined
in (3) with $m \to \infty$. In practice $\tau$ does not remain constant,
but varies with input current level. However, most processing
tasks tolerate this variation with no degradation of the network
functionality [9].
Fig. 8(b) shows a circuit to realize the output structure of 
Fig. 7 from voltage V,, and using the basic current mirror 
Fig. 8. Current-mode circuit blocks for the processing circuitry of CNN
smart pixels. (a) Core integrator; (b) output structure.
principle of weighted replication [23]. Note that Fig. 8(b)
contains two different substructures to cover each possible sign
of the weight $A_{cd}$. Positive weights are obtained using a single
output transistor whose geometry factor is $|A_{cd}|$ times that of
transistor $M_1$. Thus, a current $A_{cd} x_c^*$ is sourced to the output
node. Negative weights require an additional current mirror
with unity weight for sign inversion.
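The weighted-replication rule just described can be summarised in a short sketch: the output device is scaled by the magnitude of the weight, and a negative weight adds one unity-gain inverting mirror. The function names and the returned dictionary layout are hypothetical; the 4 μm unit width is the transistor width reported later for the DCC prototype and is used here only as an example value.

```python
def replica_plan(weight, unit_width_um=4.0):
    """Describe the output branch needed for one template weight.

    The output transistor is |weight| times wider than the unit device,
    and negative weights need an extra unity-gain mirror for inversion.
    """
    return {
        "output_width_um": abs(weight) * unit_width_um,
        "needs_inverting_mirror": weight < 0,
    }


def replica_current(weight, x_star):
    """Ideal current delivered to a neighbor: weight times the state current."""
    return weight * x_star


if __name__ == "__main__":
    for w in (1, 2, -1):
        print(w, replica_plan(w), replica_current(w, 2e-6))
```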
B. Some Circuit Design Issues 
The following is a brief comment on the dominant nonidealities
encountered in the practical implementation of smart-pixel
CNN chips and associated circuits.
1) Current Gain Error: A major source of error is the finite
ratio of the input conductance $g_{in}$ to the output conductance
$g_o$ of the current mirrors, which causes current gain error due
to spurious current division. It is especially significant at the
input node of the integrator, where the gain error $\epsilon$ is given
approximately by [9]:
$$\epsilon \approx (N + 1) \frac{g_o}{g_{in}} \qquad (9)$$
where $N$ denotes the number of mirrors driving this node, up
to 18 for templates with no zero entries on a rectangular
grid net with unity neighborhood parameter. For improved
$g_o/g_{in}$ figures with short channel devices, cascode mirrors,
regulated mirrors, or a combination of both must be used
[23]. In particular, analysis shows that the cascode mirror of
Fig. 9(a) obtains values of $g_o/g_{in}$ several orders of magnitude
lower than that for single mirrors, with smaller area occupation.
Chips reported in Section VI are realized using these
mirrors, and sized to handle the whole input current range
with minimum distortion and smallest possible devices. For
mirrors biased by a current $I_Q$, we obtain the sizing
equations (10), where $k_n = \mu C_{ox}/2$, $V_{Tn}$ is the threshold
voltage, and $V_{CAS}$
is the cascode voltage, which can be generated as shown in
Fig. 9(b). We assume the same geometries $W_n$ and $L_n$ for
all n-channel transistors in the cascode mirror. $W$ values
for larger currents (associated to weighted replication) are
calculated by imposing the constraint that all transistors have
equal current density. Alternatively, for a given aspect ratio
$W_n/L_n$, (10) establishes a bound for the maximum bias
current of the device.
2) Mismatch error; area, power and reliability: Transistor
geometry ratios, static gain error due to nonnull $g_o/g_{in}$, and
power dissipation increase with $I_Q$. Hence, a bias current as
small as possible should be chosen. The issue is to identify the
minimum feasible rail current value. A lowest limit is certainly
established by leakage, which in our case is increased by light
effects. However, a more restrictive bound exists due to MOS
transistor mismatch and Early voltage ($V_A$) degradation with
channel length.
Mismatch is produced mainly by variations of $V_T$ and
$\beta = \mu C_{ox} W/L$, whose standard deviations $\sigma(V_T)$ and $\sigma(\beta)/\beta$
for devices with equal layout show a component inversely
proportional to the square root of the channel area, and
another proportional to the distance between devices [24].
However, in the technology used and for transistor pairs closer
than about 2.5 mm, the distance-dependent component is
negligible for devices with channel area of less than 100 μm²
[24], larger than the values obtained using (10) for bias
current below ~50 μA and channel lengths of 3.2 μm. Lower
channel lengths have not been considered for several reasons,
like short-channel effects, Early-voltage degradation, and
increased mismatch effects due to the associated low channel
areas. In addition, lower transistor geometries do not result in
appreciable area reductions due to the minimum contact size
(4 μm with surrounding metal and diffusion in the technology
used).
Another important consideration is that for a given $\sigma(V_T)$
and $\sigma(\beta)/\beta$, the ratio $\sigma(I)/I$ in MOS transistors operating in
strong inversion and after pinch-off has an inverse dependency
with $v_{gs} - V_T$. This means that once geometries have been set
to achieve acceptable mismatch levels, bias current cannot be
decreased too far below the bound given by (10), since this
would produce a low $v_{gs}$ voltage at the bias point, with the
corresponding large $\sigma(I)/I$. Hence, mismatch considerations
establish bounds for both minimum area and power trends.
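The inverse dependence of current mismatch on overdrive voltage can be made concrete with a short sketch. It assumes the usual square-law expression for a saturated MOS transistor, in which the relative current spread combines the relative β spread with twice the threshold spread divided by the overdrive. This formula is a standard textbook approximation brought in for illustration, not an equation from the paper, and all numerical values are hypothetical.

```python
import math


def relative_current_mismatch(sigma_vt, sigma_beta_rel, v_overdrive):
    """Relative current spread sigma(I)/I for a square-law MOS transistor.

    Assumes I = (beta / 2) * (v_gs - V_T)**2 with independent V_T and beta
    variations, so the two contributions add in quadrature.
    """
    vt_term = 2.0 * sigma_vt / v_overdrive
    return math.sqrt(sigma_beta_rel ** 2 + vt_term ** 2)


if __name__ == "__main__":
    # Hypothetical spreads: 5 mV threshold mismatch, 1% beta mismatch.
    for v_ov in (0.2, 0.5, 1.0):
        spread = relative_current_mismatch(5e-3, 0.01, v_ov)
        print(f"overdrive {v_ov:.1f} V -> sigma(I)/I = {spread * 100:.2f} %")
```

Lowering the bias current at fixed geometry reduces the overdrive, so the threshold term grows and dominates the spread, which is the trend the paragraph above describes.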
Fig. 9. CMOS bias-shifted mirrors and biasing devices. (a) Cascode mirror
and reference circuitry, (b) cascode voltage generation.
3) Light effects on the processing circuitry: Optical image
acquisition forces the processing circuitry to be exposed to
light, which results in an increase of the leakage currents
at the reverse-biased substrate-diffusion junctions. Unitary 
bias currents must be sufficiently large in order to neglect 
this effect. Also, MOS threshold voltage depends on light 
intensity, increasing the mismatch effect on current mirrors 
and sources. Current mirror transistors are commonly placed 
nearby, and hence light-intensity gradients have a reduced 
effect. On the contrary, current sources in different cells, 
biased by common global voltages, can exhibit larger 
dispersions. The tolerance of a particular application to 
variations in the unitary bias current must be evaluated 
in general, and local references should be used when 
required. 
Fig. 10. Microphotograph of the DCC prototype. 
VI. EXPERIMENTAL RESULTS 
A. 16 × 16 DCC Prototype
The following measurements were taken from a 16 × 16
smart-pixel CNN chip intended for horizontal connected
component detection [15] (see Fig. 2), a basic preprocessing
step for pattern recognition. Fig. 10 shows a microphotograph
of the prototype, which in addition to the smart-pixel array
contains boundary cells, output buffers, bias stages, and
some digital control circuitry for the output image downloading
process. The dimensions of the core array are 1890 ×
1530 μm², and its power dissipation is 27 mW. The total chip
dimensions, including the bonding pads, are 2480 × 2500 μm²,
with a total power dissipation of 42 mW and a total of 24 pins.
Fig. 11 shows the schematic and layout of one elementary
unit. Unit dimensions are 118 × 96 μm², which include the
sensor and associated regulation circuitry (~30% of the area),
the processing circuitry, an additional current replication for
output evaluation, and all required routing (cells are connected
to each other by abutment). The sensor is realized with two
minimum-size p-n-p devices in a Darlington configuration, to
produce a bright-current under laboratory lighting of about
1 μA, large enough for the matching requirements of this
application. Cascode structures are used for both current
sources and mirrors, and except for control switches, all
MOS transistors have W = 4 μm and L = 3.2 μm. Power
dissipation with a 5 V supply and under environmental light
in the laboratory is 105 μW per cell, unitary current being
$I_Q$ = 2 μA.
We have obtained 100% success (out of 30 trials) for
full device level Monte Carlo simulation of this chip. These
Monte Carlo simulations are based on the expected variations
of the threshold voltages $V_{T0}$ and the large signal
transconductance $\beta$ (body effect parameter $\gamma$ influences only the
cascode transistors). Global biasing voltages are used for
current reference generation, and bias stages are included in
the simulation. Dispersion due to mismatch among transistors
of different current sources did not produce critical results.
Thus, global biasing is a fair approach for this application.
Fig. 12(a) illustrates the chip measurement setup and
Fig. 12(b) shows five input images (left column) and the
measured output images (right column). The prototype
was exhaustively tested with 1200 input images. Fig. 12(c)
contains the output waveforms observed from the cells in
a particular row of the array during a processing example.
The input pixels are displayed at the left side column,
while the output ones are at the right. The signals display
the measured transient evolution of the output of the cells
in the row. Measured convergence time is 1.6 μs. Output
image downloading requires 8 μs, using a 2 MHz digital
clock frequency for the serial downloading process. Circuit
operation remains correct, with no speed degradation, if the
voltage supply is reduced from the nominal 5 V down to 2.7 V.
This is another positive consequence of using current-mode
techniques.
Fig. 11. (a) Schematic (refer to text for dimensions) and (b) layout of
elementary unit of DCC prototype.
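The headline figures for the DCC prototype follow from the quoted cell data, as the short check below shows. It assumes that the serial download takes 16 clock cycles, one per line of the 16 × 16 array; that cycle count is an inference from the 8 μs and 2 MHz values, not a statement from the paper. Variable names are illustrative.

```python
CELL_WIDTH_UM = 118.0     # unit dimensions quoted in the text
CELL_HEIGHT_UM = 96.0
CELL_POWER_W = 105e-6     # power per cell at 5 V
N_CELLS = 16 * 16
CLOCK_HZ = 2e6
DOWNLOAD_CYCLES = 16      # assumed: one clock cycle per array line

# Cells per square millimetre (1 mm^2 = 1e6 um^2).
density_per_mm2 = 1e6 / (CELL_WIDTH_UM * CELL_HEIGHT_UM)
# Power of the 256-cell core array.
array_power_w = N_CELLS * CELL_POWER_W
# Time needed to shift the output image out of the chip.
download_time_s = DOWNLOAD_CYCLES / CLOCK_HZ

if __name__ == "__main__":
    print(f"density:       {density_per_mm2:.1f} cells/mm^2")
    print(f"array power:   {array_power_w * 1e3:.2f} mW")
    print(f"download time: {download_time_s * 1e6:.1f} us")
```

The density evaluates to roughly 88 cells/mm², in line with the quoted figure of about 89, and the array power of 26.9 mW matches the reported 27 mW for the core.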
B. 16 × 16 Radon Transform Prototype
This prototype performs the Radon Transform [25] of 16 ×
16 pixels input images. This chip accepts electrical, as well as
optical, input. The processing circuitry is based on a modified
version of (2), where time has been discretized, and the
nonlinearity is hard
$$x_c(n + 1) = \begin{cases} 1, & \text{for } D_c + \sum_{d \in N_r(c)} \{A_{cd}\, x_d(n) + B_{cd}\, u_d\} > 0 \\ -1, & \text{otherwise} \end{cases} \qquad (11)$$
Fig. 12. (a) Measurement set up, (b) five input images and the output
measured from DCC, and (c) measured transient response on a row of cells.
Fig. 13. (a) Schematic of elementary unit and (b) microphotograph of the
Radon Transform prototype.
Fig. 14. Five input images and the corresponding output measured from the
Radon Transform prototype.
Also, this application requires signal-dependent weights [26].
In particular, the weights of the contributions going from a
particular cell $c$ to its neighbors depend on $x_c$. The complete
set of CNN coefficients can be described using unidimensional
templates as follows
$$A = \begin{cases} [0\ 0\ 1], & \text{if } (x_c \geq 0) \\ [1\ 0\ 0], & \text{if } (x_c < 0) \end{cases} \qquad B = [0\ 0\ 0] \qquad D = 0 \qquad (12)$$
which reflect the scaling factors applied to the contributions
of a particular cell to its neighbors.
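The discrete-time, hard-limited update used by this prototype can be sketched for a single row of cells. The code assumes a synchronous update in which each cell becomes +1 when its weighted drive is positive and -1 otherwise, and it uses a fixed one-dimensional template rather than the signal-dependent weights described above, so it illustrates only the update rule and not the Radon Transform algorithm. The function name, the boundary value, and the indexing convention (template entry 0 weights the left neighbor) are assumptions.

```python
def dt_cnn_step(x, u, A, B, D=0.0, boundary=-1.0):
    """One synchronous hard-limiter update of a 1-D row of cells.

    x, u : lists of current states (+1 or -1) and external inputs.
    A, B : 3-entry templates; entry k + 1 weights the cell at offset k.
    Cells outside the row take a fixed boundary value (an assumption).
    """
    n = len(x)

    def at(values, i):
        return values[i] if 0 <= i < n else boundary

    next_x = []
    for c in range(n):
        drive = D
        for k in (-1, 0, 1):
            drive += A[k + 1] * at(x, c + k) + B[k + 1] * at(u, c + k)
        next_x.append(1 if drive > 0 else -1)
    return next_x


if __name__ == "__main__":
    # Fixed template that copies each cell's left neighbor.
    print(dt_cnn_step([1, -1, -1], [0, 0, 0], A=[1, 0, 0], B=[0, 0, 0]))
```

With this fixed template a single update shifts the row pattern one position to the right, which shows how a template entry selects the direction in which a contribution travels.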
Fig. 13(a) shows a simplified schematic of a cell, which uses
pass transistors to realize the delay required in (11) and a
high-resolution current comparator [27] for the hard nonlinearity.
The design technique and the algorithm used in this circuitry are
described in detail in [9]. Fig. 13(b) shows a microphotograph
of the prototype. Cell dimensions are 121 μm × 124 μm, and
the power dissipated by each cell is 1 mW, significantly larger
than for the DCC due to the circuitry used to implement the
hard nonlinearity.
The system contains a number of blocks located in the
periphery of the cell array, such as output buffers, bias stages,
and digital control circuitry dedicated to the uploading and
downloading processes. This additional circuitry, together with
the bonding pads, results in a total system area of 2670 µm ×
2680 µm, and a total system dissipation of 330 mW. The chip
requires a total of 43 pins. This number is significantly higher
than that of the previous prototype due to 16 input pads used
for electrical input image uploading.
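The cell and system figures quoted above can be combined into a simple area and power budget. The short Python sketch below does only that arithmetic. It assumes a 16 × 16 array in which every cell dissipates exactly the quoted 1 mW, so the periphery share is an estimate rather than a measured value, and the variable names are chosen for this example.

```python
# Figures quoted in the text for the Radon Transform prototype.
CELLS = 16 * 16                      # 16 x 16 cell array
CELL_W_UM, CELL_H_UM = 121.0, 124.0  # cell dimensions in micrometres
CELL_POWER_MW = 1.0                  # power per cell
CHIP_W_UM, CHIP_H_UM = 2670.0, 2680.0
CHIP_POWER_MW = 330.0

cell_area_mm2 = CELL_W_UM * CELL_H_UM * 1e-6   # um^2 -> mm^2
cell_density = 1.0 / cell_area_mm2             # cells per mm^2, about 66.6
array_area_mm2 = CELLS * cell_area_mm2         # about 3.84 mm^2
chip_area_mm2 = CHIP_W_UM * CHIP_H_UM * 1e-6   # about 7.16 mm^2
array_power_mw = CELLS * CELL_POWER_MW         # 256 mW (assumes all cells at 1 mW)
periphery_power_mw = CHIP_POWER_MW - array_power_mw

print(f"cell density     : {cell_density:.1f} cells/mm^2")
print(f"array / chip area: {array_area_mm2:.2f} / {chip_area_mm2:.2f} mm^2")
print(f"array share      : {array_area_mm2 / chip_area_mm2:.0%} of chip area")
print(f"array power      : {array_power_mw:.0f} mW of {CHIP_POWER_MW:.0f} mW")
print(f"periphery power  : {periphery_power_mw:.0f} mW (estimate)")
```

Under these assumptions the cell array accounts for roughly half of the die area and about three quarters of the total dissipation.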
Using a 2 MHz digital clock frequency, image processing
time is 8 µs. The serial downloading process also requires 8 µs.
As an example, Fig. 14 shows five input images (left column)
and the corresponding output images measured from the chip
(right column). The complete test of the prototype involved
1200 images.
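The timing figures above translate directly into clock cycles. The Python sketch below works this out and adds a rough estimate of the time spent on the 1200 test images. That estimate assumes processing and downloading run back to back with no overlap and ignores the uploading time, which the text does not quote, so it is only a lower-bound illustration.

```python
CLOCK_HZ = 2e6        # digital clock frequency
PROCESS_S = 8e-6      # image processing time
DOWNLOAD_S = 8e-6     # serial downloading time
TEST_IMAGES = 1200    # images used in the complete test

clock_period_s = 1.0 / CLOCK_HZ                        # 0.5 us
process_cycles = round(PROCESS_S / clock_period_s)     # 16 cycles
download_cycles = round(DOWNLOAD_S / clock_period_s)   # 16 cycles

# Assumption: no overlap between phases, upload time excluded.
per_image_s = PROCESS_S + DOWNLOAD_S
total_test_s = TEST_IMAGES * per_image_s

print(f"clock period      : {clock_period_s * 1e6:.1f} us")
print(f"processing cycles : {process_cycles}")
print(f"download cycles   : {download_cycles}")
print(f"1200-image total  : {total_test_s * 1e3:.1f} ms (processing + download only)")
```

Each phase takes 16 clock cycles, which equals the number of cells along one side of the 16 × 16 array.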
VII. CONCLUSIONS 
Summarizing, this paper has outlined a basic model and
some design issues related to a methodology to design CNN
smart-pixel chips in digital CMOS processes, and has presented
measurements from two working prototypes in a 1.6-µm n-well
CMOS technology. One calculates the number of
connected pieces (DCC) of an input image in the horizontal
direction, and the other evaluates the Radon Transform of
an input image. The DCC chip obtains a density of about 89
smart pixels per mm² (each including sensory, regulation and
processing circuitry), with a power consumption of 105 µW
per smart pixel and image processing times below 2 µs. Area
and speed figures for the RT chip are similar. Although power
dissipation is larger for this prototype, this can be corrected
with a careful design of the current comparator [27].
As compared to previous CNN implementations, the
proposed technique achieves the required synergy between
sensing and processing, and significantly improves area and
speed/power figures. In particular, when compared to previous
chips for the same application [11], [12], the DCC chip,
apart from including sensors at the cells, reduces the area
consumption by a factor of 4, and improves the speed/power
figure by more than one order of magnitude.
These area and power figures, and the fact that connections
among pixels are made by abutment (requiring no extra routing
area), make it reasonable to forecast single-die CMOS chips with
100 × 100 complexity and about 1 W power consumption.
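The 100 × 100 forecast can be checked against the per-pixel figures reported for the DCC chip. The Python sketch below scales the quoted density (89 smart pixels per mm²) and power (105 µW per smart pixel) to 10 000 pixels. It assumes both figures scale linearly with array size and covers the cell array only, leaving out pads and peripheral circuitry.

```python
PIXELS = 100 * 100            # forecast array size
DENSITY_PER_MM2 = 89.0        # DCC smart-pixel density
POWER_PER_PIXEL_W = 105e-6    # DCC power per smart pixel

# Assumption: linear scaling of the per-pixel figures, array only.
array_power_w = PIXELS * POWER_PER_PIXEL_W    # about 1.05 W
array_area_mm2 = PIXELS / DENSITY_PER_MM2     # about 112 mm^2
side_mm = array_area_mm2 ** 0.5               # about 10.6 mm for a square array

print(f"array power: {array_power_w:.2f} W")
print(f"array area : {array_area_mm2:.1f} mm^2")
print(f"array side : {side_mm:.1f} mm")
```

The scaled power of about 1.05 W agrees with the stated figure of about 1 W, and the corresponding array would measure roughly 10.6 mm on a side.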
These designs are mainly oriented towards preprocessing 
tasks which require fixed weights. We feel that there are 
potential application fields for these chips, provided efficient 
integration to massive processors is achieved. For this pur- 
pose, close cooperation between chip designers and system 
developers is necessary. 
ACKNOWLEDGMENT 
The authors wish to thank Ricardo Carmona Galán for his
work on the design of the Radon Transform prototype.
REFERENCES 
[1] L. O. Chua and L. Yang, “Cellular neural networks: Theory,” IEEE Trans. Circuits and Syst., vol. 35, pp. 1257-1272, Oct. 1988.
[2] L. O. Chua and L. Yang, “Cellular neural networks: Applications,” IEEE Trans. Circuits and Syst., vol. 35, pp. 1273-1290, Oct. 1988.
[3] A. Gruss, L. R. Carley, and T. Kanade, “Integrated sensor and range-finding analog signal processor,” IEEE J. Solid-State Circuits, vol. 26, pp. 184-191, March 1991.
[4] E. A. Vittoz, “The design of high-performance analog circuits on digital CMOS chips,” IEEE J. Solid-State Circuits, vol. 26, pp. 657-665, June 1985.
[5] H. Kobayashi, J. L. White, and A. A. Abidi, “An active resistor network for Gaussian filtering of images,” IEEE J. Solid-State Circuits, vol. 26, pp. 738-748, May 1991.
[6] P. C. Yu, S. J. Decker, H. S. Lee, C. G. Sodini, and J. L. Wyatt, “CMOS resistive fuses for image smoothing and segmentation,” IEEE J. Solid-State Circuits, vol. 27, pp. 545-553, April 1992.
[7] T. Roska and J. Nossek, Eds., “Special issue on cellular neural networks,” IEEE Trans. Circuits and Syst.-I and II, vol. 40, March 1993.
[8] B. E. Shi and L. O. Chua, “Resistive grid image filtering: Input/output analysis via the CNN framework,” IEEE Trans. Circuits and Syst. I: Fundamental Theory and Applicat., vol. 39, pp. 531-548, July 1992.
[9] S. Espejo, “VLSI Design and Modeling of CNN’s,” Ph.D. dissertation, University of Sevilla, Spain, April 1994.
[10] A. Rodríguez-Vázquez, S. Espejo, R. Domínguez-Castro, J. L. Huertas, and E. Sánchez-Sinencio, “Current-mode techniques for the implementation of continuous-time and discrete-time cellular neural networks,” IEEE Trans. Circuits and Syst. II: Analog and Digital Signal Processing, vol. 40, pp. 132-146, March 1993.
[11] J. M. Cruz and L. O. Chua, “A CNN chip for connected component detection,” IEEE Trans. Circuits and Syst., vol. 38, pp. 812-817, July 1991.
[12] H. Harrer, J. A. Nossek, and R. Steltz, “An analog implementation of discrete-time cellular neural networks,” IEEE Trans. Neural Networks, vol. 3, pp. 466-476, May 1992.
[13] A. H. Sayles and J. P. Uyemura, “An optoelectronic CMOS memory circuit for parallel detection and storage of optical data,” IEEE J. Solid-State Circuits, vol. 26, pp. 1110-1115, Aug. 1991.
[14] C. Jansson, P. Ingelhag, C. Svenson, and R. Forchheimer, “An addressable 256 × 256 photodiode image sensor array with 8-bit digital output,” in Proc. of ESSCIRC’92, Sept. 1992, pp. 151-154.
[15] T. Matsumoto, L. O. Chua, and H. Suzuki, “CNN cloning template: Connected component detector,” IEEE Trans. Circuits and Syst., vol. 37, pp. 633-635, May 1990.
[16] H. Suzuki, T. Matsumoto, and L. O. Chua, “A CNN handwritten character recognition,” Int. J. Circuit Theory and Applicat., vol. 20, pp. 601-612, New York: Wiley, Sept.-Oct. 1992.
[17] J. C. Bezdek and S. K. Pal, Eds., Fuzzy Models For Pattern Recognition, New York: IEEE Press, 1992.
[18] K. R. Crounse, T. Roska, and L. O. Chua, “Image halftoning with cellular neural networks,” IEEE Trans. Circuits and Syst. II: Analog and Digital Signal Processing, vol. 40, pp. 267-283, April 1993.
[19] T. Roska, T. Boros, P. Thiran, and L. O. Chua, “Detecting simple motion using cellular neural networks,” in Proc. First IEEE Int. Workshop on Cellular Neural Networks and Their Applicat., Budapest, Dec. 1990, pp. 127-138.
[20] C. P. Chong, C. A. T. Salama, and K. C. Smith, “Image motion detection using analog VLSI,” IEEE J. Solid-State Circuits, vol. 27, pp. 93-96, Jan. 1992.
[21] W. Bair, C. Koch, A. Moore, T. Horiuchi, B. Bishofberger, and J. Lazzaro, “Computing motion using analog VLSI vision chips: An experimental comparison among four approaches,” in Proc. Second Int. Conf. on Microelectron. for Neural Networks, Munich, Germany, Oct. 1991, pp. 291, 309.
[22] B. Pérez-Verdú, F. V. Fernández, A. Rodríguez-Vázquez, and J. L. Huertas, “Modeling and characterization of lateral BJT’s in CMOS technologies,” in Proc. of the VI Spanish Congress on Int. Circuits, 1991, pp. 75-80.
[23] C. Toumazou, F. J. Lidgey, and D. G. Haigh, Eds., Analog IC Design: The Current-Mode Approach, London: Peter Peregrinus, 1990.
[24] M. J. M. Pelgrom, A. C. J. Duinmaijer, and A. P. G. Welbers, “Matching properties of MOS transistors,” IEEE J. Solid-State Circuits, vol. 24, pp. 1433-1440, Oct. 1989.
[25] C. W. Wu, L. O. Chua, and T. Roska, “A two-layer radon transform cellular neural network,” IEEE Trans. Circuits and Syst.-II, vol. 39, pp. 488-489, July 1992.
[26] T. Roska and L. O. Chua, “Cellular neural networks with nonlinear and delay-type template elements and nonuniform grids,” Int. J. Circuit Theory and Applicat., vol. 20, pp. 469-481, New York: Wiley, Sept.-Oct. 1992.
[27] R. Domínguez-Castro, A. Rodríguez-Vázquez, and J. L. Huertas, “High resolution CMOS current comparators,” in Proc. 1992 European Solid-State Circuits Conf., Copenhagen, Denmark, Sept. 1992, pp. 242-245.
[28] T. Matsumoto, L. O. Chua, and R. Furukawa, “CNN cloning template: Hole filler,” IEEE Trans. Circuits and Syst., vol. 37, pp. 635-638, May 1990.
[29] T. Matsumoto, L. O. Chua, and H. Suzuki, “CNN cloning template: Shadow detector,” IEEE Trans. Circuits and Syst., vol. 37, pp. 1070-1073, Aug. 1990.
Servando Espejo Meana received the Licenciado
en Física degree, an M.S. equivalent in microelectronics,
and the Doctor en Ciencias Físicas degree
from the University of Seville, Spain, in June 1987,
July 1989, and March 1994, respectively.
From 1989 to 1991 he was an intern at AT&T Bell
Laboratories at Murray Hill, NJ, and an employee
of AT&T Microelectronics of Spain. He is currently
a teaching assistant at the Department of Electronics
and Electromagnetism of the University of Seville,
and with the Department of Analog Circuit Design
of the Spanish Microelectronics Center. His main areas of interest are linear
and nonlinear analog and mixed-signal integrated circuits, including neural
networks electronic realizations and theory, chaotic circuits and
communication systems.
Angel Rodríguez-Vázquez (M’80) received the Licenciado
en Física degree in 1977, and the Doctor
en Ciencias Físicas degree in 1983, both from the
University of Seville, Spain.
Since 1978 he has been with the Department of 
Electronics and Electromagnetism at the University 
and nonlinear networks, and modeling of analog integrated circuits 
Rafael Domínguez-Castro received the five-year
degree in electronic physics (Licenciado en Física
Electrónica) in 1987, the M.S. equivalent in
microelectronics in 1989, and the Doctor en Ciencias
Físicas degree in 1993, from the University of
Seville, Spain.
Since 1987 he has been with the Department of
Electronics and Electromagnetism at the University
of Seville, where he is currently a teaching
assistant. He is also with the Department of Analog
Circuit Design of the Spanish Microelectronics
Center (Centro Nacional de Microelectrónica). His research interests are in
analog/digital integrated circuit design, including neural and fuzzy circuits,
and computer-aided design and modeling of analog integrated circuits.
José L. Huertas received the Licenciado en Física
degree in 1969 and the Doctor en Ciencias Físicas
degree in 1973, both from the University of Seville,
Spain.
From 1970 to 1971 he was with the Philips
International Institute, Eindhoven, the Netherlands,
as a postgraduate student. Since 1971 he has been
with the Department of Analog Circuit Design of
the Centro Nacional de Microelectrónica. His
research interests are in the fields of multivalued
logic, sequential machines, analog circuit design,
and nonlinear network analysis and synthesis.
E. Sánchez-Sinencio, photograph and biography not available at time of
publication.
