Low-Power CMOS Vision Sensor for Gaussian Pyramid Extraction by Suárez Cambre, Manuel et al.
JOURNAL OF SOLID-STATE CIRCUITS, VOL. 6, NO. 1, JANUARY 2007 1
Low Power CMOS Vision Sensor for Gaussian
Pyramid Extraction
M. Suárez, V.M. Brea, J. Fernández-Berni, R. Carmona-Galán, D. Cabello, and
A. Rodríguez-Vázquez, Fellow, IEEE
Abstract
This paper introduces a CMOS vision sensor chip in standard 0.18 µm CMOS technology for
Gaussian pyramid extraction. The Gaussian pyramid provides computer vision algorithms with scale
invariance, which yields the same response regardless of the distance of the scene to the
camera. The chip comprises 176 × 120 photosensors arranged into 88 × 60 processing elements (PEs).
The Gaussian pyramid is generated with a double-Euler switched-capacitor network. Every processing
element comprises four photodiodes, one 8-bit single-slope Analog-to-Digital Converter (ADC), one
Correlated Double Sampling (CDS) circuit, and four state capacitors with their corresponding switches
to implement the double-Euler switched-capacitor network. Every processing element occupies 44 ×
44 µm². Measurements from the chip are presented to assess the accuracy of the generated Gaussian
pyramid for visual tracking applications. Error levels are below 2% of full-scale output (FSO), thus
making the chip feasible for these applications. Also, the energy cost is 26.5 nJ/px at 2.64 Mpx/s,
thus outperforming conventional solutions of imager plus microprocessor unit (MPU).
Keywords
CMOS Vision Sensors, Gaussian Filters, Image Pyramids, Switched-Capacitor Circuits, Per-Pixel
Processing
This work has been funded by ONR N00014-14-1-0355 and Spanish government projects MINECO
TEC2015-66878-C3-1-R & TEC2015-66878-C3-3-R, Junta de Andalucía, Proyectos Excelencia- Conv. 2012 TIC 2338,
EM2013/038 (FEDER), EM2014/012. The authors kindly acknowledge the Univ. of California for the Software
Transfer Agreement UC Case No. 2014-453 and Dr. Steffen Gauglitz for his assistance with the software
framework to extract metrics for visual tracking.
M. Suárez is with Atomos GmbH, Villingen, Germany. Contact: manuel@atomos.com.
V.M. Brea and D. Cabello are with the Centro Singular de Investigación en Tecnoloxías da Información (CITIUS), University
of Santiago de Compostela, SPAIN. Contact: victor.brea@usc.es.
R. Carmona-Galán is with Instituto de Microelectrónica de Sevilla (IMSE-CNM), CSIC, SPAIN.
J. Fernández-Berni and A. Rodríguez-Vázquez are with the University of Seville, SPAIN, and with Instituto de Microelectrónica
de Sevilla (IMSE-CNM), CSIC, SPAIN. Contact: angel@imse-cnm.csic.es.
I. INTRODUCTION
The integration of camera systems for vision applications benefits from performing scene
analysis right at the sensor front-end. Such pre-processing may extract scene features and hence
reduce the amount of data transmitted off the sensor chip for further processing. This is a
relevant characteristic because images contain much redundant data, and data transmission and
storage consume significant energy and area. Also, pre-processing and reduced data transmission
result in increased throughput. In fact, pre-processing is smartly implemented in natural vision
systems [1], [2], a fact that has motivated authors to explore architectures for CMOS imaging
front-ends with per-pixel processing circuitry [3]–[7]. These systems have recently been making the
transition from academic proof-of-concept prototypes to industrial products [8].
Sensory-processing front-end chips with per-pixel processors typically operate as Single
Instruction Multiple Data (SIMD) processors; namely, all processors concurrently run the same
operation on the data captured by the pixel photosensors, thus accelerating computation. Also,
mixed-signal per-pixel processors provide speed advantages with large energy efficiency [9], [10].
As a result, image sensors with embedded mixed-signal processors emerge as suitable candidates
for the front-end of vision systems with optimum SWaP (Size, Weight and Power) figures and
large throughput. Throughout the paper we will use the term CVIS (CMOS VIsion Sensors) to
refer to image front-end devices with embedded analysis capability, and we will retain the term
CIS (CMOS Image Sensors) for conventional image front-ends conceived to deliver just images.
Major points hampering further development of CVIS-SIMD are: i) their outcome may not
be compatible with computer vision software tools, thus limiting their acceptance by system
engineers and integrators; ii) reduced fill-factor when realized in standard 2D technologies; iii)
large pitch, and hence smaller resolution than CIS for a given form factor, again in standard
2D technologies. Nevertheless, the loss of resolution and image quality of CVIS-SIMD are
not insurmountable barriers for vision. Nature also teaches lessons in this regard; for instance,
patients with retinitis pigmentosa see with a small fraction of their photoreceptors alive [11],
which suggests that large pixel counts may not be a must. Indeed, resolutions as low as 32 × 32
pixels suffice to get the gist of complex scenes [12] and have been demonstrated for indoor elderly
care [13]. Also, commercial sensors with low pixel counts (QCIF: 176 × 144) are produced for
machine vision applications [14] and have been demonstrated for adaptive laser welding [15],
among other applications. Reduced fill-factor may be overcome with controlled illumination,
as actually happens in many machine vision applications [16]. Furthermore, many computer
vision algorithms cope with inaccuracies arising during processing [17], [18], thus easing the use
of mixed-signal CVIS-SIMD. As an example, the chip in [19], which runs the earliest stages for
face detection using the algorithm in [17], tolerates processing errors close to 10%. As shown in
Section IV.D, chip measurements in this paper show that inaccuracies in the Gaussian pyramid
are low enough not to be a concern for visual tracking.
Regarding compatibility with computer vision tools, it can be met by aligning the conception
of CVIS-SIMD with standard computer vision procedures [20], particularly by focusing on the
embedding of pre-processing functions customarily used by computer vision system engineers.
This is the case of image pyramids, such as the Gaussian pyramid [21]. Image pyramids
are found at the initial stages of the vision processing chain for a large variety of computer vision
applications and algorithms such as the Scale Invariant Feature Transform (SIFT) and variations
thereof. Their calculation is resource intensive because it involves repetitive operations with the
whole set of image data. As a consequence, the potential benefit of calculating them with
CVIS-SIMDs is huge. CVIS-SIMDs may represent a first step towards embedding complete computer
vision on a single die with vision capabilities into SWaP-sensitive systems such as vision-enabled
wireless sensor networks [22] or unmanned aerial vehicles [23].
From now on we will use the acronym PE (Processing Element) for the elementary cell of
CVIS-SIMD image front-end chips. This paper reports a 0.18 µm CMOS sensory-processing
chip to extract the Gaussian pyramid with per-pixel processing circuitry, ADC and Correlated
Double Sampling (CDS). It contains 176 × 120 3T APSs arranged into 88 × 60 PEs, i.e.,
four photosensing points per PE. Gaussian filtering is realized by using a diffusive, double-Euler,
switched-capacitor grid. The chip operates at 2.64 Mpx/s with an energy consumption of
26.5 nJ/px (0.6 µJ/frame), thus outperforming conventional architectures of imager and MPU by
several orders of magnitude. Measurements show errors below 2% FSO versus a Gaussian pyramid
computed by software [24]; these errors are tolerated by vision applications.
II. GAUSSIAN PYRAMID EXTRACTION
A. Basic Concepts
The scale-space enables computer vision algorithms to give the same response regardless of
the distance between camera and object. A common function for scale-space generation is the
Gaussian filter [25], [26]. The scale-space is a function L(x, y, σ) resulting from the convolution
of a variable-width Gaussian function with an input image I(x, y):
$$L(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}}\, e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}} * I(x, y) \qquad (1)$$
where ∗ is the convolution operator, σ is the width of the Gaussian function, and x, y are the
spatial coordinates of the image.
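As a software illustration of Eq. (1), the following minimal sketch computes one scale L(x, y, σ) of a small image. It is not the chip's method: it assumes a sampled Gaussian truncated at 3σ, replaces the continuous 1/(2πσ²) factor with a discrete normalization, applies the 2-D kernel as two 1-D passes (the Gaussian is separable), and clamps coordinates at the image borders. All function names are hypothetical.

```python
import math

def gaussian_kernel_1d(sigma, radius=None):
    """Sampled 1-D Gaussian of width sigma (pixels), normalized to unit sum."""
    if radius is None:
        radius = max(1, int(math.ceil(3 * sigma)))  # assumption: truncate at 3 sigma
    taps = [math.exp(-(k * k) / (2.0 * sigma * sigma))
            for k in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]

def convolve_rows(image, kernel):
    """Convolve every row with a symmetric kernel, clamping at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(image):
    return [list(col) for col in zip(*image)]

def scale_space(image, sigma):
    """One scale L(x, y, sigma): horizontal pass followed by vertical pass."""
    kernel = gaussian_kernel_1d(sigma)
    horizontal = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(horizontal), kernel))
```

A flat image is left unchanged by this filter, and an impulse far from the borders keeps its total intensity, which are two quick sanity checks for any implementation of Eq. (1).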
The Gaussian pyramid, illustrated in Fig. 1, consists of several scale-spaces arranged into
octaves. Starting from the bottom, the images within each new octave all have one quarter of the
resolution of those in the previous octave. Subsampling is hence performed in the transition from
each octave to the next one. The images contained within each octave are
scales obtained through Gaussian filtering with increasing widths. The width of each new scale
is k times larger than that of the previous one. The range of scale widths is the same for all
octaves, namely, from σ0 to 2σ0. The width σ0 is application-dependent, and as such it could be
selected by the user. Usually three octaves with six scales each suffice [21]. At hardware level,
the issue is to provide accurate widths σi of the Gaussian function.
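The pyramid layout described above can be tabulated with a short sketch. The text does not give the value of k, so the code assumes that the six scales of an octave span σ0 to 2σ0 inclusive, i.e. k = 2^(1/5); other conventions (for instance k = 2^(1/s) as in some SIFT implementations) are equally possible. The value σ0 = 1.6 used in the usage note is purely illustrative, and the function names are hypothetical.

```python
def scale_widths(sigma0, scales_per_octave=6):
    """Widths of the scales in one octave, each k times the previous one."""
    # Assumption: first scale is sigma0 and last scale is exactly 2 * sigma0.
    k = 2.0 ** (1.0 / (scales_per_octave - 1))
    return [sigma0 * k ** i for i in range(scales_per_octave)]

def octave_resolutions(width, height, octaves=3):
    """Image size per octave: each octave has one quarter of the pixels."""
    return [(width >> o, height >> o) for o in range(octaves)]
```

For example, `octave_resolutions(176, 120)` gives the three image sizes 176 × 120, 88 × 60 and 44 × 30, and `scale_widths(1.6)` returns six widths running from 1.6 to 3.2.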
B. Hardware Implementation
The Gaussian function gives the value Iij of each pixel as the solution of a first-order differential
equation under the driving force of the values of the four neighboring pixels along the cardinal
directions, namely,
$$\frac{dI_{i,j}}{dt} = D\,(I_{i+1,j} + I_{i-1,j} + I_{i,j+1} + I_{i,j-1} - 4I_{i,j}) \qquad (2)$$
which is actually the continuous-time heat differential equation [27], with D being the diffusion
coefficient, usually a constant value common to all the pixels in the image space. In the case of
the Gaussian pyramid, D determines the degree of blurring through the expression $\sigma = \sqrt{2Dt}$,
where t is the time. In our case, pixel values are voltages Vij held at state capacitors
of capacitance C, and pixels are connected to their four neighbors through resistive links with
resistance R. In such a case, Eq. (2) transforms into
$$C\,\frac{dV_{i,j}}{dt} = \frac{V_{i+1,j} + V_{i-1,j} + V_{i,j+1} + V_{i,j-1} - 4V_{i,j}}{R} \qquad (3)$$
from which D = 1/RC and the filter width is $\sigma_{RC} = \sqrt{2t/RC}$. Resistance R can be implemented
either through MOS transistors operating in the ohmic region, giving rise to RC networks, or through
switched-capacitor (SC) networks. Fig. 2 illustrates both implementation styles. The former are
inherently more non-linear than the latter. Also, RC networks need sampling mechanisms to stop
the transient evolution of the network and thereby set the width [4]. The non-linearity of active
resistive links and the time uncertainty of sampling mechanisms degrade the accuracy of the
diffusion process in RC networks. These problems can be overcome by emulating resistive links
through switched capacitors, giving rise to the so-called diffusive SC networks.
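The link between the diffusion equation (2) and the Gaussian width σ = √(2Dt) can be checked numerically. The sketch below is a plain software illustration, not a model of the chip circuitry: it assumes a forward-Euler time step (the product D·dt is passed as a single hypothetical parameter `d_dt`, which must stay at or below 0.25 for stability) and zero-flux borders. Starting from a single bright pixel, the spread measured along one axis after n steps should equal √(2·D·dt·n).

```python
import math

def diffuse(grid, d_dt, steps):
    """Iterate Eq. (2) with forward-Euler steps; d_dt is D times the time step."""
    h, w = len(grid), len(grid[0])
    for _ in range(steps):
        new = [row[:] for row in grid]
        for i in range(h):
            for j in range(w):
                north = grid[max(i - 1, 0)][j]      # clamped index: zero-flux border
                south = grid[min(i + 1, h - 1)][j]
                west = grid[i][max(j - 1, 0)]
                east = grid[i][min(j + 1, w - 1)]
                new[i][j] = grid[i][j] + d_dt * (
                    north + south + west + east - 4.0 * grid[i][j])
        grid = new
    return grid

def sigma_along_columns(grid):
    """Standard deviation of the intensity distribution along the column axis."""
    total = sum(sum(row) for row in grid)
    mean = sum(j * v for row in grid for j, v in enumerate(row)) / total
    var = sum((j - mean) ** 2 * v for row in grid for j, v in enumerate(row)) / total
    return math.sqrt(var)
```

With a 21 × 21 grid, one unit of intensity at the center, `d_dt = 0.1` and 10 steps, the measured width comes out as √2, in agreement with σ = √(2Dt) for D·t = 1.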
There are many different SC topologies to run Gaussian filters [28]. Fig. 2(b) and 2(c) display
simple- and double-Euler SC networks in 1D. In both cases an exchange capacitor CE is sampled
by two switches driven by two non-overlapping clock signals φ1 and φ2 (Fig. 2(d)). The Gaussian
pyramid provided by the double-Euler configuration yields better figures of merit than those of
the simple-Euler SC topology when included in the SIFT algorithm [29]. Hence, the double-Euler
is the SC network implemented on the CVIS-SIMD presented in this paper.
Assuming, as in any SC circuit, that transients associated with the ON resistances of the
switches are neglected, that all state capacitors have the same capacitance C, and that CE1 =
CE2 = CE, the equivalent impedance of the double-Euler SC topology is R = Tclk/nCE, where
n is the number of clock cycles and Tclk is the clock period. The resultant σSC, the Gaussian
width of the double-Euler SC topology across the number of clock cycles, becomes:
$$\sigma_{SC} = \sqrt{\frac{2nC_E}{C}} \qquad (4)$$
Eq. (4) can be used to set the Gaussian width by design. However, deviations may arise
during fabrication that depend on the actual devices employed to implement CE and C. It is hence
convenient to extract the on-chip σSC value through measurements. Extracted values might be
used for calibration if needed. The extraction procedure of on-chip σSC for our chip is
addressed in Section IV-B.
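Eq. (4) is simple enough to evaluate directly. The sketch below does so with the first-octave capacitor values quoted later in Section III (CE = 38.5 fF, state capacitor 330 fF); the helper `cycles_for_sigma` and the target width of 1.6 used in the usage note are hypothetical additions for illustration only.

```python
import math

def sigma_sc(n_cycles, c_exchange, c_state):
    """Gaussian width after n clock cycles according to Eq. (4)."""
    return math.sqrt(2.0 * n_cycles * c_exchange / c_state)

def cycles_for_sigma(target_sigma, c_exchange, c_state):
    """Smallest whole number of clock cycles whose width reaches target_sigma."""
    return math.ceil(target_sigma ** 2 * c_state / (2.0 * c_exchange))
```

With both capacitances expressed in fF, `sigma_sc(1, 38.5, 330.0)` evaluates to about 0.48, matching the per-cycle width stated for the first octave, and `cycles_for_sigma(1.6, 38.5, 330.0)` shows that a width of 1.6 would need 11 cycles under the designed values.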
III. CHIP DESIGN
A. Chip Floorplan and Processing Elements
The micrograph at the left in Fig. 3 shows the chip floorplan, consisting of a core array of PEs
surrounded by a split frame buffer. The core array includes 88 × 60 PEs. Each PE comprises:
i) four 3T-APS pixels, so that the spatial resolution for image acquisition is 176 × 120; ii) a
comparator for in-PE A/D conversion; iii) four state capacitors and a CDS circuit, which is also
used as part of the Local Analog Memories (LAMs) to store either the acquired scene or a given
scale across the Gaussian pyramid; and iv) the double-Euler SC network made up of intra- and
inter-PE switches for NEWS connectivity. The inset at the right of Fig. 3 is a close-up of the
PEs, where photodiodes and capacitors of the double-Euler diffusion network are visible.
Per-PE ADC and per-PE CDS, instead of the conventional per-column approach, increase
parallelism. Also, this strategy is favoured by the re-targeting of the architecture proposed herein
to vertical technologies, leading to better performance metrics [30], [31]. Circuit sharing, i.e.,
the use of the same devices for different functions over time, partly compensates for the per-PE
ADC and CDS area overhead. The longer routing from the per-PE ADC and CDS is alleviated by
laying down the frame buffer that stores the results of the A/D conversion in two halves at
the top and bottom of the PE array, which in turn diminishes power consumption.
B. PE Array Configuration
The PE array changes its configuration according to the function realized by the chip. Fig. 4
conveys such configurations. The coordinates in the PE array are indicated within brackets. The
origin of the coordinates is the PE at the top left corner. State capacitors of the double-Euler SC
network in every octave (Ok) are expressed as Cpij_Ok.
The input image and the scales in the first octave are stored at state capacitors (Cpij_O1). As
seen in Fig. 4(a) and Fig. 4(b), since there is only one ADC and one CDS circuit per four pixels and
four state capacitors, image acquisition and scale read-out each take four cycles. State capacitors are
shunted across octaves. Fig. 4(c) shows the configuration during the second octave. In this case,
the state capacitors of a PE are combined into a single one to perform downscaling, which leads
to one state capacitor per CDS and A/D circuit in the PE array. In the third octave, the
state capacitors of four PEs are merged, and again there is a one-to-one state capacitor per CDS
and A/D circuit. The read-out of the input image and the 18 scales resulting from 3 octaves of
6 scales each amounts to 40 A/D conversions of the PE array for the whole Gaussian pyramid.
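The figure of 40 A/D conversions follows from the read-out scheme just described, as the short tally below shows. It assumes, consistently with the text, that every first-octave image (the input image plus six scales) needs four conversion cycles because four pixels share one ADC, and that every scale of the later octaves needs a single cycle. The function name is hypothetical.

```python
def conversions_per_pyramid(octaves=3, scales_per_octave=6, pixels_per_adc=4):
    """Count A/D conversions of the PE array for one full Gaussian pyramid."""
    # First octave: input image + its scales, each read out in pixels_per_adc cycles.
    first_octave = (1 + scales_per_octave) * pixels_per_adc
    # Later octaves: one state capacitor per ADC, so one cycle per scale.
    later_octaves = (octaves - 1) * scales_per_octave
    return first_octave + later_octaves
```

With the default arguments the tally is 7 × 4 + 2 × 6 = 40.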
C. Circuit Implementation
Fig. 5 shows a circuit view of the PE with its time diagram. Table I lists the sizes of the
transistors in Fig. 5. Switches are implemented with NMOS transistors with minimum dimensions.
Circuit sharing is performed with amplifier A1 and capacitors C and Cpij. Every 3T-APS pixel has its
corresponding capacitor Cpij. This is shown in Fig. 5 with the same gray color. Capacitor C runs
CDS and offset-compensated comparison during A/D conversion. Amplifier A1 and capacitors
Cpij are part of the LAMs and CDS circuits. The latter are also part of the state capacitors Cpij_Ok
in the SC network.
The gain stages in the PE are double-cascode topologies. Only one amplifier is included for
CDS and image storing in the LAMs, while two are required in the comparator of the A/D
converter. The amplifier can be configured in two modes of operation, namely IA and IB,
shown in Fig. 6(a) and Fig. 6(b), respectively. In both cases the current can be cut off through
enable ports. Switches driven by enable ports increase their output impedance close to the end
of the operating range of the amplifier, increasing the gain too. Configuration IB consumes up
to 30% less power than IA at the cost of a narrower input range by shunting the port enable_n
to the input voltage Vin (Fig. 6(d)). The bias current of both configurations is set to 1 µA by Vbp
through a wide-swing constant-transconductance bias circuit trimmed with an external resistor
[32], leading to a gain above 60 dB in the voltage range [0.4, 1.3] V under mismatch and
Process-Voltage-Temperature (PVT) variations (Fig. 6(c) collects nominal simulations). Bode plots are
shown in Fig. 6(e).
1) Image Acquisition: The photodiode is an n-well over p-substrate structure in order to
enhance the spectral response at longer wavelengths. The bias current of the source follower
of the 3T-APS is set to 1 µA by M4 through a transconductance circuit with an external resistor.
CDS is included to diminish reset noise and FPN from mismatch [33]. The nominal working
range for the output voltage of the CDS circuit is defined by amplifier A1 in Fig. 5, namely
[0.4, 1.3] V. These are the lower and upper bounds for the voltages at the state capacitors of the
double-Euler SC network.
Fig. 7 shows the CDS topology with its control signals. A similar implementation has been used,
for instance, in [34]. For a given pixel ij, signal φrw_pij is high during the whole acquisition time.
Reset and signal voltages for CDS are sampled at time instants t0 and t1 with signal φacq high. The
CDS output is stored in Cpij, as well as in C′pij and the four exchange capacitors CE connected
to the node nij. Signals φ1_O1_pij and φ1_pij set the initial values in the exchange capacitors used
for intra-PE and inter-PE connections in the double-Euler SC network, respectively.
The CDS is implemented with amplifier A1 in IA mode to support a wide input voltage range.
Enable signal φen_inv1 allows amplifier A1 to be switched off between the two samples at t0 and t1.
Assuming a large enough gain of A1, the CDS output voltage is given by:
$$V_{out_{ij}} = V_{ref} + \frac{C}{C_{p_{ij}}}\left[V_{P_{ij}}(t_0) - V_{P_{ij}}(t_1)\right] \qquad (5)$$
where Vref = 400 mV.
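A short numeric sketch of Eq. (5) makes the purpose of CDS visible: any offset common to the reset sample and the signal sample cancels in the difference. The pixel voltages, the offset and the unity capacitor ratio used below are hypothetical illustration values (the value of C is not given in the text); only Vref = 400 mV is taken from the paper.

```python
V_REF = 0.4  # volts, reference level of Eq. (5)

def cds_output(v_reset, v_signal, c_sample, c_pixel, v_ref=V_REF):
    """CDS output of Eq. (5): v_reset is sampled at t0, v_signal at t1."""
    return v_ref + (c_sample / c_pixel) * (v_reset - v_signal)

def cds_output_with_offset(v_reset, v_signal, offset, c_sample, c_pixel):
    """Same measurement when both samples carry an identical pixel offset."""
    return cds_output(v_reset + offset, v_signal + offset, c_sample, c_pixel)
```

For instance, with a unity capacitor ratio, a reset level of 1.5 V and a signal level of 1.2 V give an output of 0.7 V, and adding the same 50 mV offset to both samples leaves that output unchanged.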
2) Local Analog Memories (LAMs): The LAMs store both the image after CDS and the scales
across the Gaussian pyramid. The LAMs are implemented with amplifier A1, capacitors Cpij,
and switches φwritep, φrdm and φwrite0 (see Fig. 5). Scales across the Gaussian pyramid are stored
and read out in two phases with signal φrw_pij high and φvref_cds low. Both phases are shown in
Fig. 8. During the first phase, voltage Vnij − VQ is held in capacitor Cpij with signal φrdm high,
and φwritep and φwrite0 low. The read-out is performed during the second phase with φrdm low
and φwrite0 and φwritep high, leaving Voutij = Vnij, where Vnij is the voltage at node nij.
3) Comparison for in-PE ADC: Our chip embeds an 8-bit single-slope in-PE ADC. Fig. 9
shows the single-input offset-compensated comparator of the in-PE ADC. Offset compensation
makes the comparator less sensitive to manufacturing variability. Switches are implemented with
NMOS transistors. Their sizes are collected in Table II. Label M15 refers to the four transistors in
the NAND gate of the comparator, which is implemented with complementary logic. Amplifier
A2 is configured as IA, while A3 is in mode IB to cut power consumption, which is further decreased
by the feedback loop between both gain stages. The bottom sampling technique is run with
different delays between signals (Delay1 - Delay3 in Fig. 9).
The comparator works in two phases: reset and comparison. During reset, both the first input
signal and the quiescent point of the first amplifier in the comparator are sampled. This is done
with signals φcomp_rst and φwrite high. The reset phase ends by setting φcomp_rst and φwrite low,
leaving VQ − Voutij across C. Voutij can be either the input image with CDS or a given scale of the
Gaussian pyramid. This voltage is compared to the voltage ramp Vramp during the comparison
phase, which starts with φcomp and φramp_read high, giving Eq. (6) at the output of the second
gain stage. The static power consumption can be cut during reset with φcomp low and φen_comp
high. The comparator takes a falling ramp as input in the comparison phase, with a downfall ∆
of signal Vramp at VOH = 1.3 V to ensure a correct initial state for values of Voutij close to Vdd.
$$V_{out2} = K^{2}\,(V_{ramp} - V_{out_{ij}}) + V_Q \qquad (6)$$
The Voutij - Vramp crossing triggers the End-of-Conversion (EoC) signal to low, enabling the
writing of the digital word given by an 8-bit counter into the assigned frame buffer. The end of
conversion occurs with Vout2 low (see Fig. 9 and Eq. (6)), which in turn cuts off the current in the
first gain stage through a positive feedback loop. The feedback loop also reinforces logic levels.
Voltage and current waveforms in the first amplifier of the comparator (Vout1 in Fig. 9) with and
without the feedback loop, plotted in Fig. 10(a), confirm this statement. Fig. 10(b) and (c) illustrate
the power savings from the feedback loop for two input voltages, corresponding to ADC output
codes 250 and 40, close to the lower and upper parts of the falling ramp. Blue and pink lines are
the currents integrated along the whole ramp in the first and second amplifiers of the comparator.
The comparator without feedback loop consumes 1.65 µW and 1.7 µW for codes 250 and 40,
respectively; the feedback loop leads to 75 nW and 1.65 µW, resulting in large power savings
for the largest ADC output codes.
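The single-slope conversion principle can be summarized with a behavioral sketch. It is an idealized model under several assumptions that the text does not confirm: the ramp falls linearly from 1.3 V to 0.4 V in 255 equal steps, the counter counts up from 0, and the code latched at the End-of-Conversion is the counter value at the first step where the ramp is at or below the input. Comparator offset, the initial downfall ∆, delays and noise are ignored, and all names are hypothetical.

```python
V_HIGH = 1.3   # volts, start of the falling ramp (upper bound of the PE range)
V_LOW = 0.4    # volts, end of the ramp (lower bound of the PE range)
N_BITS = 8

def ramp_voltage(count, v_high=V_HIGH, v_low=V_LOW, n_bits=N_BITS):
    """Ideal falling ramp value at a given counter state."""
    lsb = (v_high - v_low) / (2 ** n_bits - 1)
    return v_high - count * lsb

def single_slope_code(v_in):
    """Counter value latched when the falling ramp first crosses v_in."""
    last = 2 ** N_BITS - 1
    for count in range(last + 1):
        if ramp_voltage(count) <= v_in:   # crossing: End-of-Conversion
            return count
    return last                            # no crossing: saturate at full scale
```

Under these assumptions an input at the top of the range converts to code 0, a mid-range input of 0.85 V converts to code 128, and inputs near 0.4 V yield the largest codes, which is consistent with code 250 lying close to the lower part of the falling ramp.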
4) Gaussian Pyramid Construction: Our double-Euler SC network with NEWS connectivity
yields the Gaussian pyramid. Intra- and inter-PE connections are shown in different gray colors
in Fig. 5. Fig. 11 gives a complete view of both intra- and inter-PE connections.
Downscaling across octaves in the Gaussian pyramid leads to three types of switching blocks
in the SC network, labeled SCA, SCB and SCC in Fig. 11, all of them implemented as NMOS
transistors with minimum dimensions. In addition, one out of four PEs has a slightly different
structure from the other three. Such a PE is shaded and marked with β in Fig. 11. PEs of α
type comprise switching blocks SCA and SCB. PEs of β type contain switching blocks SCA
and SCC. The scales are provided by capacitors Cpij_Ok. Cpij_O1 means any of the 176 × 120
state capacitors in the first octave. Similarly, Cpij_O2 and Cpij_O3 mean any state capacitor in the
second and third octaves, where the resolution is downscaled to 88 × 60 and 44 × 30 pixels,
respectively. Fig. 12 summarizes the states of the control signals across the Gaussian pyramid.
State capacitors Cpij_O1 in the first octave are the combination of MiM structures in the M5-M6
metal layers (Cpij) with capacitors realized with transistors (C′pij) in order to keep dynamic errors
low, leading to Cpij_O1 = 330 fF. Capacitors C′pij are isolated from the SC network during LAM
read-out through signal φread_net, leaving Cpij = 200 fF for these functions (see Fig. 5). Exchange
capacitors in the first octave are set to CE = 38.5 fF and realized with transistors. According
to Eq. (4), the state-to-exchange capacitor ratio yields σSC_O1 = 0.48√n for the scales in the
first octave, with n being the number of clock cycles. Such scales are built with blocks SCA,
SCB and SCC. Blocks SCA run two terms of the Gaussian kernel with NEWS connectivity
through the switches that connect state capacitors within a given PE. The other two terms of
the Gaussian kernel are executed with blocks SCB or SCC, correspondingly providing inter-PE
connectivity of a given state capacitor with its neighbors. As an example, and as seen in Fig.
5, the state capacitor that results from merging Cpij with C′pij into Cpij_O1 within the first
octave is connected to its eastern and southern neighbors through SCA within the PE, while its
northern and western connections comprise blocks SCB in PEs of α type, and blocks SCC in
PEs of β type. Finally, signals φ1 and φ2 in the basic cell of the double-Euler SC network of
Fig. 2 are implemented with signals φ1_O1_pij and φ2_O1 in blocks SCA, φ1_pij and φ2_O1O2 in
blocks SCB, and φ′1_pij and φ′2_O1O2 in SCC. φ1_O1_pij, φ1_pij and φ′1_pij are turned on to initialize
the CE and C′pij capacitors during image acquisition through CDS in every PE with signal φread_net
high, as seen in Fig. 5 and Fig. 7.
The 1/4 downscaling from the first to the second octave occurs by shunting the four state
capacitors Cpij_O1 of the first octave with the 8 intra-PE exchange capacitors CE, giving rise to
larger state capacitors throughout the second octave, Cpij_O2 = 4Cpij_O1 + 8CE for a given PE.
In so doing, signals φ1_O1_pij and φ2_O1 in blocks SCA are always high in the second octave.
Signals φrw_pij, φrw_pij+1, φrw_pi+1j, and φrw_pi+1j+1 are also high to shunt capacitors Cpij in the
PE (see Fig. 5). Signals φ1 and φ2 in the basic cell of the double-Euler SC network of Fig. 2
are now given by the pairs φ1_pij and φ2_O1O2, and φ′1_pij and φ′2_O1O2 in blocks SCB and SCC,
respectively. Signals φ1_pij and φ′1_pij are used to initialize exchange capacitors for the second
octave with blocks SCB and SCC. Also, as seen in Fig. 11, the NEWS connectivity for PEs of
α type is given by two SCB blocks along each direction. Similarly, two SCC blocks along each
cardinal direction are used for PEs of β type. This means that the exchange capacitors for
the second octave now become 2CE. All in all, this leads to σSC_O2 = 0.23√n.
Finally, the 1/4 downscaling from the second to the third octave is carried out in two phases.
During the first phase, the four state capacitors Cpij_O2 of 4 PEs are shunted together through signals
φ1_pij and φ2_O1O2 high in blocks SCB. Subsequently, these signals turn low, disconnecting PEs
of β type from those of α type in every group of 4 PEs. As a consequence, the scales in the third
octave are performed among PEs of β type through blocks SCC, where φ′1_pij and φ2_O3 play
the role of control signals φ1 and φ2 in the basic cell of the double-Euler SC network of Fig. 2.
Initialization of state capacitors is carried out with φ′2_O1O2 high. In this scheme, both exchange
and state capacitors remain the same as in the second octave, so that σSC_O3 = σSC_O2.
D. Peripheral Circuits
1) Gaussian Pyramid Read-Out: The Gaussian pyramid is read out through two frame buffers
laid down at the top and bottom of the PE array, and labeled '1/2 frame buffer' in Fig. 3. Every
register bank is assigned to the corresponding half of the PE array. Splitting the frame buffer in two
halves diminishes routing area.
Fig. 13(a) shows the 1/2 frame buffer. Every PE has two 8-bit registers assigned in the frame
buffer, allowing the read-out and A/D conversion of two pixels at the same time. Such registers
are named A and B in Fig. 13(b). Every frame buffer of the half PE array of 88 columns and
30 rows comprises 352 columns and 15 rows of registers. The 60 registers of a column of 30
PEs are placed in 4 columns of 15 rows each with the sequence ABAB... of Fig. 13(b). As an
example of the read-out procedure, for the first column of PEs of the bottom half array (PEs across
the 30th to the 59th row), the PEs from the 30th to the 44th row are A/D converted in column 0
of the register bank, while the PEs from the 45th to the 59th row are A/D converted in column
2 (both of them in reg. A in Fig. 13(b)). At the same time, the data converted in the previous
cycle are read out of the chip in columns 1, 3... (reg. B in Fig. 13(b)). Signal Reg_select allows
selecting one of the two 8-bit registers, either A or B, to receive the A/D conversion result. Finally,
the 4-bit row decoder and the 9-bit column decoder are NOR MOS decoders with pull-up transistors.
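The frame buffer geometry quoted above can be cross-checked with a few lines of arithmetic. The sketch below only restates numbers from the text (88 × 30 PEs per half array, two registers per PE, 15 register rows); the function names are hypothetical.

```python
def half_frame_buffer_shape(pe_cols=88, pe_rows_half=30, regs_per_pe=2,
                            buffer_rows=15):
    """Return (columns, rows) of 8-bit registers in one half frame buffer."""
    registers = pe_cols * pe_rows_half * regs_per_pe
    return registers // buffer_rows, buffer_rows

def address_bits(n_lines):
    """Minimum number of decoder input bits needed to address n_lines."""
    return (n_lines - 1).bit_length()
```

`half_frame_buffer_shape()` gives 352 columns by 15 rows, and `address_bits` returns 4 for the 15 rows and 9 for the 352 columns, in line with the 4-bit row and 9-bit column decoders.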
The signal EoC from the in-PE comparator enables writing of the digital word generated by
a global counter into the registers, which are implemented with an NMOS transistor at the input
and a PMOS transistor in their feedback loop (Fig. 13(c)). The 8-bit register of a word includes
a tristate buffer at the output, as shown in Fig. 13. The row decoder enables these tristate buffers in
a full row, and all of them write the stored word onto a per-column vertical bus. Another tristate
buffer placed at the end of each column selects the column that must be read. The column tristate
buffer writes the data onto the bus that drives the digital word to a buffer. This buffer reinforces
and drives the 8-bit word to the output paths digou and digod (Digital Output Up/Down).
2) Analog Ramp and Voltage Bias Generation: The analog ramp for the 8-bit single-slope
A/D converter is produced with an 8-bit current-steering D/A converter [35]. The D/A converter
is laid down at the left of the PE array in Fig. 3. The unit current for the D/A converter is
set to 2 µA. The current from the D/A converter is converted to voltage in an external resistor.
The D/A converter also comprises a 5-bit current-steering stage to set up the offset of the ramp.
Finally, the bias voltage generators of the gain amplifiers in the PE are implemented with wide-swing
transconductance amplifiers included on the left side of the die, within the block labeled
'Ana. Ramp' in Fig. 3.
IV. EXPERIMENTAL RESULTS
A. Camera Module Prototype
Fig. 14 shows a camera module prototype composed of three interconnected boards. The first
of them (carrier board) hosts the sensor chip (FPGP). The second board holds a DE0-Nano FPGA
board [36] to control the chip. The last one is a microPC (Raspberry Pi [37]) for visualization
purposes. The optics is a C-mount type 35mm@f1/4 lens. The system is powered at 5 V through
a jack/µUSB plug.
B. On-Chip Gaussian Pyramid
The chip operation depends on the value of the emulated Gaussian filter width, σSC. This is
set at design time through capacitors C and CE with Eq. (4), where n stands for the number of
clock cycles. Nevertheless, σSC may deviate from its designed value in the fabricated chip. Fig. 15
displays the deviations measured from the chip. The black line shows the designed σSC as a function
of the number of clock cycles n. The blue line shows the σSC values of the scale-space, extracted by
iteratively comparing the chip output for each number of cycles n with an ideal scale-space
L(x, y, σ) computed on the image acquired by the chip, and selecting the σ that minimizes the RMSE.
The red line is a polynomial fitted to the measured values. This experimental curve fits Eq. (4)
with exchange capacitor values of CE ≈ 28 fF and CE ≈ 26.5 fF for the first and second octaves,
respectively, instead of the designed value, CE = 38.5 fF. The difference is due to tolerances and
parasitics, which do not compromise chip functionality. It should be noted that both the exchange
capacitors CE and part of the state capacitors, C′pij, are implemented with transistors, while the
other part of the state capacitors, Cpij, are MiM devices (see Fig. 5). Deviations between the
experimental scales and the scales designed with Eq. (4) are below 1% of the full scale, as
illustrated by the right vertical axis in Fig. 15, where the RMSE saturates around 2.5 on a scale
of 255 (1% of FSO). Finally, Fig. 16 further illustrates the outcome of the Gaussian filters
realized by the chip by showing different scales obtained within the first octave.
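The extraction of σSC by RMSE minimization can be illustrated with the following minimal sketch. It is not the measurement software used for Fig. 15: the function names, the NumPy-only separable filter, the edge-replication border handling and the grid of candidate σ values are all assumptions made for this example.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1D Gaussian kernel truncated at 4 sigma and normalized to unit sum."""
    radius = max(1, int(np.ceil(4.0 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()

def gaussian_blur(image, sigma):
    """Separable Gaussian filter with edge replication (assumed border handling)."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(np.asarray(image, dtype=float), radius, mode="edge")
    # Filter along rows, then along columns; "valid" removes the padding again.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def estimate_sigma(chip_output, acquired_image, candidates):
    """Return the candidate sigma (and its RMSE) whose ideal scale L(x, y, sigma),
    computed on the acquired image, is closest to the chip output."""
    errors = [
        np.sqrt(np.mean((gaussian_blur(acquired_image, s) - chip_output) ** 2))
        for s in candidates
    ]
    best = int(np.argmin(errors))
    return float(candidates[best]), float(errors[best])
```

Calling `estimate_sigma` once per number of clock cycles n, with the corresponding chip read-out as `chip_output`, would yield one (n, σ) point of a measured curve such as the blue line in Fig. 15.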
C. Implementation Comparison
The chip generates a Gaussian pyramid of 3 octaves with 6 scales each in 8 ms, including the
time required for A/D conversion. Thus, the chip can provide 125 digitally-encoded
pyramids per second. Each A/D conversion takes 200 µs, and the clock cycle of the
double-Euler SC network is 150 ns. The energy per pixel and the throughput of our chip are
26.5 nJ/px and 2.64 Mpx/s, respectively.
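These figures follow from the array size, the pyramid time and the power consumption. The short calculation below reproduces them as a consistency check; the 70 mW power value is taken from Table III, and the function and variable names are chosen here for illustration only.

```python
def pyramid_metrics(width=176, height=120, pyramid_time_s=8e-3, power_w=70e-3):
    """Derive rate, throughput and energy figures from resolution, time and power."""
    pixels = width * height                       # 21120 px per frame
    energy_per_frame_j = power_w * pyramid_time_s
    energy_per_px_j = energy_per_frame_j / pixels
    return {
        "pyramids_per_s": 1.0 / pyramid_time_s,
        "throughput_mpx_s": pixels / pyramid_time_s / 1e6,
        "energy_per_frame_mj": energy_per_frame_j * 1e3,
        "energy_per_px_nj": energy_per_px_j * 1e9,
        "efficiency_mpx_j": 1.0 / energy_per_px_j / 1e6,
    }

metrics = pyramid_metrics()
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With the default arguments the results round to 125 pyramids/s, 2.64 Mpx/s, 0.56 mJ/frame, 26.5 nJ/px and 37.7 Mpx/J, in agreement with the text and Table III.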
Table III compares these metrics with those of systems where Gaussian pyramids
are obtained through digital signal processing after sensor read-out. Since some of these
systems do not embed image sensors, the energy of conventional CMOS imagers [38], scaled to the
image resolution of the corresponding processor, has been added for a proper comparison.
Energy data in Table III do not include external memory accesses, as they largely depend on the
camera system. Any estimate of them would hence be inaccurate, and would be similar for all the
Gaussian pyramid sensory-processing subsystems, including ours. Our chip is up to four orders of
magnitude better than conventional and low-power MPUs in computing performance per unit energy
(Mpx/J), while its throughput is similar to that of the most efficient competitor.
Table IV further compares the performance of the chip with that of other highly efficient
sensory-processing CVIS chips with per-pixel circuitry. The chip in [6] performs 2D optic flow
estimation. Its PE array evaluates temporal contrast change by subtracting two frames whose gains
are set by a programmable gain amplifier. The chip in [42] runs 3 × 3 convolutions. The chip in
[43] performs general-purpose low-level image processing. Finally, the chip in [44] performs
background subtraction. These functions are simpler than the generation of a Gaussian pyramid
with 3 octaves and 6 scales per octave performed by the chip reported herein.
Still, the chips in [42] and [43] might compute Gaussian filters, as these are weighted
convolutions. The metrics in Table IV correspond to isolated pairs of convolutions such as Roberts
or Prewitt edge detectors, and to real-time edge detection at 25 fps, respectively. The evaluation
of the Gaussian pyramid with these chips would certainly give different metric values, and it would
require additional hardware to switch between octaves. The chip in [44] performs background
subtraction with two digitally-programmable switched-capacitor low-pass filters per pixel. The
energy overhead of our chip when compared to the chips in Table IV is partly explained by the
higher complexity of the function that it runs. Differences in fill factor and pixel pitch are also
due to the larger complexity of our PE. In particular, our chip and that in [6] embed an 8-bit
single-slope A/D converter. Nevertheless, while [6] follows a per-column ADC architecture, our
chip follows a per-pixel one to achieve full parallelism and hence high speed.
D. Application Assessment
The accuracy of the on-chip Gaussian pyramid has been assessed by incorporating hardware
errors into the interactive tool reported in [45]. This tool employs the SIFT feature detector to
perform visual tracking of six 2D textures on VGA-resolution videos. Visual tracking metrics
are calculated by applying homographies, where a homography is the matrix that captures the
transformation of the 2D textures from one frame to the next, e.g. a rotation.
Repeatability (RP) is the metric that we have calculated to assess the quality of visual tracking
with the on-chip Gaussian pyramid [45]. As defined in [45], and formulated in Eq. (7) below,
RP is the number of pairs of interest points from the sets Sj−1 and Sj−2, detected at frames j−1
and j−2, whose geometrical distance after applying the corresponding homographies (Hj−1 and Hj−2)
from frames j−1 and j−2 to frame j is below a certain threshold, normalized to the total number
of interest points in Sj−1 or Sj−2. RP gives an estimate of the percentage of interest points whose
location in successive frames is successfully predicted with the extracted homography.
$$RP = \frac{\left|\{(x_a \in S_{j-2},\, x_b \in S_{j-1}) : \lVert H_{j-2}\cdot x_a - H_{j-1}\cdot x_b \rVert < \epsilon\}\right|}{|S_{j-1}|} \qquad (7)$$
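A minimal sketch of how the repeatability of Eq. (7) might be evaluated is given below. It is not the tool of [45]: the function names are hypothetical, points are assumed to be (x, y) pixel coordinates, and the numerator counts the points of Sj−1 that have at least one partner in Sj−2 within the threshold, which coincides with the pair count of Eq. (7) when matches are one-to-one.

```python
import numpy as np

def apply_homography(h, points):
    """Map an array of (x, y) points with a 3x3 homography matrix."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ np.asarray(h, dtype=float).T
    return mapped[:, :2] / mapped[:, 2:3]

def repeatability(points_jm2, points_jm1, h_jm2, h_jm1, eps):
    """Fraction of points of S_{j-1} that, once mapped to frame j, lie closer
    than eps to some mapped point of S_{j-2} (assumed reading of Eq. (7))."""
    if len(points_jm1) == 0 or len(points_jm2) == 0:
        return 0.0
    a = apply_homography(h_jm2, points_jm2)   # S_{j-2} mapped to frame j
    b = apply_homography(h_jm1, points_jm1)   # S_{j-1} mapped to frame j
    # Pairwise Euclidean distances, shape (len(b), len(a)).
    dists = np.linalg.norm(b[:, None, :] - a[None, :, :], axis=2)
    matched = np.any(dists < eps, axis=1)
    return float(matched.sum()) / len(b)
```

For instance, with identity homographies, points (0, 0) and (10, 10) at frame j−2, points (0, 0.5) and (50, 50) at frame j−1 and a threshold of one pixel, only the first point of Sj−1 finds a partner and the function returns 0.5.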
The RMSE values measured from the chip have been expressed as per-pixel local errors by
finding the standard deviation of the normal distribution that corresponds to the given RMSE
level. The normal distribution models the variability from chip manufacturing. These errors have
been added to every scale of the Gaussian pyramid. Fig. 17 displays RP vs RMSE for RMSE values of
0%, 1%, 2.5% and 5%. Our on-chip RMSE levels are below 1.2% of FSO. RP is averaged over
the aforementioned six 2D textures throughout all the frames of the corresponding videos with
three different image transformations, namely rotation, zoom and perspective distortion. The error
bars, calculated as the standard deviation of the averaged data, show RP degradations
that are tolerable for most applications. In fact, as reported in [45], the temporal distance
between consecutive frames has a larger impact on RP. In this regard, the high Gaussian
pyramid throughput of our chip becomes an important asset, as it makes it possible to reduce
the baseline distance between consecutive frames.
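The error injection described above can be sketched as follows. For zero-mean normal noise the RMSE equals the standard deviation, so an RMSE level given in % of FSO maps directly to the noise standard deviation. The function name, the 8-bit full scale of 255 and the final clipping step are assumptions of this example and are not taken from the tool in [45].

```python
import numpy as np

def add_hardware_error(scale_image, rmse_percent, rng, full_scale=255.0):
    """Add zero-mean normal noise whose standard deviation equals the given
    RMSE level, expressed in % of the full-scale output."""
    sigma = rmse_percent / 100.0 * full_scale
    noise = rng.normal(0.0, sigma, size=np.shape(scale_image))
    noisy = np.asarray(scale_image, dtype=float) + noise
    # Clipping to the output range is an assumption; it slightly lowers the
    # effective RMSE for pixels close to 0 or to full scale.
    return np.clip(noisy, 0.0, full_scale)

def add_error_to_pyramid(pyramid, rmse_percent, seed=0):
    """Apply the same error level to every scale (2D array) of a pyramid."""
    rng = np.random.default_rng(seed)
    return [add_hardware_error(scale, rmse_percent, rng) for scale in pyramid]
```

Running the tracking evaluation on pyramids perturbed at 0%, 1%, 2.5% and 5% would give the abscissa points of a plot such as Fig. 17.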
V. CONCLUSION
This paper presents a proof-of-concept CVIS of 176 × 120 pixels for the parallel computation
of the Gaussian pyramid with a double-Euler SC network. Reducing the PE area through smaller
state capacitors in the SC network might be the most straightforward way to scale up our
architecture while keeping its performance metrics. Eventually, a given resolution might not be
attainable with a double-Euler SC network. In that case, resorting to a simple-Euler network might
be a solution if the loss of accuracy is affordable for the targeted application. Measurements
from our chip demonstrate that sensory-processing architectures with per-pixel mixed-signal
processors outperform conventional architectures consisting of an imager and an MPU in terms
of both energy consumption and throughput. Our results also show that the unavoidable errors of
the analog circuitry do not render the Gaussian pyramid unusable, as verified by
visual tracking metrics on a publicly available image dataset. The main limitations of
the type of SIMD-CVIS reported in this paper are direct consequences of the use of per-pixel
circuitry and standard planar technologies, namely: i) enlarged pixel pitch; and ii) reduced
fill factor. The former might constrain the use of this type of chip to applications where the
object of interest is at a short distance from the camera. The latter calls mainly for applications
with controlled illumination conditions. However, these limitations can be overcome by retargeting
our architecture to 3D vertically-integrated technologies, a task for which the circuits and
methods reported in this paper can be reused.
REFERENCES
[1] B. Roska and F. Werblin, "Vertical Interactions Across Ten Parallel, Stacked Representations in the Mammalian Retina,"
Nature, 410, pp. 583-587, 2001.
[2] T. Gollisch and M. Meister, "Eye Smarter than Scientists Believed: Neural Computations in Circuits of the Retina," Neuron
Review, vol. 65, pp. 150-164, January 2010.
[3] C.L. Lee and C.C. Hsieh, "A 0.8-V 4096-Pixel CMOS Sense-and-Stimulus Imager for Retinal Prosthesis," IEEE Transactions
on Electron Devices, vol. 60, no. 3, pp. 1162-1168, 2013.
[4] J. Fernández-Berni et al., "FLIP-Q: A QCIF Resolution Focal-Plane Array for Low-Power Image Processing," IEEE Journal
of Solid-State Circuits, vol. 46, no. 3, pp. 669-680, March 2011.
[5] S.J. Carey et al., "A 100,000 fps Vision Sensor with Embedded 535GOPS/W 256 x 256 SIMD Processor Array," 2013
Symposium on VLSI Circuits (VLSIC), pp. C182-C183, 2013.
[6] S. Park et al., "243.3 pJ/Pixel Bio-Inspired Time-Stamp-Based 2D Optic Flow Sensor for Artificial Compound Eyes," 2014
IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), pp. 126-127, 2014.
[7] A. Rodríguez-Vázquez et al., "ACE16k: The Third Generation of Mixed-Signal SIMD-CNN ACE Chips Toward VSoCs,"
IEEE Transactions on Circuits and Systems-I, vol. 51, no. 5, pp. 851-863, May 2004.
[8] Anafocus Ltd. [Online]. Available: http://www.anafocus.com.
[9] R. Carmona-Galán et al., "A Hierarchical Vision Processing Architecture Oriented to 3D Integration of Smart Camera
Chips," Journal of Systems Architecture, vol. 59, no. 10, Part A, pp. 908-919, 2013.
[10] T. Roska and A. Rodríguez-Vázquez, "Towards the Analogic Visual Microprocessor," ISBN: 0-471-95606-6, John Wiley
& Sons, Chichester, 2001.
[11] V. Busskamp and B. Roska, "Optogenetic Approaches to Restoring Visual Function in Retinitis Pigmentosa," Current
Opinion in Neurobiology, vol. 21, pp. 942-946, 2011.
[12] A. Torralba, "How Many Pixels Make an Image?," Visual Neuroscience, vol. 26, no. 1, pp. 123-131.
[13] M. Eldib, "Behavior Analysis for Elderly Care Using a Network of Low-Resolution Visual Sensors," Journal of Electronic
Imaging, vol. 25, no. 4, pp. 041003-041003, 2016.
[14] Toshiba-Teli. [Online]. Available: http://www.toshiba-teli.co.jp/en/products/industrial/sps/sps.htm.
[15] A. Blug et al., "Closed-Loop Control of Laser Power Using the Full Penetration Hole Image Feature in Aluminum Welding
Processes," Physics Procedia, vol. 12, pp. 720-729, 2011.
[16] E.R. Davies, "Machine Vision: Theory, Algorithms, Practicalities," Elsevier, 2004.
[17] P. Viola and M.J.A. Jones, "Robust Real-Time Face Detection," International Journal of Computer Vision, vol. 57, no. 2,
pp. 137-154, 2004.
[18] T. Tuytelaars and K. Mikolajczyk, "Local invariant feature detectors: a survey," Foundations and Trends in Computer
Graphics and Vision, vol. 3, no. 3, pp. 177-280, 2008.
[19] J. Fernández-Berni et al., "Bottom-up performance analysis of focal-plane mixed-signal hardware for Viola-Jones early
vision tasks," International Journal of Circuit Theory and Applications, vol. 43, no. 8, pp. 1063-1079, 2015.
[20] Á. Zarándy, "Focal-Plane Sensor-Processor Chip," Springer: Berlin, Germany, 2011.
[21] D. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," International Journal of Computer Vision, vol.
60, no. 2, pp. 91-110, 2004.
[22] J. Fernández-Berni, R. Carmona-Galán, and A. Rodríguez-Vázquez, "Low-Power Smart Imagers for Vision-Enabled Sensor
Networks," Springer Science & Business Media, 2012.
[23] M. Nathan et al., "The Grasp Multiple Micro-UAV Testbed," IEEE Robotics & Automation Magazine, vol. 17, no. 3, pp.
56-65, 2010.
[24] M. Suárez et al., "A 26.5 nJ/px 2.64 Mpx/s CMOS Vision Sensor for Gaussian Pyramid Extraction," European Solid State
Circuits Conference (ESSCIRC) ESSCIRC 2014-40th, pp. 311-314, 2014.
[25] Jan J. Koenderink and A.J. van Doorn, "Representation of Local Geometry in the Visual System," Biological Cybernetics,
vol. 50, no. 5, pp. 363-370, 1984.
[26] T. Lindeberg, "Scale-Space for Discrete Signals," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.
12, no. 3, pp. 234-254, Mar. 1990.
[27] P. Geissler, H. Haussecker, and B. Jähne, "Handbook of Computer Vision and Applications," Vol. 2, Signal Processing
and Pattern Recognition. Academic Press, 1999.
[28] C.B. Umminger and C.G. Sodini, "Switched Capacitor Networks for Focal Plane Image Processing Systems," IEEE
Transactions on Circuits and Systems for Video Technology, vol. 2, no. 4, pp. 392-400, 1992.
[29] M. Suárez et al., "Switched-Capacitor Networks for Scale-Space Generation," 2011 20th European Conference on Circuit
Theory and Design (ECCTD), pp. 189-192, August 29-31, 2011.
[30] M. Suárez et al., "CMOS-3D Smart Imager Architectures for Feature Detection," IEEE Journal on Emerging and Selected
Topics in Circuits and Systems, vol. 2, no. 4, pp. 723-736, Dec. 2012.
[31] M. Suárez et al., "Three Dimensional CMOS Image Processor for Feature Detection," Patent No.: US 8,942,481 B2, Jan.
27, 2015.
[32] David A. Johns and Ken Martin, "Analog Integrated Circuit Design," John Wiley & Sons, Inc., 1997.
[33] A. El Gammal and H. Eltoukhy, "CMOS Image Sensors," IEEE Circuits and Devices Magazine, pp. 6-20, May/June 2005.
[34] Y.M. Chi et al., "CMOS Camera with In-Pixel Temporal Change Detection and ADC," IEEE Journal of Solid-State Circuits,
vol. 42, no. 10, pp. 2187-2196, Oct. 2007.
[35] M. Suárez et al., "In-Pixel ADC for a Vision Architecture on CMOS-3D Technology," 2010 IEEE International 3D Systems
Integration Conference (3DIC), pp. 1-7, 16-18 Nov. 2010.
[36] Terasic. [Online]. Available: http://www.terasic.com.tw/.
[37] Raspberry. [Online]. Available: http://www.raspberrypi.org/.
[38] Omnivision. [Online]. Available: http://www.ovt.com/.
[39] M. Murphy et al., "Image Feature Extraction for Mobile Processors," 2009 IEEE International Symposium on Workload
Characterization, pp. 138-147, 2009.
[40] Feng-Cheng Huang et al., "High-Performance SIFT Hardware Accelerator for Real-Time Image Feature Extraction," IEEE
Transactions on Circuits and Systems for Video Technology, vol. 22, no. 2, pp. 340-351, March 2012.
[41] G. Wang et al., "Workload Analysis and Efficient OpenCL-based Implementation of SIFT Algorithm on a Smartphone,"
2013 IEEE Global Conference on Signal and Information Processing, pp. 759-762, 2013.
[42] W. Jendernalik et al., "An Analog Sub-Miliwatt CMOS Image Sensor with Pixel-Level Convolution Processing," IEEE
Transactions on Circuits and Systems-I: Regular Papers, vol. 60, no. 2, pp. 279-289, 2013.
[43] P. Dudek and P.J. Hicks, "A General-Purpose Processor-per-Pixel Analog SIMD Vision Chip," IEEE Transactions on
Circuits and Systems-I, vol. 52, no. 1, pp. 13-20, January 2005.
[44] N. Cottini et al., "A 33 µW 64 × 64 Pixel Vision Sensor Embedding Robust Dynamic Background Subtraction for Event
Detection and Scene Interpretation," IEEE Journal of Solid-State Circuits, vol. 48, no. 3, pp. 850-863, March 2013.
[45] S. Gauglitz et al., "Evaluation of Interest Point Detectors and Feature Descriptors for Visual Tracking," Int. J. of Computer
Vision, vol. 94, pp. 335-360, 2011.
M. Suárez graduated in Physics in 2008 and received the Ph.D. in 2015 in the field of machine vision
systems at the University of Santiago de Compostela, Spain. The research was done at Centro Singular
de Investigación en Tecnoloxías da Información (CiTIUS) in collaboration with the Institute of
Microelectronics of Seville (IMSE-CNM-CSIC, Spain) through several stays during the Ph.D. He is the
first inventor of a USPTO patent and received the 3rd Best Student Paper award at ECCTD 2013.
Currently, he works as an analog designer for Atomos GmbH in Germany.
V.M. Brea received his PhD in Physics in 2003 with honors. Dr. Brea has served on technical and steering
committees of international conferences such as ICDSC and CNNA. He received the best and
the third best student paper awards at ECCTD 2003 and ECCTD 2013, respectively. He has authored/coauthored
around 100 papers in refereed journals, conferences and workshops. Currently he is an Associate Professor
at Centro Singular de Investigación en Tecnoloxías da Información (CiTIUS), University of Santiago de
Compostela, Spain. His main research interests lie in the design of hardware architectures and CMOS solutions for computer
vision, especially in early vision, as well as micro-energy harvesting.
J. Fernández-Berni received a B. Eng. degree in Electronics and Telecommunication in September 2004,
an M.Sc. degree in Microelectronics in December 2008 and his Ph.D. in June 2011 with honors, from
the University of Seville, Spain. From January 2005 through September 2006, he worked in the
telecommunication industry. He has been a visiting researcher at the Computer and Automation Research
Institute (Budapest, Hungary), Ghent University (Ghent, Belgium) and the University of Notre Dame (IN,
USA). Dr. Fernández-Berni has authored/co-authored some 50 papers in refereed journals, conferences and workshops. He is also
the first author of a book and two book chapters as well as the first inventor of two licensed patents. He received the Best Paper
Award in "Image Sensors and Imaging Systems, SPIE Electronic Imaging 2014, San Francisco CA, USA" and the Third Prize
of the Student Paper Award in "IEEE CNNA 2010: 12th Int. Workshop on Cellular Nanoscale Networks and their Applications,
Berkeley CA, USA". His main areas of interest are smart image sensors, vision chips and embedded vision systems.
R. Carmona-Galán (M'04) (SM'16) graduated in Physics and received a Ph.D. in Microelectronics from the
University of Seville, Spain. He worked as a Research Assistant at Prof. Chua's laboratory in the EECS
Department of the University of California, Berkeley. He has been Assistant Professor of the Department
of Electronics of the University of Seville. Since 2005, he has been a Tenured Scientist at the Institute of
Microelectronics of Seville (IMSE-CNM-CSIC). His main research focus has been on VLSI implementation
of concurrent sensor/processor arrays for real-time image processing and vision. He also held a postdoctoral position at the
University of Notre Dame, Indiana (2006-2007), where he worked on interfaces for CMOS-compatible nanostructures for
multispectral light sensing. He has collaborated with start-up companies in Seville (Anafocus) and Berkeley (Eutecus). He has
designed several vision chips implementing different focal-plane operators for early vision processing. His current research
interests lie in the design of low-power smart image sensors and 3-D integrated circuits for autonomous vision systems. He has
coauthored more than 120 journal and conference papers and a book on low-power vision sensors for vision-enabled sensor
networks. He is co-inventor of several patents. Ricardo Carmona-Galán is a Senior Member of the IEEE. He has been associate
editor for IEEE TCAS-I and is now associate editor for Springer's Journal on Real-Time Image Processing. He received a
Certificate of Teaching Excellence from the University of Seville. He recently received the Best Paper Award of the IEEE-CASS
Technical Committee on Sensory Systems at ISCAS 2015, together with Dr. Vornicu and Prof. Rodríguez-Vázquez.
D. Cabello (M'96) received the BSc and PhD degrees in Physics from the University of Granada, Granada,
Spain, and the University of Santiago de Compostela, Santiago de Compostela, Spain, in 1978 and 1984,
respectively. Currently, he is a Professor of Electronics at Centro Singular de Investigación en Tecnoloxías
da Información (CiTIUS), University of Santiago de Compostela, Spain. He was the Dean of the
Faculty of Physics between 1997 and 2002, and the Head of the Department of Electronics and Computer
Science between 2002 and 2006, both at the University of Santiago de Compostela. His main research interests lie in the design
of efficient architectures and CMOS solutions for computer vision, especially in early vision.
Ángel Rodríguez-Vázquez (F'96) (IEEE Fellow, 1999) received undergraduate and PhD degrees in
Physics-Electronics with several national and international awards, including an IEEE award. After different
research stays at the University of California-Berkeley and Texas A&M University, he became a Full Professor
of Electronics at the University of Sevilla in 1995. He co-founded the Institute of Microelectronics of
Sevilla, under the umbrella of the Spanish Research Council (CSIC) and the University of Sevilla, and
started a research group on Analog and Mixed-Signal Circuits for Sensors and Communications. In 2001 he was the main
promoter and co-founder of the start-up company AnaFocus Ltd and served as CEO, on leave from the University, until June 2009,
when the company reached maturity as a worldwide provider of smart CMOS imagers and vision systems-on-chip. His research
is on the design of analog and mixed-signal front-ends for sensing and communication, including smart imagers, vision chips
and low-power sensory-processing microsystems. He has authored 11 books, 36 additional book chapters, and some 150 journal
articles in peer-reviewed specialized publications. He has presented invited plenary lectures at different international conferences
and has received a number of awards for his research (the IEEE Guillemin-Cauer best paper award, two Wiley's IJCTA best
paper awards, two IEEE ECCTD best paper awards, one SPIE-IST Electronic Imaging best paper award, the IEEE ISCAS best
demo-paper award and the IEEE ICECS best demo-paper award). He was elected Fellow of the IEEE for his contributions to the
design of chaos-based communication chips and neuro-fuzzy chips. His research work has received some 6,700 citations; he has
an h-index of 43 and an i10-index of 133. He has always sought a balance between long-term research and innovative industrial
developments. AnaFocus Ltd. was founded on the basis of his patents on vision chips, and he participated in the foundation of
the Hungarian start-up company AnaLogic Ltd. He has filed eight patents, three of which have been licensed to companies.
He has served as Editor, Associate Editor and Guest Editor for different IEEE and non-IEEE journals, is on the committee of
several international journals and conferences, and has chaired several international IEEE and SPIE conferences. He served as VP
Region 8 of the IEEE Circuits and Systems Society (2009-2012) and as Chair of the IEEE CASS Fellow Evaluation Committee
(2010, 2012, 2013, 2014 and 2015).
Fig. 1. Scale-space through the Gaussian pyramid with octaves and scales. Each octave has 1/4 the spatial resolution of the
previous one, starting from the bottom. Thus, if the initial image has M×N pixels, images in the second octave have (M×N)/4
and so forth.
Fig. 2. Topologies for Gaussian filtering in 1D; (a) an RC network, (b) and (c) simple- and double-Euler SC networks,
respectively. (d) non-overlapping control signals for SC networks.
Fig. 3. Chip micrograph with dimensions (in mm) and a close-up of the PEs, showing capacitors and photodiodes.
Fig. 4. PE array configuration across different functions of the chip: (a) image acquisition, where four photodiodes share one
CDS and A/D converter in a PE, (b) first octave, where four state capacitors share one CDS and A/D converter in a PE, (c)
second octave, where four state capacitors in a PE are shorted together to perform downscaling, and there is a CDS and A/D
per PE, and (d) third octave, where the state capacitors of 4 PEs are combined into only one to run downscaling.
Fig. 5. PE and its associated time diagram. The PE is made up of four photosensors, four local analog memories (LAMs),
one CDS circuit, one comparator for A/D conversion, and the local circuitry of the double-Euler SC network to build up the
Gaussian pyramid.
Fig. 6. Amplifier topologies used in the CDS, LAMs and comparator circuits of Fig. 5 with some of their characteristics. (a) and
(b) cascode configurations IA and IB. (c) gain versus input voltage. (d) current consumption vs input voltage in configurations
IA and IB. (e) frequency response of configuration IA within the range of operation, [0.4, 1.3] V.
Fig. 7. Image acquisition through CDS on our chip. Signal φrw_pij selects the 3T-APS pixel associated with position ij in
the PE, and is high during the acquisition time for pixel ij.
Fig. 8. LAMs working in two phases to store and read out scales across the Gaussian pyramid.
Fig. 9. Comparator of the in-PE 8-bit single-slope A/D converter with the time diagram of its control signals.
Fig. 10. Different waveforms of the in-PE comparator of Fig. 9 with (w) and without (w/o) feedback loop. (a) currents and
voltages. (b) and (c) display currents integrated for input codes 250 and 40, respectively.
Fig. 11. (a) Double-Euler SC network for a grid of 4 × 4 pixels (2 × 2 PEs) of the chip. (b), (c) and (d) show the internal
structure of blocks SC_A, SC_B and SC_C. Groups of 2 × 2 PEs comprise PEs of α and β type.
Fig. 12. State of control signals across the Gaussian pyramid. Symbols H and L refer to high and low states. H,L means that
first the signal goes high, and subsequently low. All the former states are found during initialization or downscaling to change
between octaves. Symbol S means that the signals are switching to generate the scales of the pyramid.
Fig. 13. 1/2 frame buffer bank for the single-slope A/D converter of half of the array is shown in (a). The PE-register
assignment is displayed in (b). The register circuitry can be seen in (c). EoC comes from the in-PE comparator. bk is a bit of
the digital word issued by an 8-bit global counter.
Fig. 14. Prototype camera module to extract the on-chip Gaussian pyramid.
Fig. 15. On-chip σSC vs clock cycles n in the first and second octaves of the Gaussian pyramid (panels (a) and (b)).
Fig. 16. Image acquisition and different snapshots of the on-chip Gaussian pyramid across the first octave. The upper left image
is the input scene; the rest of the images, from left to right and top to bottom, correspond to σ = 1.77 (clock cycles n = 19),
σ = 2.17 (n = 29), and σ = 2.51 (n = 39).
Fig. 17. Repeatability as a function of RMSE for three image transformations, namely, (a) rotation, (b) zoom and (c) perspective
distortion.
JOURNAL OF SOLID-STATE CIRCUITS, VOL. 6, NO. 1, JANUARY 2007 34
TABLE I. PE TRANSISTOR SIZES (IN MICRONS).
Device      Width  Length    Device  Width  Length
Photodiode  7.4    6.7       M1      0.24   1
M2          1.6    0.3       M3      0.24   0.6
M4          0.6    0.8       M5      0.24   1.4
M6          0.24   0.8       M7      0.24   1
M8          0.24   0.3       M9      0.24   0.2
M10         0.24   0.8       M11     0.24   0.2
M12         0.24   0.4
TABLE II. COMPARATOR TRANSISTOR SIZES (IN MICRONS).
Device  W     L      Device  W     L      Device  W     L
M13     0.24  0.4    M14     0.24  0.8    M15     0.24  0.2
M16     2     0.2    M17     1.5   0.2    M18     0.24  0.2
TABLE III. COMPARISON OF OUR CHIP WITH CONVENTIONAL SOLUTIONS
HW Solution           Func.        Energy/frame             En./px (µJ/px)  Mpx/s  Mpx/J  Mpx/(s·mm²)
This work             Gauss. Pyr.  176 × 120 resol.         0.027           2.64   37.7   0.11
180 nm CMOS                        70 mW @ 8 ms
                                   0.56 mJ/frame
Ref. [39]             Gauss. Pyr.  VGA resol.               15.5            2.26   0.064  0.007
OV9655 + Core-i7                   90 mW @ 30 fps +
                                   35 W @ 136 ms
                                   4.8 J/frame
Ref. [40]             Gauss. Pyr.  VGA resol.               240             0.15   0.004  0.001
OV9655 + Core-2-Duo                90 mW + 35 W @ 2.1 s
                                   73.7 J/frame
Ref. [41]             Gauss. Pyr.  350 × 256 resol.         4.4             0.91   0.23   –
OV6922 + Qualcomm                  30 mW + 4 W @ 98.5 ms
Snapdragon S4                      0.4 J/frame
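The derived columns of Table III follow from the resolution, the energy per frame, and the frame time. The short sketch below reproduces that arithmetic for the first two rows; the function name energy_metrics is hypothetical, and the 640 × 480 pixel count used for VGA is the standard definition rather than a figure stated in the table.

```python
def energy_metrics(width, height, energy_per_frame_j, frame_time_s):
    """Return (energy per pixel in uJ, throughput in Mpx/s, efficiency in Mpx/J)."""
    pixels = width * height
    energy_per_px_uj = energy_per_frame_j / pixels * 1e6   # J/px -> uJ/px
    throughput_mpx_s = pixels / frame_time_s / 1e6         # px/s -> Mpx/s
    efficiency_mpx_j = pixels / energy_per_frame_j / 1e6   # px/J -> Mpx/J
    return energy_per_px_uj, throughput_mpx_s, efficiency_mpx_j


if __name__ == "__main__":
    # This work: 176 x 120 px, 70 mW for 8 ms gives 0.56 mJ per frame.
    chip = energy_metrics(176, 120, 70e-3 * 8e-3, 8e-3)
    # Ref. [39]: VGA (assumed 640 x 480), 4.8 J per frame, 136 ms per frame.
    ref39 = energy_metrics(640, 480, 4.8, 136e-3)
    print("this work: %.3f uJ/px, %.2f Mpx/s, %.1f Mpx/J" % chip)
    print("ref. [39]: %.1f uJ/px, %.2f Mpx/s, %.3f Mpx/J" % ref39)
```

With the table's inputs, the first row works out to approximately 0.027 µJ/px (26.5 nJ/px), 2.64 Mpx/s, and 37.7 Mpx/J.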
TABLE IV. COMPARISON OF OUR CHIP WITH OTHER STATE-OF-THE-ART CVIS
HW Sol.       This work      Ref. [6]            Ref. [42]    Ref. [43]              Ref. [44]
Funct.        Gauss. Pyr.    2D Optic Flow Est.  3×3 Conv.    General Purpose        Back. Subt.
              w A/D          w A/D                            Low-level Imag.-Proc.
              (SS 8 bits)    (SS 8 bits)
Tech. & Res.  0.18 µm        0.18 µm             0.35 µm      0.6 µm                 0.35 µm
              176 × 120 px.  64 × 64 px.         64 × 64 px.  21 × 21 px.            64 × 64 px.
Fill-Fact.    10.25%         18.32%              23%          8.4%                   12%
Pixel-Pitch   44 µm          28.8 µm             35 µm        98.6 µm                26 µm
En./px        26.5 nJ/px     0.89 nJ/px          0.19 nJ/px   0.52 nJ/px             0.62 nJ/px
Throughput    2.64 Mpx/s     0.49 Mpx/s          0.1 Mpx/s    0.11 Mpx/s             0.053 Mpx/s
Fig. 1. Scale-space through the Gaussian pyramid with octaves and scales. Each octave has
1/4 the spatial resolution of the previous one, starting from the bottom. Thus, if the initial image
has M × N pixels, images in the second octave have (M × N)/4 and so forth.
Fig. 2. Topologies for Gaussian filtering in 1D; (a) an RC network, (b) and (c) simple- and
double-Euler SC networks, respectively. (d) non-overlapping control signals for SC networks.
Fig. 3. Chip micrograph with dimensions (in mm) and a close-up of the PEs.
Fig. 4. PE array configuration across different functions of the chip: (a) image acquisition,
where four photodiodes share one CDS and A/D converter in a PE, (b) first octave, where four
state capacitors share one CDS and A/D converter in a PE, (c) second octave, where four state
capacitors in a PE are shorted together to perform downscaling, and there is a CDS and A/D per
PE, and (d) third octave, where the state capacitors of 4 PEs are combined into only one to run
downscaling.
Fig. 5. PE and its associated time diagram. The PE is made up of four photosensors, four local
analog memories (LAMs), one CDS circuit, one comparator for A/D conversion, and the local
circuitry of the double-Euler SC network to build up the Gaussian pyramid.
Fig. 6. Amplifier topologies used in the CDS, LAMs and comparator circuits of Fig. 5 with
some of their characteristics. (a) and (b) cascode configurations IA and IB. (c) gain versus input
voltage. (d) current consumption vs input voltage in configurations IA and IB. (e) frequency
response of configuration IA within the range of operation, [0.4, 1.3] V.
Fig. 7. Image acquisition through CDS in our chip. Signal φrw pij selects the 3T-APS pixel
associated with the position ij in the PE. Signal φrw pij is high during the acquisition time for
a pixel ij.
Fig. 8. LAMs working in two phases to store and read out scales across the Gaussian pyramid.
Fig. 9. Comparator of the in-PE 8-bit single-slope A/D converter with the time diagram of its
control signals.
Fig. 10. Different waveforms of the in-PE comparator of Fig. 9 with (w) and without (w/o)
feedback loop. (a) currents and voltages. (b) and (c) display currents integrated for input codes
250 and 40, respectively.
Fig. 11. (a) Double-Euler SC network for a grid of 4 × 4 pixels (2 × 2 PEs) of the chip. (b),
(c) and (d) show the internal structure of blocks SCA, SCB and SCC. Groups of 2 × 2 PEs
comprise PEs of α and β type.
Fig. 12. State of control signals across the Gaussian pyramid.
Fig. 13. 1/2 frame buffer bank for the single-slope A/D converter of half of the array is shown
in (a). The assignment of PEs to registers is displayed in (b). The register circuitry can be seen in (c).
EoC comes from the in-PE comparator. bk is a bit of the digital word issued by an 8-bit global
counter.
Fig. 14. Prototype camera module to extract the on-chip Gaussian pyramid.
Fig. 15. On-chip σSC vs clock cycles n in the first and second octaves of the Gaussian pyramid.
Fig. 16. Image acquisition and different snapshots of the on-chip Gaussian pyramid across the
first octave. (a) Input scene. (b) σ = 1.77 (clock cycles n = 19). (c) σ = 2.17 (n = 29).
(d) σ = 2.51 (n = 39).
Fig. 17. Repeatability as a function of RMSE for three image transformations, namely, (a)
rotation, (b) zoom and (c) perspective distortion.
Table I. PE Transistor Sizes (in Microns).
Table II. Comparator Transistor Sizes (in Microns).
Table III. Comparison of Our Chip with Conventional Solutions.
Table IV. Comparison of Our Chip with Other State-of-the-Art CVIS.
