Live Demonstration: Gaussian Pyramid Extraction with a CMOS Vision Sensor by Suárez Cambre, Manuel et al.
Live Demonstration: Gaussian Pyramid Extraction
with a CMOS Vision Sensor
M. Suárez∗, V.M. Brea∗, J. Fernández-Berni†‡, R. Carmona-Galán†, D. Cabello∗ and A. Rodríguez-Vázquez†‡
∗Centro de Investigación en Tecnoloxías da Información (CITIUS)
University of Santiago de Compostela, Santiago de Compostela, Spain
Email: victor.brea@usc.es
†University of Seville, Instituto de Microelectrónica de Sevilla (IMSE-CNM), Seville, Spain
‡CSIC, Instituto de Microelectrónica de Sevilla (IMSE-CNM), Seville, Spain
Abstract—This live demonstration showcases Gaussian pyramid
extraction with a CMOS vision sensor. The chip features a 176
× 120 pixel array in standard 0.18 µm CMOS technology.
The sensing elements are 3-Transistor Active Pixel
Sensors (3T-APS) with in-pixel ADC and CDS. The Gaussian
pyramid is extracted concurrently with image acquisition by a
double-Euler switched-capacitor network on the same substrate,
with an RMSE below 1.2% of FSO. The chip provides a Gaussian
pyramid of 3 octaves with 6 scales each at an energy cost of
26.5 nJ/px and a throughput of 2.64 Mpx/s.
I. INTRODUCTION
The Gaussian pyramid gives feature detectors the ability
to produce the same response regardless of the distance of the
object to the camera [1]. The construction of the Gaussian pyramid
comprises several downscalings of the input scene, the so-called
octaves: octave Oi is the 1/4 downscaling
of the previous octave Oi−1. Every octave is a set of images
called scales, which result from applying Gaussian kernels
of increasing width. Generating the Gaussian pyramid is
a very time-consuming task which, as reported in [2], might
take up to 90% of the computing time of a feature detector.
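The octave and scale structure described above can be sketched in software as follows. This is a minimal pure-Python illustration, not the chip's switched-capacitor implementation. The base width `sigma0`, the geometric schedule of widths within an octave, the border clamping, and the choice to downscale the last scale of each octave are all assumptions made for the example. The 1/4 downscaling is read here as halving both the width and the height.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(image):
    return [list(col) for col in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable 2-D Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))

def downsample(image):
    """Keep every second pixel in both directions (1/4 of the pixels)."""
    return [row[::2] for row in image[::2]]

def gaussian_pyramid(image, octaves=3, scales=6, sigma0=1.0):
    """Return a list of octaves, each a list of increasingly blurred scales.

    sigma0 and the factor 2 ** (s / scales) are illustrative assumptions.
    """
    pyramid = []
    base = image
    for _ in range(octaves):
        octave = [gaussian_blur(base, sigma0 * 2 ** (s / scales))
                  for s in range(scales)]
        pyramid.append(octave)
        base = downsample(octave[-1])  # assumed starting point of the next octave
    return pyramid
```

Calling `gaussian_pyramid(image)` on a 176 × 120 image given as a list of rows would return 3 octaves of 6 scales each, with octave sizes 176 × 120, 88 × 60 and 44 × 30 under the assumptions above.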
II. CHIP FEATURES
This live demonstration presents a fast and power-efficient
CMOS vision sensor chip with concurrent image acquisition
and Gaussian pyramid extraction. The chip comprises an array of
176 × 120 3T-APS in standard 0.18 µm CMOS technology
within an area of 5 × 5 mm². Photodiodes and processing
circuitry are arranged in Processing Elements (PE) of 44 ×
44 µm². Every PE contains four 3T-APS, per-PE ADC and CDS
circuitry, and the circuits of a double-Euler switched-capacitor
network. The chip consumes 70 mW for scene acquisition and
the extraction of a Gaussian pyramid of 3 octaves with 6 scales
each. The Gaussian pyramid takes 8 ms (ADC included). This
yields an energy cost of 26.5 nJ/px at a throughput of 2.64 Mpx/s.
The Gaussian pyramid is provided with an error below 1.2% of FSO
when compared to a software solution. Interested readers can
consult references [3], [4], [5] for further details of the chip.
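The throughput and energy figures in this section follow directly from the array size, the power consumption and the pyramid extraction time. The short sketch below reproduces that arithmetic and also shows one common way to express an RMSE as a percentage of the full-scale output (FSO). The function names and the `full_scale` parameter are hypothetical, and the source does not state the output range used for its error figure.

```python
import math

def throughput_px_per_s(width, height, frame_time_s):
    """Pixels delivered per second for one pyramid per frame time."""
    return width * height / frame_time_s

def energy_per_pixel_j(power_w, frame_time_s, width, height):
    """Energy spent per pixel: total energy per frame divided by pixel count."""
    return power_w * frame_time_s / (width * height)

def rmse_percent_fso(measured, reference, full_scale):
    """RMSE between two equally long sequences, as a percentage of full scale.

    full_scale is an assumed parameter (for example 255 for 8-bit data).
    """
    n = len(measured)
    mse = sum((m - r) ** 2 for m, r in zip(measured, reference)) / n
    return 100.0 * math.sqrt(mse) / full_scale

# Figures stated in the text: 176 x 120 pixels, 70 mW, 8 ms per pyramid.
rate = throughput_px_per_s(176, 120, 8e-3)         # about 2.64e6 px/s
energy = energy_per_pixel_j(70e-3, 8e-3, 176, 120)  # about 26.5e-9 J/px
```

With 176 × 120 = 21120 pixels, 21120 px / 8 ms gives 2.64 Mpx/s, and 70 mW × 8 ms = 560 µJ spread over 21120 pixels gives about 26.5 nJ/px.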
III. LIVE DEMONSTRATION SETUP
Fig. 1 shows a picture of the live demonstration setup. The
system comprises the chip in a PGA120 package, a C-mount lens
with a focal length of 35 mm and a focal ratio of f1/4, a
carrier board of 15 × 6 cm², a Terasic DE0 FPGA board that provides
the control signals for the chip, and a Raspberry Pi with an ARM
processor for visualization purposes.
Fig. 1. Live demonstration setup.
IV. VISITOR EXPERIENCE
Real-time tests with the chip setup will be conducted during
the conference. Visitors will interact with the system and see
different scales across the Gaussian pyramid.
ACKNOWLEDGMENT
This work has been funded by ONR N000141410355 and Spanish
government projects TEC2012-38921-C02 MINECO (FEDER), IPT-
2011-1625-430000 MINECO, IPC-20111009 CDTI (FEDER), Junta
de Andalucía TIC 2338-2013, EM2013/038 (FEDER), EM2014/012,
AE CITIUS (CN2012/151, (FEDER)), and GPC2013/040 (FEDER).
REFERENCES
[1] D. Lowe, “Distinctive Image Features from Scale-Invariant Keypoints”,
International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110,
2004.
[2] K. Mizuno et al., “Fast and Low-Memory-Bandwidth Architecture of
SIFT Descriptor Generation with Scalability on Speed and Accuracy for
VGA Video”, FPL 2010, pp. 608-611, 2010.
[3] M. Suárez et al., “CMOS-3D Smart Imager Architectures for Feature
Detection”, IEEE Journal on Emerging and Selected Topics in Circuits
and Systems, vol. 2, no. 4, pp. 723-736, Dec. 2012.
[4] M. Suárez et al., “A 176×120 Pixel CMOS Vision Chip for Gaussian
Filtering with Massively Parallel CDS and A/D-Conversion”, 2013
European Conference on Circuit Theory and Design (ECCTD 2013).
Third best student paper award.
[5] M. Suárez et al., “A 26.5 nJ/px 2.64 Mpx/s CMOS Vision Sensor for
Gaussian Pyramid Extraction”, 2014 Proceedings of the European Solid-
State Circuits Conference (ESSCIRC 2014).
