A micropower vision processor for parallel object positioning and sizing by Constandinou, TG & Toumazou, C
A Micropower Vision Processor for
Parallel Object Positioning and Sizing
Timothy G Constandinou and Chris Toumazou
The Institute of Biomedical Engineering, Imperial College London
South Kensington Campus, London, SW7 2AZ, United Kingdom
Email: t.constandinou@imperial.ac.uk, c.toumazou@imperial.ac.uk
Abstract—A hybrid vision chip is presented for real-time
object-based processing for tasks such as positioning and sizing of
enclosed objects. This system presents the ﬁrst artiﬁcial silicon
retina capable of position and size determination of multiple
objects in true parallel fashion. Based on a novel distributed
algorithm, this approach uses the input image to enclose a
feedback loop to realise a data-driven pulsating action. The
fabricated device is shown to achieve a computational efficiency
of at least 725 million instructions per second per milliwatt and
to be capable of processing up to 2000 frames per second.
I. INTRODUCTION
Centroid detection and target tracking have been tasks
traditionally associated with advanced military and space
applications. However, these same tasks are fundamental to
the more generic ﬁeld of image recognition. Traditional image
processing techniques effectively perform low-level tasks such
as conditioning and ﬁltering but typically output a matrix of
pixels, constituting an image. For perceptive vision applica-
tions it is paramount to cluster together pixels in a region of
interest and provide a single entity. This task is often referred
to as object segmentation. Having performed this, it is useful
to deﬁne the object using a co-ordinate and magnitude to
represent its centroid and size respectively. Having such
high-level information available can provide enhanced and added
functionality to several applications. Examples include machine
vision for autonomous navigation, automation of surveillance
or security camera tasks, image stabilisation for medical
instrumentation and biochemical cellular migration/population
analysis.
Previous work has produced many centroiding vision chips,
e.g. [1], [2], based on the centre-of-mass (COM) computation
utilising two (i.e. row and column) one-dimensional summations
at the side of the array. Other systems have extended
this with additional functionality, for example by combining
centroiding with motion detection [3], [4] or by embedding an APS
imager [5]. Multiple centroid capability has been achieved by
using a window and search type algorithm to direct the COM
computation [6], [7].
This paper reports the first vision chip capable of detecting
the position and size of an unlimited number of objects in a
genuinely parallel fashion.
II. BIO-PULSATING CONTOUR REDUCTION ALGORITHM
This system directly implements the “Bio-pulsating Contour
Reduction” algorithm [8], designed for object-based processing,
including positioning and sizing of simple objects. Circular
blob-like objects with an intensity differing from the background
level can be segmented and their size and position
(“centroid”) determined using a distributed binary algorithm.
It is important to stress that the detected “centroid” is only
an accurate centre-of-mass (COM) for regular, round objects;
otherwise it represents an estimate of the object position.

Fig. 1. The regular cell layout (top) and floorplan (bottom). The cell size
is 85µm×85µm with 30µm×30µm active photodiode area, giving a 12.5%
surface fill factor. Metal layers 5 and 6 have been excluded for clarity.
[Floorplan blocks: Photodiode; Averaging, Comparison and Thresholding;
Vertical Edge Detection; Horizontal Edge Detection; Delay; Bias
Distribution; State Reset; State Set MUX; State Memory; Contour;
Exclusive-OR; AER Sender; Distributed “Bus”; ASP and ABP regions.]

The algorithm uses an edge-detection technique to form the contours
and trigger the data-driven processing. On detection of an
object boundary, the initial state for the signal ﬂow is set. By
propagating an inward ﬁll, the contour can be reduced until
it converges to the “centre”. The central point is detected by
utilising spatiotemporal integration. On “centroid” detection,
the object is reset and output transmitted, thus realising an in-
ward pulsating action. Furthermore, the frequency of pulsation
can be directly used to determine the object size.
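
The inward fill described above can be illustrated with a small software
sketch. This is only a sequential analogue under stated assumptions, not
the chip's asynchronous per-pixel logic: the function name contour_reduce,
the use of 4-connectivity, and the use of the step count as a stand-in for
the pulsation period are all choices made here for illustration.

```python
from collections import deque

def contour_reduce(mask):
    """Sequential sketch of inward contour reduction on a binary image.

    mask: list of rows, 1 = object pixel, 0 = background.
    Returns one (centre_x, centre_y, steps) tuple per object, where
    steps is the number of fill steps needed to reach the centre.
    """
    h, w = len(mask), len(mask[0])
    nbrs4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # assumed 4-connectivity

    def is_object(x, y):
        return 0 <= x < w and 0 <= y < h and mask[y][x] == 1

    # Step 1: contour pixels are object pixels touching the background
    # (or the array border); they are filled at time 0.
    fill_time = [[None] * w for _ in range(h)]
    frontier = deque()
    for y in range(h):
        for x in range(w):
            if mask[y][x] and any(not is_object(x + dx, y + dy)
                                  for dx, dy in nbrs4):
                fill_time[y][x] = 0
                frontier.append((x, y))

    # Step 2: propagate the fill inwards, one pixel per time step.
    while frontier:
        x, y = frontier.popleft()
        for dx, dy in nbrs4:
            nx, ny = x + dx, y + dy
            if is_object(nx, ny) and fill_time[ny][nx] is None:
                fill_time[ny][nx] = fill_time[y][x] + 1
                frontier.append((nx, ny))

    # Step 3: per object, the last pixels to fill mark the "centre".
    results = []
    seen = [[False] * w for _ in range(h)]
    for y0 in range(h):
        for x0 in range(w):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            stack, pixels = [(x0, y0)], []
            while stack:  # flood fill to collect one object
                x, y = stack.pop()
                pixels.append((x, y))
                for dx, dy in nbrs4:
                    nx, ny = x + dx, y + dy
                    if is_object(nx, ny) and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            last = max(fill_time[y][x] for x, y in pixels)
            core = [(x, y) for x, y in pixels if fill_time[y][x] == last]
            cx = sum(x for x, _ in core) / len(core)
            cy = sum(y for _, y in core) / len(core)
            results.append((cx, cy, last + 1))
    return results
```

For a 5×5 square object the sketch reports a single centre at the middle
pixel after three fill steps. Larger objects need more steps, which is the
software counterpart of a lower pulsation frequency.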
III. SYSTEM ARCHITECTURE AND CIRCUIT IMPLEMENTATION
The complete system architecture and circuit implementation
are illustrated in Fig. 2. The architecture consists of
three main components: the address-event communication
hardware, the current distribution network and the distributed
pixel processing array. The address-event communication
implements a fully-arbitrated scheme adapted from [9]
(circuits shown in Fig. 2c). The current distribution network is
used to duplicate bias currents in close physical proximity
and thus reduce mismatch error. The pixel implementation
(layout and floorplan given in Fig. 1) consists of three main
components: the photodiode (30µm×30µm n-well/p-substrate
parasitic junction), the Analogue Signal Processor (ASP) and
the Asynchronous Binary Processor (ABP).

Fig. 2. Implementation details for the Micropower Object Positioning Vision Processor. Shown are: (a) General system architecture, (b) in-pixel organisation,
(c) Address-Event Representation circuits, (d) Analogue Signal Processing circuits and (e) Asynchronous Binary Processing Circuits.
[Panel (d) blocks: photo-transduction, logarithmic compression, local averaging,
global averaging, comparison, Threshold Detector, Edge Detector, Contour Detector.
Panel (e) blocks: Artificial Propagation Delay and FILL/RESET/CENTRE state
machines (RS flip-flop based), with the following logic equations:]
F_set(t + δt) = Co(t) + Th(t) · (F1(t) + F2(t) + F3(t) + F4(t))
F_reset(t + δt) = C(t − δt) · !C(t) + F(t) · (R1(t) + R2(t) + R3(t) + R4(t))
C_inhibit(t + δt) = C1(t) + C2(t) + C3(t) + C4(t) + C5(t) + C6(t) + C7(t) + C8(t)
C_set(t + δt) = !C_inhibit · !F(t) · F12(t) · F22(t) · F32(t) · F42(t)
C_reset(t + δt) = !F(t) · F1(t) · F2(t) · F3(t) · F4(t)
R_reset(t + δτ) = R_set(t)
R_set(t + δt) = F_reset(t+δt)

A. ASP Circuits (Fig. 2d)
These circuits extract two binary feature signals per pixel
(THRESHOLD and CONTOUR) from the input image.
1) Threshold Detection: Current-mode averaging and current
comparator circuits are used to generate the THRESHOLD
signal, indicating whether a pixel's intensity is below or
above the global average.
2) Edge/Contour Detection: A comparator based on a
current-starved differential pair [10] is used to detect edges
between each pair of adjacent pixels. Subsequently, using EDGE
signals from adjacent pixels, the CONTOUR signal is produced
on determining a continuous edge.
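
As a rough software analogue of the two ASP features, the sketch below
computes a THRESHOLD map against the global mean and EDGE flags between
adjacent pixels. The function names, the ratio-based contrast test and the
min_ratio parameter are assumptions made for illustration; the chip uses
current-mode circuits and a tuneable comparator, and the CONTOUR logic is
not modelled here.

```python
def threshold_map(image):
    """THRESHOLD analogue: True where a pixel exceeds the global average."""
    flat = [v for row in image for v in row]
    mean = sum(flat) / len(flat)
    return [[v > mean for v in row] for row in image]

def _contrast(a, b):
    # Ratio of the brighter to the darker intensity (assumes a, b > 0),
    # i.e. a difference test on log-compressed values.
    return max(a, b) / min(a, b)

def edge_maps(image, min_ratio=1.5):
    """EDGE analogue between every pair of adjacent pixels.

    Returns (between_columns, between_rows):
      between_columns[y][x] compares pixel (x, y) with (x + 1, y),
      between_rows[y][x]    compares pixel (x, y) with (x, y + 1).
    min_ratio is a hypothetical stand-in for the comparator tuning.
    """
    h, w = len(image), len(image[0])
    between_columns = [
        [_contrast(image[y][x], image[y][x + 1]) > min_ratio
         for x in range(w - 1)]
        for y in range(h)
    ]
    between_rows = [
        [_contrast(image[y][x], image[y + 1][x]) > min_ratio
         for x in range(w)]
        for y in range(h - 1)
    ]
    return between_columns, between_rows
```

With a bright column on the right of an otherwise uniform image, only the
bright pixels exceed the global mean and only the column boundary next to
them is flagged as an edge.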
B. ABP Circuits (Fig. 2e)
These circuits implement the distributed binary algorithm for
“centroid” detection, acting upon the binary outputs of the ASP.
1) FILL state machine: This determines when a pixel's fill
status becomes asserted, facilitating the inward “fill”. This is
either due to the pixel lying on an object contour or to an inward
propagation.
2) Artificial Delay: This limits the rate of inward propagation,
i.e. an artificial delay is inserted before each pixel's “Set
FILL” logic. The delay is created by thresholding the voltage
on a capacitor charged by a bias current.
3) CENTROID state machine: This determines whether a
CENTRE has been detected, by aggregating FILL inputs from
surrounding cells.
4) RESET state machine: This determines when a pixel's
fill status becomes reset, facilitating the outward “unfill” or
back-propagation. This is either due to the pixel detecting a
CENTRE or to a back-propagation.

Fig. 3. Test images with pixel grid overlaid indicating measured centroid
position and size. Included are: (a-c) Regular objects, (d-f) Irregular objects
and (g-i) Multiple objects. [Each panel is drawn on the 48×48 pixel grid,
(0,0) to (47,47), and is annotated with the measured centroid coordinates
and radius r of each detected object.]
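
To first order, the artificial delay of item 2) can be estimated by
assuming an ideal constant current charging a capacitor from zero up to a
switching threshold. The symbols $C$ (node capacitance), $V_{th}$
(switching threshold) and $I_{delay}$ (delay bias current) are introduced
here for illustration; their values are not given in the text.

```latex
% Constant-current charging of a capacitor: V(t) = I_{delay} t / C.
% The delay is the time at which V(t) reaches the threshold V_{th}.
\[
  t_d \;=\; \frac{C \, V_{th}}{I_{delay}}
\]
```

Under this idealisation the per-pixel delay is inversely proportional to
the bias current, so raising the bias current speeds up the inward
propagation.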
IV. FABRICATED SYSTEM
A. General Functionality
A custom testboard setup (Microcontroller/PC-based AER
readout) has been developed to confirm system functionality
within the intended design specifications. Sample images
projected onto the chip, and the corresponding measurements, are
presented in Fig. 3. This illustrates both single and multiple
object position and size determination. Typically the measured
object position and size are within the actual
boundaries. Furthermore, uneven objects are successfully detected
but with inaccurate centroid position estimates,
again within the actual object boundaries (see Fig. 3e,f).
However, overlapping objects are detected as a single uneven
object (Fig. 3d).
B. Measured Results
1) Accuracy: As expected, this system is intrinsically limited
to single pixel resolution. An interesting observation has
been a small random deviation (±2 pixels) in object position,
resulting in a similar deviation in object size. This deviation
has a pseudo-dithering effect and can provide sub-pixel
accuracy through successive averaging. It is attributed to
an edge effect caused by an imperfectly focused image or
a graded object boundary: the static (spatial)
fixed-pattern noise (FPN), coupled with the (temporal) flicker
noise within the edge detector blocks, provides this
statistically-biased dithering effect. This results in a mechanism
that allows processing time to be traded for accuracy, illustrated
by the trend shown in Fig. 4.

Fig. 4. Pseudo-dithering providing increased accuracy through successive
averaging for object (a) position and (b) radius.

Fig. 5. Dependence of process time on bias current, for input images with
maximum object sizes of 3, 4, 5, 6 and 8 pixel radius.
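
The trade between processing time and accuracy can be illustrated with a
toy model of dithered averaging. The sketch below assumes idealised
zero-mean uniform noise of one pixel width, which is simpler than the
statistically-biased noise described above; the function names and the
seed are hypothetical.

```python
import random

def measure_position(true_pos, rng, noise=0.5):
    """One pixel-quantised reading with idealised zero-mean noise.

    The noise model (uniform in [-noise, +noise]) is an assumption,
    not a measured property of the chip.
    """
    return round(true_pos + rng.uniform(-noise, noise))

def averaged_position(true_pos, n_frames, seed=0):
    """Average n_frames quantised readings of the same object position."""
    rng = random.Random(seed)
    readings = [measure_position(true_pos, rng) for _ in range(n_frames)]
    return sum(readings) / n_frames
```

A single reading of an object at 24.3 pixels can only return 24 or 25,
whereas the average over many frames approaches 24.3. Each extra frame
costs processing time, which is the trade-off the text describes.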
2) Processing Time: Although the asynchronous nature of
this system produces temporally uncorrelated events between
different objects (due to the local resetting), the algorithm can
be run in a “single-shot” mode. This is achieved by applying a
clock to the global reset input, thereby realising a clocked or
frame-based output. The limiting factor to processing speed is
the maximum object size being analysed, i.e. the duration of the
longest contour reduction cycle. This can be tuned, as the internal
propagation delay is controlled by the bias current, illustrated in Fig. 5.
3) Power Consumption: The measured power consumption
is generally within ±10% of the simulated levels. The effect of
illumination and bias current on power consumption is illustrated
in Fig. 6. An unusual feature is that decreasing the bias
current below the 2.5nA nominal substantially increases
power consumption. This is because the transconductance
within the edge comparators is reduced, causing the digital
logic to operate with intermediate voltage inputs.

Fig. 6. Measured total system current consumption illustrating dependence
on (a) illumination level and (b) edge detection bias current.

Fig. 7. Microphotograph of the Micropower Object Positioning Vision
Processor. [Labelled regions: 48x48 pixel array; AER communication; bias
distribution. Die: 5000μm × 5000μm.]
V. CONCLUSION
A vision processing chip has been presented for object
position and size determination. It is the ﬁrst system report-
ing parallel, multiple object (unlimited) processing capability.
Furthermore, the developed system demonstrates high computational
efficiency, implementing a computationally intensive
algorithm with micropower consumption. The fabricated
system (microphotograph shown in Fig. 7) has been shown to
utilise fixed pattern noise favourably, both reducing power
consumption (through increased mismatch on edge detectors)
and increasing accuracy for moving objects (through dithering
with successive sampling). The achieved system specification
is summarised in Table I.
TABLE I
SYSTEM PROPERTIES AND PERFORMANCE SUMMARY

Technology                              UMC 0.18µm MM/RF CMOS
Supply voltage                          1.8V core (3.3V I/O)
Bias current range                      50nA to 2µA (for Iaer)
                                        250pA to 10nA (for Ibias)
                                        50pA to 2nA (for Itune)
Photosensitivity                        100nW/cm² to 100mW/cm²
Responsivity                            0.32A/Wcm² (λ=650nm)
Pixel size                              85µm × 85µm
Surface fill factor                     12.46%
Pixel device count                      277
Pixel power                             †96.48 nW (total)
Die dimensions                          5mm × 5mm
Array size                              48 × 48 pixels
System device count                     745,200
System power                            †243.6 µW (total)
Accuracy (centroid and radius)          ±1 pixel
Equivalent image process time           0.5ms
Address-event bandwidth                 0.61 MHz (at Iaer=1µA)
Equivalent computational efficiency     †1.38 µW per MIPS

†For n=5, r=10, IphotoAv=6µW/cm², Ibias=2.5nA, Itune=250pA
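
Several headline figures can be cross-checked from the entries of Table I
with simple arithmetic. The short script below is only a consistency check
on the published numbers; the variable names are chosen here for
illustration.

```python
# Values copied from Table I and the Fig. 1 caption.
PIXEL_PITCH_UM = 85.0        # pixel side length
PHOTODIODE_SIDE_UM = 30.0    # active photodiode side length
ARRAY_SIDE = 48              # pixels per row and per column
PIXEL_POWER_NW = 96.48       # power per pixel
PIXEL_DEVICES = 277          # devices per pixel
PROCESS_TIME_MS = 0.5        # equivalent image process time
UW_PER_MIPS = 1.38           # equivalent computational efficiency

# Photodiode area as a percentage of the cell area.
fill_factor_pct = 100 * PHOTODIODE_SIDE_UM**2 / PIXEL_PITCH_UM**2

# Totals attributable to the pixel array alone.
n_pixels = ARRAY_SIDE**2
array_power_uw = n_pixels * PIXEL_POWER_NW / 1000
array_devices = n_pixels * PIXEL_DEVICES

# Frame rate and efficiency in the units used in the abstract.
frames_per_second = 1000 / PROCESS_TIME_MS
mips_per_mw = 1000 / UW_PER_MIPS

print(f"fill factor       {fill_factor_pct:.2f} %")
print(f"array power       {array_power_uw:.2f} uW")
print(f"array devices     {array_devices}")
print(f"frame rate        {frames_per_second:.0f} fps")
print(f"efficiency        {mips_per_mw:.1f} MIPS/mW")
```

The fill factor (12.46%) and frame rate (2000 frames per second) match the
stated values, and 1.38 µW per MIPS corresponds to about 724.6 MIPS per
milliwatt, in line with the roughly 725 quoted in the abstract. The pixel
array accounts for about 222.29 µW and 638,208 devices; the remainder of
the 243.6 µW and 745,200-device system totals would then lie outside the
pixels, although the table does not give that breakdown.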
ACKNOWLEDGMENT
The authors would like to acknowledge the Basic Tech-
nology grant (UKRC GR/R87642/02) and the AMx technol-
ogy grant (EPSRC GR/R96583/01), in addition to Toumaz
Technology Limited for supporting this research. The authors
would also like to thank Tor Sverre Lande and Julius Georgiou
for many useful discussions and Philipp Häfliger for providing
us with access to his address-event design libraries.
REFERENCES
[1] N. Massari, L. Gonzo, M. Gottardi and A. Simoni, “A Fast CMOS
Optical Position Sensor with High subpixel Resolution,” IEEE Trans.
on Instr. Meas., vol. 53, no. 1, pp. 116–123, 2004.
[2] B. H. Pio et al, “Integration of a Photodiode Array & Centroid
Processing on a single CMOS Chip for a Real-time Shack-Hartmann
Wavefront Sensor,” IEEE Sensors J., vol. 4, no. 6, pp. 787–794, 2004.
[3] R. Etienne-Cummings, V. Gruev and M. Abdel-Ghani, “VLSI Imple-
mentation of Motion Centroid Localization for Autonomous Naviga-
tion,” Advances in Neural Information Processing Systems, vol. 10,
pp. 685–691, 1998.
[4] G. Indiveri, “Neuromorphic analog VLSI sensor for visual tracking,”
IEEE Trans. Circuits Syst. II, vol. 46, no. 11, pp. 1337–1347, 1999.
[5] M. A. Clapp and R. Etienne-Cummings, “A Dual Pixel-type Array for
Imaging and Motion Centroid Localization,” IEEE Sensors J., vol. 2,
no. 6, pp. 529–548, 2002.
[6] J. Akita, A. Watanabe, O. Tooyama, M. Miyama, M. Yoshimoto, “An
Image Sensor with Fast Objects’ Position Extraction Function,” IEEE
Trans. Electron Devices, vol. 50, no. 1, pp. 184–190, 2003.
[7] T. Komuro, I. Ishii, M. Ishikawa and A. Yoshida, “A Digital Vision
Chip Specialized for High-Speed Target Tracking,” IEEE Trans. Electron
Devices, vol. 50, no. 1, pp. 191–199, 2003.
[8] T. G. Constandinou, T. S. Lande and C. Toumazou, “Bio-pulsating archi-
tecture for object-based processing in next generation vision systems,”
IEE Elec. Lett., vol. 30, no. 16, pp. 1169–1170, 2003.
[9] P. Häfliger, “A Spike-based Learning Rule and its Implementation in
Analog Hardware,” PhD thesis, ETH Zürich, Switzerland, 2000.
[10] T. G. Constandinou, J. Georgiou and C. Toumazou, “A Nanopower
Tuneable Edge Detection Circuit,” Proc. IEEE Int. Symp. on Circuits
Syst., vol. 1, pp. 449–452, 2004.
