Towards a bio-inspired mixed-signal retinal processor by Constandinou, TG et al.
TOWARDS A BIO-INSPIRED MIXED-SIGNAL RETINAL PROCESSOR 
Timothy G Constandinou¹, Julius Georgiou¹,³ and Chris Toumazou¹,²
1 EEE Dept, Imperial College of Science, Technology and Medicine, London, SW7 2BT, UK. 
2 Toumaz Technology Limited, Culham Science Centre, Abingdon, Oxfordshire, OX14 3DB, UK. 
3 Geosilicon Limited, 7 Thessalonikis Avenue, Strovolos, Nicosia 2020, Cyprus. 
ABSTRACT
A robust distributed architecture for real-time object-
based processing is presented for tasks such as object size, 
centre and count determination. This approach uses the 
input image to enclose a feedback loop to realize a data-
driven pulsating action. Outlined is the top level design 
for hardware implementation in a standard CMOS 
technology. 
1. INTRODUCTION
A modern image processing system typically uses an 
external camera to stream image data to a processor 
that executes a software algorithm. Such a modular scheme 
demands high bandwidth for the video 
transmission and therefore a heavy power budget. 
Several early filtering applications can benefit from 
combining the phototransduction and processing at the 
pixel level. A new breed of vision chips has recently 
emerged that strives to achieve precisely this. A generic 
reconfigurable architecture to provide such pixel-level 
processing is the cellular neural network processor [1]. 
Other systems have been inspired by the unparalleled 
computational efficiency of living organisms in solving 
complex image processing tasks. These biologically-
inspired (or retinomorphic [2]) systems have been realized 
to perform tasks such as image enhancement and feature 
extraction. Object-based processing is a fundamental task 
for many early vision applications. The segmentation of 
various objects in an image has been traditionally 
implemented in software using techniques such as the 
snake algorithm [3]. Only recently has 
dedicated hardware been developed for such tasks as 
object-based attention selection [4] and contour length 
measurement [5]. This paper proposes a novel scheme [6] 
suitable for such object-based computation based on a 
distributed processing architecture. Although several of 
the features have been biologically inspired, the algorithm 
is fundamentally synthetic. This hybrid approach yields 
a system that can realistically be implemented in 
hardware while benefiting from the increased computational 
efficiency of the bio-inspired analogue 
processing elements. The reduced power consumption 
enables mobile diagnostic devices which 
would otherwise be technically unachievable. 
The target application for hardware implementation is 
microscopic cellular population analysis as a 
microelectronic alternative to haemocytometry. The 
developed system (ORASIS) is required to provide 
cellular count and size information on microscopic images 
such as those shown in figure 1. 
Fig 1. Sample input images of red-blood cell specimens for 
application in microscopic cellular population analysis 
2. ALGORITHM [6] 
A continuous-time edge-detection technique is used to 
form the contours and trigger the data-driven processing. 
On detection of an object boundary, the initial state for the 
signal flow is set. By propagating an inward fill, the 
contour can be reduced until this converges to the centre. 
The central point is detected by utilizing spatiotemporal 
integration; i.e. a summation of the cells set within the 
receptive field within a certain time window. On centroid 
detection, the object is reset and output transmitted, thus 
realizing an inward pulsating action. The pulsation 
frequency therefore encodes the size (i.e. the radius) of the object. 
Figure 2 illustrates this interaction graphically through 
computer simulations. 
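The spatiotemporal integration used for centre detection can be sketched in software as follows. This is a minimal illustration only: the function name, the representation of set times and the time window are assumptions made for this example and do not describe the actual circuit.

```python
from typing import Optional, Sequence


def centre_detected(set_times: Sequence[Optional[float]],
                    now: float, window: float) -> bool:
    """Return True when every cell in the receptive field was set
    within the last `window` time units.

    set_times holds, for each cell in the (hypothetical) receptive
    field, the time at which it was set, or None if it is not set.
    """
    if not set_times:
        return False
    # Spatial summation over the field, restricted to a time window.
    recent = sum(1 for t in set_times
                 if t is not None and 0.0 <= now - t <= window)
    return recent == len(set_times)
```

For example, `centre_detected([4.0, 5.0, 5.0, 4.0], now=5.0, window=2.0)` reports a centre, whereas a field in which one cell was set much earlier, or not at all, does not.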
This scheme can be applied in two different modes of 
operation: either a single-shot “capture and process” 
mode or the continuous pulsating mode described above. 
The trade-off between these two modes of operation is 
accuracy (gained through averaging) versus power 
consumption (raised by the increased duty cycle). 
0-7803-8251-X/04/$17.00 ©2004 IEEE, ISCAS 2004, V-493
[Figure 2 panels: the input image followed by simulation 
snapshots at t=0 through t=10.] 
Fig. 2 Computer simulation results of the bio-pulsating 
contour reduction algorithm, with snapshots taken at time 
intervals equal to the propagation delay of the processing. 
3. METHOD
Objects are defined as regions in the image with intensity 
below (or above) the average level of the input image. 
The edges are detected by computing the difference in 
neighbouring cell intensities and contours are formed if a 
continuous edge is found; i.e. at the nodes which have two 
edges leading to them. The contour reduction is facilitated 
by setting a cell's state if any of its adjacent cells has 
been set and the object criterion is also satisfied. 
The rate of the contour reduction is preset by introducing 
a delay element in the propagation cycle. The unusual 
feature of this method is the absence of any pre-defined 
synchronisation signal, for example, a clock. The only 
synchronisation is obtained through the data-driven object 
reset scheme but on a local, rather than a global basis. The 
reset is generated on detection of an object centre. As 
previously mentioned, this detection involves checking 
that all local pixel-cells have been set within a certain time 
period. 
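The rules above can be illustrated with a small software model. The sketch below is a simplified, synchronous approximation for a single object: it works on pixel-cells rather than pixel corners, uses 4-connectivity, and advances the fill in discrete steps in place of the asynchronous delay elements. All function names are hypothetical.

```python
from typing import List, Tuple

STEPS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # assumed 4-connectivity


def object_mask(image: List[List[float]]) -> List[List[int]]:
    """Object criterion: intensity below the average level of the image."""
    flat = [v for row in image for v in row]
    mean = sum(flat) / len(flat)
    return [[1 if v < mean else 0 for v in row] for row in image]


def initial_contour(mask: List[List[int]]) -> List[List[int]]:
    """Set the object cells that touch a non-object cell or the border."""
    rows, cols = len(mask), len(mask[0])
    state = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            for dr, dc in STEPS:
                rr, cc = r + dr, c + dc
                outside = not (0 <= rr < rows and 0 <= cc < cols)
                if outside or not mask[rr][cc]:
                    state[r][c] = 1
                    break
    return state


def has_set_neighbour(state: List[List[int]], r: int, c: int) -> bool:
    rows, cols = len(state), len(state[0])
    return any(0 <= r + dr < rows and 0 <= c + dc < cols
               and state[r + dr][c + dc] for dr, dc in STEPS)


def pulse(mask: List[List[int]]) -> Tuple[int, List[Tuple[int, int]]]:
    """Run one inward fill.

    Returns the number of propagation steps and the last cells to be
    set, which serve here as the centre estimate.
    """
    rows, cols = len(mask), len(mask[0])
    state = initial_contour(mask)
    last = [(r, c) for r in range(rows) for c in range(cols) if state[r][c]]
    steps = 0
    while True:
        # A cell is set if it is an object cell with a set neighbour.
        new = [(r, c) for r in range(rows) for c in range(cols)
               if mask[r][c] and not state[r][c]
               and has_set_neighbour(state, r, c)]
        if not new:
            return steps, last
        for r, c in new:
            state[r][c] = 1
        last = new
        steps += 1
```

In this model the step count grows with the object radius, so with a fixed delay per step a larger object pulses at a lower frequency. A 5 x 5 dark square centred in a 7 x 7 bright image, for instance, converges on its middle cell after two steps.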
4. BIOLOGICALLY INSPIRED APPROACH 
Many of the implemented circuits and functions have in 
fact been biologically inspired. As in the mammalian 
retina, the front-end circuitry includes continuous time 
logarithmic photo detection, in addition to localised 
smoothing (averaging) and adaptive edge detection for the 
signal conditioning. Furthermore, the signal propagation 
based on localised interaction works in a similar way to 
the orientation-selective V1 cells in the primary visual 
cortex. The centroid determination is implemented using a 
pseudo centre-surround receptive field technique, very 
similar to the functional organisation of the retina. This 
has been implemented using delay-and-propagate and 
integrate-and-fire type neuronal circuits, producing a truly 
spike-domain output as in the case of ON/OFF ganglion 
cells.
5. HARDWARE IMPLEMENTATION 
The presented architecture is currently being realized as 
circuit blocks for implementation in a standard 0.18µm 
CMOS process provided by UMC. The circuit topology 
combines weak-inversion analogue circuitry, providing 
micropower operation, with asynchronous logic 
for robustness and stability.
The complete top level system architecture is shown in 
Figure 3. This contains an X*Y array of smart pixels, 
each containing both the photodetecting devices and local 
processing circuitry. At the column and row headers are 
address encoders which relay the data received through a 
digital bus for off-chip communication. Such an encoding 
scheme is often referred to as address event representation 
(AER). This approach is possible because the very low 
output bandwidth requirement avoids the need to poll 
all pixels. 
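As an illustration of event-driven readout, the sketch below packs the row and column address of each spiking cell into one bus word, so that only active cells produce output. The word format (6 bits per axis, row in the upper bits) is an assumption chosen to fit the 40 x 40 array of Table 1. The paper does not specify the bus encoding.

```python
from typing import Iterable, List, Tuple

ROWS, COLS = 40, 40   # array resolution taken from Table 1
ADDR_BITS = 6         # assumption: 6 bits are enough for 0..39


def encode_event(row: int, col: int) -> int:
    """Pack a (row, col) address into a single hypothetical bus word."""
    if not (0 <= row < ROWS and 0 <= col < COLS):
        raise ValueError("address outside the array")
    return (row << ADDR_BITS) | col


def decode_event(word: int) -> Tuple[int, int]:
    """Recover the (row, col) address from a bus word."""
    return word >> ADDR_BITS, word & ((1 << ADDR_BITS) - 1)


def aer_stream(spikes: Iterable[Tuple[int, int]]) -> List[int]:
    """Emit one word per spike; silent cells are never polled."""
    return [encode_event(r, c) for r, c in spikes]
```

With two centroid events in a frame, the stream carries two words, whereas a scanned readout would have to visit all 1600 cells.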
Since it contains several current-mode circuits, each pixel 
requires a bias current reference. A current-mode 
distribution scheme is adopted, implementing a tree-like 
hierarchy. The PTAT master reference supplies the bias 
currents to the four corners of the array. These corner 
currents are then duplicated for each row and 
subsequently for each column, resulting in each pixel 
receiving an individual bias. This vastly reduces errors 
arising from bias current variations, a major difficulty 
when using voltage-mode current distribution. The 
improved current matching is due to using low-proximity 
current-copying circuits, thus minimizing any mismatch 
errors, as discussed in further detail in section 6. 
[Figure 3 block diagram: a master current reference (PTAT), 
set by an off-chip resistor for reliable and tunable bias 
current generation, feeds current copiers along the rows and 
columns of the pixel-cell array. Row and column encoders 
deliver the spike-domain output as AER-encoded centroid 
positions. Control and tuning logic sets the edge detection 
tuning, resetting (global vs. local) and thresholding schemes.] 
Fig. 3 Top level system architecture of an X*Y array 
illustrating the bias distribution and output readout schemes. 
In order to facilitate the contour computation, the 
processing must occur at the pixel corners, as illustrated in 
the pixel-cell architecture shown in Figure 4.  
The required functional (pixel-level) blocks, all 
continuous-time topologies, are given below: 
a. Light detection: Active photodiode (continuous time) 
utilizing n+ implant p-substrate junction diodes.  
b. Edge detection: Discrete output using thresholding 
technique [7] utilizing differential current-mode 
hysteresis for computation of object contours. 
c. Local averaging: Narrow-field for input image 
smoothing and wide-field for object detection; using 
current-mode circuitry for thresholding. 
d. Local resetting: Dynamic switching regulated with 
local average current-mode thresholding [7] for 
object segmentation, to provide localised (object) 
resetting. 
e. Neuromorphic logic: performing delay-and-propagate 
computation for signal flow and centre-surround-like 
computation for centroid determination. 
f. Memory: Basic 1-bit memory implemented using 
digital (asynchronous) RS flip-flop for storage of 
present cellular state. 
[Figure 4 block diagram: four PIXEL blocks and four EDGE 
blocks surround a PROCESSING CORE containing the LOCAL + 
GLOBAL AVERAGE, LOCAL MEMORY, NEURO LOGIC and DYNAMIC 
SWITCH (LOCAL RESET) blocks.] 
Fig. 4 Proposed cellular architecture for object-based 
processing illustrating organisation and connectivity of 
functional blocks within a quad-pixel arrangement. 
Legend: memory output (to local cells); local averages (to 
local cells); readout grid (AER scheme); dynamic reset path 
(forms object node); pixel / edge signal. 
6. DEVICE MISMATCH 
Device matching is a fundamental design issue for circuits 
operating in weak inversion. Device mismatch arises from 
process parameter variations, mainly in gate oxide thickness 
and doping concentrations, resulting in device threshold 
voltage and drain current variations. Since the gm/I ratio 
is at a maximum for devices operating in weak inversion, 
subthreshold circuits are those most affected by device 
mismatch [8].  
In designing a system requiring image acquisition 
capabilities in standard CMOS technology, careful 
consideration must be given to such sources of error. 
Non-uniformities in the pixel array, referred to as fixed 
pattern noise (FPN), are mainly due to offset and gain 
mismatches between the in-pixel amplifiers. This error, if 
uncompensated, would normally render a processing 
algorithm unusable; however, the method presented has 
proved robust. The inherent immunity to both FPN and 
physical defects has been verified through computer 
simulations, as discussed in section 7. For this reason, 
and because of the high lighting conditions present in the 
target application, the required dynamic range is limited and 
therefore the FPN will not pose a serious problem.  
However, mismatch errors are not limited to FPN. All 
circuit blocks requiring critically matched device pairs or 
groups are susceptible to such errors, for example 
differential pairs or current mirrors. Consequently, all such 
circuits require additional attention from schematic design 
through layout. Specialist simulation techniques such as 
Monte-Carlo analysis, in addition to careful layout [9], 
can minimize these mismatch errors to improve both 
performance characteristics and production yields. 
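To give a feel for the sensitivity discussed in this section, the sketch below runs a simple Monte-Carlo experiment on a weak-inversion current mirror. It assumes the commonly used exponential model, in which drain current is proportional to exp((V_GS - V_T) / (n U_T)), so that a threshold offset dV_T between the two devices scales the copied current by exp(-dV_T / (n U_T)). The slope factor, thermal voltage and threshold spread are illustrative assumptions, not values from the ORASIS design.

```python
import math
import random
import statistics

U_T = 0.02585    # thermal voltage near 300 K, in volts
N_SLOPE = 1.5    # assumed subthreshold slope factor


def mirror_ratio(delta_vth: float) -> float:
    """Output/input current ratio of a weak-inversion mirror whose
    output device has a threshold voltage higher by delta_vth volts."""
    return math.exp(-delta_vth / (N_SLOPE * U_T))


def monte_carlo_spread(sigma_vth: float, runs: int = 10000,
                       seed: int = 1) -> float:
    """Standard deviation of the mirror ratio for Gaussian threshold
    mismatch with standard deviation sigma_vth (volts)."""
    rng = random.Random(seed)
    ratios = [mirror_ratio(rng.gauss(0.0, sigma_vth)) for _ in range(runs)]
    return statistics.pstdev(ratios)
```

For small offsets the spread approaches sigma_vth / (n U_T), which is the gm/I ratio multiplied by the threshold spread. Under these assumptions a 2mV threshold spread already gives a current error of roughly 5%.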
7. SIMULATED RESULTS 
The proposed system has been simulated at all levels; 
from the top-level distributed algorithm, to the bottom-
level photodiode device physics. These have been 
facilitated using a selection of simulation tools including 
the Cadence Spectre Simulator and MATLAB in addition 
to custom developed code. For the scope of this paper, 
only the high-level algorithmic simulations shall be 
discussed.
By using the above mentioned mathematical tools, the 
distributed algorithm has been simulated with a wide 
variety of input images. Artificial fixed-pattern noise 
(FPN) and process defects have been introduced to 
demonstrate the inherent robustness and fault-tolerant 
properties of the contour-reduction algorithm. Through 
successive simulation using randomly generated noise and 
defect errors, statistical data has been compiled to 
illustrate a trend for the robustness and stability, given in 
Figure 5. 
[Figure 5 plots: top, error (%) on an axis from 0 to 6 
against number of defects from 0 to 20; bottom, error (%) on 
an axis from 0 to 20 against FPN (%) from -50 to 50.] 
Fig. 5 Statistical data illustrating robustness to defects (top) 
and FPN (bottom), acquired through successive computer 
simulation of the bio-pulsating contour reduction algorithm.
Legend: object size computation; object count computation. 
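The kind of experiment behind such robustness data can be sketched as below. This is not the authors' simulation code: it injects gain FPN drawn uniformly within a given percentage, and counts objects with a plain connected-component search as a stand-in for the contour-reduction algorithm. All names and the test image are hypothetical.

```python
import random
from typing import List

Image = List[List[float]]


def apply_fpn(image: Image, fpn_percent: float, rng: random.Random) -> Image:
    """Scale each pixel by a gain drawn uniformly from 1 +/- fpn_percent/100."""
    span = fpn_percent / 100.0
    return [[v * (1.0 + rng.uniform(-span, span)) for v in row]
            for row in image]


def count_objects(image: Image) -> int:
    """Count 4-connected regions darker than the image average."""
    rows, cols = len(image), len(image[0])
    mean = sum(map(sum, image)) / (rows * cols)
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] >= mean or seen[r][c]:
                continue
            count += 1
            seen[r][c] = True
            stack = [(r, c)]
            while stack:  # flood fill over the current object
                y, x = stack.pop()
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] < mean):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return count


def count_error_percent(image: Image, fpn_percent: float,
                        trials: int, seed: int = 0) -> float:
    """Mean relative count error (%) over repeated random FPN patterns.
    Assumes the clean image contains at least one object."""
    rng = random.Random(seed)
    truth = count_objects(image)
    total = 0.0
    for _ in range(trials):
        noisy = apply_fpn(image, fpn_percent, rng)
        total += abs(count_objects(noisy) - truth) / truth
    return 100.0 * total / trials
```

Sweeping `fpn_percent` and plotting the returned error would give a curve of the same form as the lower plot, although the numbers from this toy model are not comparable with those reported in Figure 5.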
8. TARGET SPECIFICATIONS 
The target design specifications for the ORASIS chip are 
listed in table 1. 
Level     Parameter               Value
Cellular  Technology              UMC 0.18µm 1P6M CMOS
Cellular  Supply voltage          1.8V core (3.3V I/O)
Cellular  Dynamic range           50mWm⁻² to 5kWm⁻²
Cellular  Responsivity            0.2AW⁻¹m⁻² @ λ=500nm
Cellular  Maximum tolerable FPN   +/- 15%
Cellular  Cell area               90µm x 90µm
Cellular  Active fill factor      11%
Cellular  Pixel power             18nW (typical)
Cellular  Edge power              20nW (maximum)
Cellular  Averaging power         95nW (typical)
Cellular  Logic power             5nW (maximum)
Cellular  Total online power      138nW
System    Chip area               25mm²
System    Array resolution        40 x 40 cells
System    Total array power       345µW (typical)
System    Total periphery power   100µW (maximum)
System    Total online power      345µW
System    Duty cycle (online)     10%
System    Total effective power   44.5µW
Table 1 Target design specifications for ORASIS cell- and 
system level hardware implementation 
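The power figures in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below sums the four cell-level contributions and then applies the duty cycle to the combined array and periphery power. Treating the effective power as the duty-cycled sum of those two figures is an interpretation made here, not something stated in the table.

```python
# Cell-level power contributions from Table 1, in nW.
CELL_POWER_NW = {"pixel": 18, "edge": 20, "averaging": 95, "logic": 5}

# System-level figures from Table 1.
ARRAY_POWER_UW = 345
PERIPHERY_POWER_UW = 100
DUTY_CYCLE_PERCENT = 10


def cell_total_nw() -> int:
    """Total online power of one cell, in nW."""
    return sum(CELL_POWER_NW.values())


def effective_power_uw() -> float:
    """Assumed model: (array + periphery) power scaled by the duty cycle."""
    return (ARRAY_POWER_UW + PERIPHERY_POWER_UW) * DUTY_CYCLE_PERCENT / 100


print(cell_total_nw())       # 138
print(effective_power_uw())  # 44.5
```

Both results agree with the corresponding table entries: 138nW per cell and 44.5µW total effective power.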
9. CONCLUSION
This paper outlines the top-level hardware implementation 
of the bio-pulsating contour reduction algorithm [6]. This 
is a parallel, distributed algorithm performing 
asynchronous object recognition, breaking the bottleneck 
of the traditional, sequential von Neumann 
computational paradigm. The globally asynchronous 
scheme is regulated by employing data-generated local 
synchronisation, reducing power consumption and 
improving the signal-to-noise ratio. By incorporating the 
processing in the front end, the bandwidth requirements 
have been reduced by at least four orders of magnitude. 
Both the functionality and robustness have been verified 
through extensive computer simulation. By 
implementing an explicit architecture, the hardware 
realisation has been targeted for micropower operation, 
realising a retinal vision processor.  
10. ACKNOWLEDGEMENTS
The authors would like to acknowledge the Basic 
Technology grant (UKRC GR/R87642/02) and the AMx 
technology grant (EPSRC GR/R96583/01), in addition to 
Toumaz Technology Limited for sponsoring this research. 
11. REFERENCES
[1] T. Roska, A. Rodriguez-Vazquez, “Towards visual 
microprocessors,” Proc. IEEE, Vol. 90, pp. 1244-1257, 
2002.
[2] K.A. Boahen, “A retinomorphic vision system,” IEEE 
Micro, Vol. 16, pp. 30-39, 1996. 
[3] M. Kass, A. Witkin and D. Terzopoulos, “Snakes: active 
contour models,” Int. J. Comput. Vision, Vol. 1, pp. 321-331, 
1988. 
[4] T.G. Morris, T.K. Horiuchi and S.P. DeWeerth, “Object-
based selection within an analog VLSI visual attention 
system,” IEEE Trans. Circuits Syst. 2, Vol. 45, pp. 1564-1572, 
1998. 
[5] S. Liu and J.G. Harris, “Dynamic wires: an analog VLSI 
model for object processing,” Int. J. Comput. Vision, Vol. 
8, pp. 221-239, 1992. 
[6] T.G. Constandinou, T.S. Lande, C. Toumazou, “Bio-
pulsating architecture for object-based processing in next 
generation vision systems,” IEE Electronics Letters, Vol. 
39 (16), pp. 1169-1170, 2003. 
[7] T.G. Constandinou, J. Georgiou and C. Toumazou, “A 
nanopower mixed-signal edge-detection circuit for pixel-
level processing in next generation vision systems,” IEE 
Electronics Letters, Vol. 39 (25), pp. 1774-1775, 2003. 
[8] A. Pavasovic, A.G. Andreou, C.R. Westgate, 
“Characterisation of Subthreshold MOS Mismatch in 
Transistors for VLSI Systems,” Analog IC’s & Signal 
Proc., Vol. 6, pp. 75-85, 1994. 
[9] R.A. Hastings, “The Art of Analog Layout,” Prentice Hall, 
2000.
