AN ON/OFF SPIKING PHOTORECEPTOR FOR ADAPTIVE ULTRAFAST/ULTRAWIDE DYNAMIC RANGE VISION CHIPS

Timothy G Constantinou¹, Patrick Degenaar², Donal Bradley³, Chris Toumazou⁴

¹ Department of Electronic Engineering, Imperial College, London, UK
² Department of Bioengineering, Imperial College, London, UK
³ Department of Physics, Imperial College, London, UK
⁴ The Institute of Biomedical Engineering, Imperial College, London, UK

ABSTRACT

In this work we present an adaptable spike generator circuit for intelligent vision chips. Our aim is to realize an adaptable vision chip which can adjust for localized gain control, wide dynamic range, high temporal response, high spatial resolution and low power. To this end we have developed a pulse frequency modulation spike encoder which is capable of providing very high dynamic ranges with power consumption similar to that of the animal retina. At lower dynamic ranges, frequency responses of over 2kHz are possible.

Our circuit is based on the on-off opponency algorithm used by the human eye, and allows for integration of other functions such as tomographic increase in spatial resolution and information compression. Both silicon area and power consumption are kept to a minimum.

This paper discusses the algorithm, its implementation and simulated results describing its response and power consumption.

1. INTRODUCTION

The recent dramatic rise in popularity of digital imaging has been the result of the quality and cost performance of charge coupled device (CCD) technology. CCDs have become very capable in high-end imaging applications, for example in taking high quality color images at mega-pixel resolutions. However, CMOS imagers offer superior integration, power dissipation and system size at lower cost. In recent years, the integration of phototransduction elements with image processing has prompted vigorous activity in the development of smart CMOS imagers. The retinomorphic vision community has been taking the process a step further by attempting to implement algorithms inspired by animal visual systems to develop intelligent vision chips.

The human eye is capable of detecting more than 8 orders of magnitude of light intensity. It can achieve this with frequency responses of 25Hz in perceptive vision and up to 150Hz in microsaccades. This remarkable capability is rooted in the rhodopsin photocascade [1] in the rods and cones within the retina, in addition to an on-off spike rate encoding scheme.

Inorganic silicon photodiodes are capable of up to five orders of magnitude of dynamic range. Usually however, only an 8-bit dynamic range per channel is implemented due to the relatively high power consumption of higher bit-rate signal conditioning circuits. This can be a problem for scenes with large variations in image intensity. Early work by Delbrück and Mead [2] [3] led to an adaptive photoreceptor which could detect the contrast regardless of overall light intensity. This chip became the mainstay of the neuromorphic vision community.

It is, however, possible to use a spike rate encoding algorithm similar to that in animal vision such as the human eye [4] [5]. By changing from voltage or current space to frequency space, it is possible to achieve wide dynamic ranges at lower power consumption. The trade-off between high dynamic range and high frequency response can be adapted by changing the number of spikes to be counted.

The major drawback of any integrating system is that the frequency response is low for low light intensities. Here again we can learn from nature by implementing complementary on and off channels [5]. Using on-off opponency, where on-cells spike at high frequency at high light levels, and off-cells spike at high frequency at low light levels, there will always be an adequate frequency response, even at low light levels. Therefore either the on or off cell will always provide a high firing rate and thus a fast frequency response. To counteract redundancy in the output information stream, a winner-takes-all type of circuit has been implemented to only output the maximum firing rate with an extra signal to state whether it is an on or off signal.

2. SYSTEM ALGORITHM

In previous neuromorphic vision chips pixel sizes have tended to be around 100µm x 100µm with a fill factor of around 10%. This has tended to work against creating imaging chips with high pixel densities. Our configuration can be seen in Figure 1. A single spike encoder is shared between 7 photodiodes, thus increasing the fill factor. The spike encoder can take inputs from individual photodiodes or from all of them using a switched arrangement. The current is buffered and mirrored to create on and off channels. In the off channel the current is inverted, such that low photocurrents create high off-currents and vice versa. The two channels then compete by integrating their currents into voltages through capacitance. This voltage is released in the form of a spike once a trigger threshold has been surpassed, and the charge collected is reset. To reduce redundancy, only the first spike, whether on or off, is released and both channels are reset. A complementary output is sent to indicate an on or off spike. Hysteresis is added to stop the circuit oscillating between on and off when the light intensity is close to the threshold between light and dark channels.

The circuit schematic can be seen in Figure 2. A simple current mirror is used to copy the photocurrent (iphoto) to two separate branches. One copy is fed into a predefined current sink (ibias), which effectively produces the difference of these two currents, i.e. (ibias - iphoto). This in turn is mirrored to obtain the OFF current. The bias current is chosen such that at maximum light intensity the OFF current is zero, i.e. ibias = iphoto(max).
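The current relations described in this paragraph can be written out compactly. This is a summary sketch only: the labels i_ON and i_OFF are introduced here for the two channel currents, and unity mirror gain is assumed, which the text does not state explicitly.

```latex
\begin{align}
i_{\mathrm{ON}}  &= i_{\mathrm{photo}} \\
i_{\mathrm{OFF}} &= i_{\mathrm{bias}} - i_{\mathrm{photo}} \\
i_{\mathrm{bias}} &= i_{\mathrm{photo(max)}}
  \;\Rightarrow\; i_{\mathrm{OFF}} = 0 \text{ at maximum light intensity}
\end{align}
```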

The ON and OFF currents are then each integrated onto the parasitic capacitance of their respective nodes to create an increasing voltage. High-gain digital buffers are used for threshold detection, and the first channel (ON or OFF) to reach threshold is passed through a logic OR gate. An additional output is provided to specify whether the response is ON or OFF by using an RS flip-flop to determine which channel is dominant. Hysteretic feedback is provided to the digital buffers to give a 10-20% lag on channel selection changeover, preventing rapid channel toggling when the ON and OFF responses are comparable. The two outputs (SPIKE and ON/OFF) are also combined into a single signal by using a logic XOR operation.
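The competition between the two channels can be illustrated with a small behavioural model. This is a minimal sketch under stated assumptions, not the circuit itself: each channel is treated as an ideal integrator that fires once a fixed charge `q_thresh` has accumulated, the hysteresis is modelled as a margin the challenging channel must exceed, and the function name, the `SpikeEvent` type and all parameter values are hypothetical. An ideal linear integrator of this kind is not expected to reproduce the simulated firing rates across the full intensity range.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpikeEvent:
    interval_s: float  # time for the selected channel to reach threshold
    is_on: bool        # True if the ON channel is selected, False for OFF


def next_spike(i_photo: float, i_bias: float, q_thresh: float,
               prev_is_on: Optional[bool] = None,
               hysteresis: float = 0.15) -> SpikeEvent:
    """Idealised ON/OFF competition for one spike (behavioural sketch).

    i_photo    photocurrent in amperes (ON channel current)
    i_bias     bias current in amperes; OFF current is i_bias - i_photo
    q_thresh   hypothetical charge needed to reach the trigger threshold
    prev_is_on channel selected for the previous spike, if any
    hysteresis fractional margin (assumed 10-20%) needed to change channel
    """
    if i_bias <= 0 or i_photo < 0:
        raise ValueError("expected i_bias > 0 and i_photo >= 0")

    i_on = i_photo
    i_off = max(i_bias - i_photo, 0.0)

    if prev_is_on is None:
        # No history: the larger current reaches threshold first.
        is_on = i_on >= i_off
    elif prev_is_on:
        # OFF must exceed ON by the hysteresis margin to take over.
        is_on = not (i_off > i_on * (1.0 + hysteresis))
    else:
        # ON must exceed OFF by the hysteresis margin to take over.
        is_on = i_on > i_off * (1.0 + hysteresis)

    i_win = i_on if is_on else i_off
    return SpikeEvent(interval_s=q_thresh / i_win, is_on=is_on)


if __name__ == "__main__":
    # Hypothetical values: 10 nA bias and a 30 fC threshold charge.
    for i_photo in (1e-12, 4.8e-9, 5.2e-9, 10e-9):
        print(i_photo, next_spike(i_photo, 10e-9, 30e-15, prev_is_on=False))
```

With the previous channel set to OFF, a photocurrent only slightly above half the bias current leaves the OFF channel selected, which is the toggling suppression that the hysteretic feedback is intended to provide.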

The circuit implementation can be seen below in Figure 3, the total silicon area being 880 µm². This would lead to a fill factor of 52% with 30µm x 30µm photodiodes. However, it is possible to share the spike generator circuit amongst multiple photodiodes to increase the fill factor and/or photodiode density. In our configuration we envisage sharing the photoreceptor between a local hexagonal neighbourhood containing 7 photodiodes.

Test photodiodes have been previously fabricated and tested in the target technology (UMC 0.18µm CMOS). The measured current/intensity characteristics for the photodiode to be used can be seen in Figure 4. The response is linear for an incident power from 50pW to greater than 100nW. The minimum detectable light is set by the dark current, which is determined by the bias. While increasing bias increases the frequency response and quantum efficiency, it also increases the dark current by a much greater factor. In our configuration the photodiode is reverse biased by 1.5V, leading to a dark current of 4.8fA/µm². The photodiode quantum efficiency is 76% at 530nm wavelength.
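The quoted quantum efficiency can be related to a responsivity in A/W using the standard relation R = ηqλ/(hc). The sketch below applies it to the 76% efficiency at 530nm given above; the function names are illustrative, and the result is an idealised figure rather than a measured one.

```python
# Physical constants (exact SI values)
PLANCK_J_S = 6.62607015e-34
LIGHT_SPEED_M_S = 299792458.0
ELECTRON_CHARGE_C = 1.602176634e-19


def responsivity_a_per_w(quantum_efficiency: float, wavelength_m: float) -> float:
    """Ideal photodiode responsivity R = eta * q * lambda / (h * c)."""
    return (quantum_efficiency * ELECTRON_CHARGE_C * wavelength_m
            / (PLANCK_J_S * LIGHT_SPEED_M_S))


def photocurrent_a(incident_power_w: float, quantum_efficiency: float,
                   wavelength_m: float) -> float:
    """Photocurrent for a given incident optical power, ignoring dark current."""
    return incident_power_w * responsivity_a_per_w(quantum_efficiency, wavelength_m)


if __name__ == "__main__":
    r = responsivity_a_per_w(0.76, 530e-9)
    print(f"responsivity = {r:.3f} A/W")  # about 0.325 A/W
```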

The current characteristics from this photodiode were used in the circuit simulation for the spike generator. For the purposes of the circuit simulation we have used a current variation of 1pA to 10nA corresponding to 25nWcm⁻² to 2.6mWcm⁻². These 5 decades of intensity variation correspond to the difference between starlight and a well lit room. The photocurrents on a 7 pixel photoreceptor group can be added to give better dynamic range in dark conditions.

5. CIRCUIT SIMULATION

The circuit was simulated using the Cadence Spectre (5.0.33) simulator with BSIM 3v3 models for the MOS devices combined with a photodiode model derived from the measured parameters, shown in Figure 4 above.

The simulation results for the individual ON and OFF channels are shown in Figure 5. The slow responses can be seen for the ON channel at low light intensities and the OFF channel at high light intensities.

The simulation results of the competing ON-OFF channel spike generator are shown in Figure 6. The response shows good variation between light and dark over many orders of magnitude and the final output shows good distinction between ON and OFF channels. The hysteresis at the transitions can also be clearly seen.

The spike interval at the maximum firing rate is 3µs, corresponding to over 500kHz in response when considering that the data is effectively compressed by half. 500kHz is sufficient to provide a 16-bit dynamic range at 5Hz refresh on a single pixel. This maximum firing rate is limited by the combined parasitic capacitance at the integrating node. This capacitance is due mainly to the large PMOS device used for supplying the charging currents, in addition to all other smaller devices connected to this node. The spiking rate could be increased by scaling the current mirrors at the expense of quiescent power consumption.
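The relation between firing rate, bit depth and refresh rate can be checked with a short calculation. It assumes, as a simplification not spelled out in the text, that an N-bit reading requires counting up to 2^N spikes at the maximum effective rate; the function name is illustrative.

```python
def max_refresh_rate_hz(max_spike_rate_hz: float, bits: int) -> float:
    """Upper bound on refresh rate when a full-scale reading needs 2**bits spikes."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    return max_spike_rate_hz / (2 ** bits)


if __name__ == "__main__":
    for bits in (8, 16):
        rate = max_refresh_rate_hz(500e3, bits)
        print(f"{bits}-bit: up to {rate:.1f} Hz")
```

Under this assumption a 500kHz effective rate allows roughly 7.6Hz at 16 bits, consistent with the 5Hz refresh quoted above, and roughly 2kHz at 8 bits.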

The power dissipation of this circuit arises from two sources: the continuous current flow in the current mirrors (the static power) and the digital switching (the dynamic power). The total current consumption is illustrated in Figure 7.

The quiescent (or static) current consumption is approximately 3nA. The dynamic current consumption is 370µA per 1.5ns spike in a 3µs window. Thus the energy consumption per spike is 500fJ. Given the competition between the ON and OFF channels, the minimum frequency the circuit will operate at is 500Hz. In this regime the quiescent power consumption is 5nW compared to 125pW for the spiking. However, for most of the operation at 5kHz to 500kHz it is the quiescent power consumption which will dominate. Thus, averaging this quiescent power over the pulse train gives 20.5pJ of energy per spike, which is comparable to the bit-energy of 2-20pJ/bit for the blow fly retina [1]. An 8-bit output therefore has a power equivalent of 5nW per pixel.
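The energy-per-spike figure can be reproduced with a short calculation. This is one possible reading of the numbers rather than a statement of the authors' method: it assumes an actual spike rate of 250 spikes per second in the minimum-frequency regime, a value inferred here from the quoted 125pW of spiking power divided by 500fJ per spike. The constant and function names are illustrative.

```python
STATIC_POWER_W = 5e-9        # quiescent power quoted in the text
DYNAMIC_ENERGY_J = 500e-15   # switching energy per spike quoted in the text


def dynamic_power_w(spike_rate_hz: float) -> float:
    """Average switching power at a given spike rate."""
    return DYNAMIC_ENERGY_J * spike_rate_hz


def energy_per_spike_j(spike_rate_hz: float) -> float:
    """Static power averaged over the pulse train plus the switching energy."""
    if spike_rate_hz <= 0:
        raise ValueError("spike rate must be positive")
    return STATIC_POWER_W / spike_rate_hz + DYNAMIC_ENERGY_J


if __name__ == "__main__":
    rate = 250.0  # assumed spikes per second, inferred from 125 pW / 500 fJ
    print(f"dynamic power    = {dynamic_power_w(rate) * 1e12:.0f} pW")
    print(f"energy per spike = {energy_per_spike_j(rate) * 1e12:.1f} pJ")
```

Under this assumption the static contribution is 20pJ per spike and the switching contribution 0.5pJ, giving the 20.5pJ total.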

6. DISCUSSION

In this paper we present a spiking photoreceptor which is capable of providing a high frequency response. We envisage its use for autonomous robotic vision applications, where the robot would wish to switch between different modes of vision such as high temporal-low spatial response and high spatial-low temporal response. Adding a jitter to the imaging system would allow for spatial tomography. These adaptable capabilities are given in Table 1.

<table>
<thead>
<tr>
<th>Mode</th>
<th>Dynamic range</th>
<th>Frequency response</th>
<th>Shared photo-diodes</th>
</tr>
</thead>
<tbody>
<tr>
<td>Temporal</td>
<td>8-bit</td>
<td>2kHz</td>
<td>1</td>
</tr>
<tr>
<td>Spatial</td>
<td>8-bit</td>
<td>140Hz</td>
<td>7</td>
</tr>
<tr>
<td>Spatial with x10</td>
<td>8-bit</td>
<td>14Hz</td>
<td>7</td>
</tr>
<tr>
<td>High dynamic range</td>
<td>16-bit</td>
<td>5Hz</td>
<td>1</td>
</tr>
</tbody>
</table>

Table 1: Various configurations with different dynamic range possibilities

The asynchronous output of the system could be connected to existing address event representation (AER) protocols [6] or addressing pulse counter protocols. This in effect is similar to the way the human eye functions, where the majority of the photoreceptors form the high temporal resolution (up to 150Hz) peripheral vision and a small percentage act at lower temporal resolution (25Hz) but higher spatial resolution to form the fovea. In fact these frequency responses are increased to allow for saccadic jitter, which may be responsible for increasing spatial resolution (so-called hyperacuity) in the eye [7].

7. CONCLUSION & FURTHER RESEARCH

We have presented a biologically inspired technique to obtain optical information from vision chips. Both high frequency responses and wide dynamic ranges are possible. The energy consumption is approximately 20pJ per spike. A chip has been sent out for fabrication and further work is being carried out on the switching and tomography protocols. The main challenges involve compressing the information contained in the output spikes, connecting to AER type protocols, and developing a scalable imaging chip architecture based around the hexagonal array.

8. ACKNOWLEDGEMENTS

The authors wish to acknowledge the Basic Technology grant (UKRC GR/R87642/02) and the AMx technology grant (EPSRC GR/R96583/01), in addition to Toumaz Technology Limited, for supporting this research.

9. REFERENCES


