Edge devices equipped with computer vision must deal with vast amounts of
sensory data with limited computing resources. Hence, researchers have been
exploring different energy-efficient solutions such as near-sensor processing,
in-sensor processing, and in-pixel processing, bringing the computation closer
to the sensor. In particular, in-pixel processing embeds the computation
capabilities inside the pixel array and achieves high energy efficiency by
generating low-level features instead of the raw data stream from CMOS image
sensors. While many in-pixel processing techniques have been demonstrated on
conventional frame-based CMOS imagers, processing-in-pixel for neuromorphic
vision sensors has not yet been explored. In this work, we propose, for the
first time, an asynchronous
non-von-Neumann analog processing-in-pixel paradigm to perform convolution
operations by integrating in-situ multi-bit, multi-channel convolution inside
the pixel array. The resulting analog multiply-and-accumulate (MAC) operations
consume significantly less energy than their digital counterparts. To make
this approach viable, we incorporate the circuit's non-idealities, leakage, and
process variations into a novel hardware-algorithm co-design framework that
leverages extensive HSpice simulations of our proposed circuit using the GF22nm
FD-SOI technology node. We verify our framework on state-of-the-art
neuromorphic vision sensor datasets and show that, on the IBM DVS128-Gesture
dataset, our solution consumes ~2x lower backend-processor energy than the
state-of-the-art while keeping the front-end (sensor) energy almost unchanged,
all while maintaining a high test accuracy of 88.36%.

Comment: 17 pages, 11 figures, 2 tables
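The idea of folding circuit non-idealities into the algorithm can be sketched behaviorally. The following is a minimal, hypothetical Python model, not the paper's HSpice circuit: the names `analog_mac`, `gain_sigma`, and `leak` are illustrative stand-ins for per-weight process variation and accumulated-charge leakage, and the numeric values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def analog_mac(inputs, weights, gain_sigma=0.05, leak=0.01):
    """Behavioral model of one in-pixel analog multiply-and-accumulate.

    Hypothetical sketch: each product is scaled by a per-weight gain
    drawn from a Gaussian (process variation), and a constant leakage
    term is subtracted from the accumulated sum. With gain_sigma=0 and
    leak=0 this reduces to an ideal dot product.
    """
    gains = 1.0 + gain_sigma * rng.standard_normal(weights.shape)
    return float(np.sum(inputs * weights * gains) - leak)

x = np.array([0.2, 0.5, 0.1])
w = np.array([0.4, -0.3, 0.8])
ideal = float(np.dot(x, w))     # ideal digital MAC result
noisy = analog_mac(x, w)        # non-ideal analog MAC result
print(ideal, noisy)
```

In a co-design framework of this kind, such a non-ideal MAC model would replace the ideal dot product during training so that the learned weights compensate for the circuit's behavior.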