The high volume of data transmission between the edge sensor and the cloud
processor leads to energy and throughput bottlenecks for resource-constrained
edge devices focused on computer vision. Hence, researchers are investigating
different approaches (e.g., near-sensor processing, in-sensor processing, and
in-pixel processing) that execute computations closer to the sensor to reduce
the transmission bandwidth. Specifically, in-pixel processing for neuromorphic
vision sensors (e.g., dynamic vision sensors (DVS)) involves incorporating
asynchronous multiply-accumulate (MAC) operations within the pixel array,
resulting in improved energy efficiency. In a CMOS implementation, a low-overhead,
energy-efficient analog MAC accumulates charge on a passive capacitor;
however, the capacitor's limited charge retention time constrains the choice of
algorithmic integration time, which in turn affects accuracy, bandwidth,
energy, and training efficiency. Consequently, this results in a hardware design
trade-off, creating a need for a low-leakage compute unit
that maintains the area and energy benefits. In this work, we present a
holistic analysis of the hardware-algorithm co-design trade-off based on the
limited integration time imposed by the hardware and techniques to improve the
leakage performance of the in-pixel analog MAC operations.

Comment: 6 pages, 4 figures, 1 table
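
To make the retention-time constraint concrete, the following is a minimal sketch of how leakage could erode an accumulated MAC value as the integration window grows. It is not the circuit model used in this work: it assumes a simple first-order (exponential) leakage with a hypothetical time constant `tau`, and the function names, event times, and weights are all illustrative.

```python
import math

def leaky_mac(event_times, weights, t_read, tau):
    """Accumulated value at t_read when every contribution decays
    exponentially with time constant tau (assumed first-order leakage)."""
    return sum(
        w * math.exp(-(t_read - t) / tau)
        for t, w in zip(event_times, weights)
        if t <= t_read
    )

def ideal_mac(event_times, weights, t_read):
    """Same accumulation on a lossless capacitor (no leakage)."""
    return sum(w for t, w in zip(event_times, weights) if t <= t_read)

def retained_fraction(t_int, tau, n_events=100):
    """Ratio of leaky to ideal accumulation for unit-weight events
    spread uniformly over an integration window of length t_int."""
    times = [t_int * i / n_events for i in range(n_events)]
    weights = [1.0] * n_events
    return leaky_mac(times, weights, t_int, tau) / ideal_mac(times, weights, t_int)

# Hypothetical numbers in arbitrary time units: a fixed retention
# time constant and progressively longer integration windows.
tau = 10.0
for t_int in (1.0, 5.0, 10.0, 50.0):
    print(f"t_int={t_int:5.1f}  retained={retained_fraction(t_int, tau):.3f}")
```

Under these assumptions, the retained fraction falls as the integration time approaches and exceeds the retention time constant, which is the kind of coupling between hardware leakage and algorithmic integration time that the analysis addresses.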