Spiking Neural Networks (SNNs) have recently attracted widespread research
interest as an efficient alternative to traditional Artificial Neural Networks
(ANNs) because of their capability to process sparse and binary spike
information and avoid expensive multiplication operations. Although the
efficiency of SNNs can be realized on the In-Memory Computing (IMC)
architecture, we show that the energy cost and latency of SNNs scale linearly
with the number of timesteps used on IMC hardware. Therefore, in order to
maximize the efficiency of SNNs, we propose input-aware Dynamic Timestep SNN
(DT-SNN), a novel algorithmic solution to dynamically determine the number of
timesteps during inference on an input-dependent basis. By calculating the
entropy of the accumulated output after each timestep, we can compare it to a
predefined threshold and decide if the information processed at the current
timestep is sufficient for a confident prediction. We deploy DT-SNN on an IMC
architecture and show that it incurs negligible computational overhead. We
demonstrate that our method uses only 1.46 timesteps on average to achieve the
accuracy of a 4-timestep static SNN while reducing the energy-delay product by
80%.

Comment: Published at Design & Automation Conferences (DAC) 202
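
The entropy-based exit rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the softmax over the time-averaged output, and the normalization of the entropy by log(number of classes) are all assumptions made here for illustration, and the per-timestep outputs are passed in as precomputed arrays rather than produced by a real SNN.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array of class scores.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def normalized_entropy(p):
    # Shannon entropy scaled to [0, 1] by log(num_classes) (an assumption of this sketch).
    p = np.clip(p, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum() / np.log(len(p)))

def dynamic_timestep_infer(step_outputs, threshold):
    """Hypothetical helper: accumulate per-timestep outputs and stop early
    once the entropy of the accumulated prediction drops below `threshold`.

    step_outputs: list of 1-D arrays, one per timestep (stand-ins for SNN outputs).
    Returns (predicted_class, timesteps_used).
    """
    acc = np.zeros_like(step_outputs[0], dtype=float)
    for t, out in enumerate(step_outputs, start=1):
        acc += out
        probs = softmax(acc / t)  # prediction from the output accumulated so far
        if normalized_entropy(probs) < threshold:
            break  # confident enough: skip the remaining timesteps
    return int(np.argmax(probs)), t

# An ambiguous first timestep followed by a decisive second one.
outputs = [np.array([1.0, 1.0, 0.0, 0.0]), np.array([20.0, 0.0, 0.0, 0.0]),
           np.array([20.0, 0.0, 0.0, 0.0]), np.array([20.0, 0.0, 0.0, 0.0])]
prediction, used = dynamic_timestep_infer(outputs, threshold=0.3)
```

In this sketch, easy inputs exit after the first timestep and hard inputs run to the maximum, so the average number of timesteps depends on both the input distribution and the chosen threshold.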