Q-SpiNN: A Framework for Quantizing Spiking Neural Networks
Quantization is a prominent technique for reducing the memory footprint of Spiking Neural
Networks (SNNs) without significantly decreasing accuracy. However, state-of-the-art works
focus only on applying weight quantization directly under a single quantization scheme, i.e.,
either post-training quantization (PTQ) or in-training quantization (ITQ), and do not consider
(1) quantizing other SNN parameters (e.g., the neuron membrane potential), (2) exploring
different combinations of quantization approaches (i.e., quantization schemes, precision
levels, and rounding schemes), and (3) selecting, at the end, the SNN model with a good
memory-accuracy trade-off. Therefore, the memory savings offered by these state-of-the-art
works for a targeted accuracy are limited, which hinders processing SNNs on
resource-constrained systems (e.g., IoT-Edge devices). To address this, we propose Q-SpiNN, a
novel quantization framework for memory-efficient SNNs. The key mechanisms of
Q-SpiNN are: (1) employing quantization for different SNN parameters based on
their significance to the accuracy, (2) exploring different combinations of
quantization schemes, precision levels, and rounding schemes to find efficient
SNN model candidates, and (3) developing an algorithm that quantifies the
benefit of the memory-accuracy trade-off obtained by the candidates, and
selects the Pareto-optimal one. The experimental results show that, for the
unsupervised network, Q-SpiNN reduces the memory footprint by ca. 4x while
maintaining accuracy within 1% of the baseline on the MNIST dataset. For
the supervised network, Q-SpiNN reduces the memory by ca. 2x while keeping
accuracy within 2% of the baseline on the DVS-Gesture dataset.
Comment: Accepted for publication at the 2021 International Joint Conference
on Neural Networks (IJCNN), July 2021, Virtual Event.
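
As an illustration of the kind of design-space exploration described in this abstract, the sketch below (hypothetical, not the authors' code) quantizes a weight tensor at several precision levels and rounding schemes, records a memory-accuracy pair per candidate, and keeps only the Pareto-optimal candidates. The helper names (quantize, dominates, pareto_front), the toy layer size, and the placeholder accuracy metric are assumptions for illustration.

```python
# Minimal sketch (hypothetical, not the authors' code) of the exploration the
# Q-SpiNN abstract describes: quantize a parameter tensor at several precision
# levels and rounding schemes, then keep the Pareto-optimal candidates in the
# memory-accuracy space.
import numpy as np

def quantize(x, n_bits, rounding="nearest"):
    """Uniform signed fixed-point quantization of a float tensor to n_bits."""
    scale = max(np.max(np.abs(x)) / (2 ** (n_bits - 1) - 1), 1e-12)
    q = x / scale
    if rounding == "nearest":
        q = np.round(q)
    elif rounding == "truncate":
        q = np.trunc(q)
    elif rounding == "stochastic":
        q = np.floor(q + np.random.rand(*q.shape))
    return q * scale

def dominates(a, b):
    """a dominates b if it is no worse in both objectives and strictly better in one."""
    return (a["memory"] <= b["memory"] and a["accuracy"] >= b["accuracy"]
            and (a["memory"] < b["memory"] or a["accuracy"] > b["accuracy"]))

def pareto_front(candidates):
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]

weights = np.random.randn(784, 400)  # toy weight matrix standing in for an SNN layer
candidates = []
for bits in (4, 8, 16):
    for mode in ("nearest", "truncate", "stochastic"):
        w_q = quantize(weights, bits, mode)
        candidates.append({
            "bits": bits,
            "rounding": mode,
            "memory": weights.size * bits / 8,                  # bytes for the weights
            "accuracy": 1.0 - np.mean(np.abs(w_q - weights)),   # placeholder for a real evaluation
        })
print(pareto_front(candidates))
```

A real exploration would replace the placeholder accuracy metric with an evaluation of the quantized SNN on a validation set and would also sweep the membrane-potential precision alongside the weights.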
FSpiNN: An Optimization Framework for Memory- and Energy-Efficient Spiking Neural Networks
Spiking Neural Networks (SNNs) are gaining interest due to their event-driven
processing, which potentially enables low-power/energy computation on hardware
platforms, and their unsupervised learning capability through the
spike-timing-dependent plasticity (STDP) rule. However, state-of-the-art SNNs
require a large memory footprint to achieve high accuracy, making them
difficult to deploy on embedded systems such as battery-powered
mobile devices and IoT Edge nodes. To address this, we propose FSpiNN, an
optimization framework for obtaining memory- and energy-efficient SNNs for
training and inference processing, with unsupervised learning capability, while
maintaining accuracy. This is achieved by (1) reducing the computational
requirements of neuronal and STDP operations, (2) improving the accuracy of
STDP-based learning, (3) compressing the SNN through fixed-point
quantization, and (4) incorporating the memory and energy requirements in the
optimization process. FSpiNN reduces the computational requirements by reducing
the number of neuronal operations, the STDP-based synaptic weight updates, and
the STDP complexity. To improve the accuracy of learning, FSpiNN employs
timestep-based synaptic weight updates, and adaptively determines the STDP
potentiation factor and the effective inhibition strength. The experimental
results show that, compared to the state-of-the-art work, FSpiNN achieves
7.5x memory saving and improves energy efficiency by 3.5x on average for
training and by 1.8x on average for inference across the MNIST and Fashion MNIST
datasets, with no accuracy loss for a network with 4900 excitatory neurons,
thereby enabling energy-efficient SNNs for edge devices/embedded systems.
Comment: To appear in the IEEE Transactions on Computer-Aided Design of
Integrated Circuits and Systems (IEEE-TCAD), as part of the ESWEEK-TCAD
Special Issue, September 2020.
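
To make the fixed-point compression concrete, the following is a minimal sketch, not FSpiNN's implementation, of a leaky integrate-and-fire (LIF) layer whose weights and membrane potentials are stored and updated as integer fixed-point values rather than 32-bit floats. The Q-format (6 fractional bits), layer sizes, LIF constants, and helper names (to_fixed, lif_step) are illustrative assumptions.

```python
# Illustrative sketch (assumptions, not FSpiNN's implementation): a leaky
# integrate-and-fire (LIF) layer whose weights and membrane potentials are
# stored and updated as integer fixed-point values rather than float32.
import numpy as np

FRAC_BITS = 6                 # assumed number of fractional bits (Qm.6 format)
SCALE = 1 << FRAC_BITS

def to_fixed(x):
    """Encode float values as integers on the fixed-point grid."""
    return np.round(np.asarray(x) * SCALE).astype(np.int16)

def lif_step(v_fx, w_fx, spikes_in, decay_fx, v_thresh_fx):
    """One timestep of a LIF layer using only integer arithmetic."""
    # Leak: multiply by the decay factor, then rescale back to the fixed-point grid.
    v_fx = (v_fx * decay_fx) >> FRAC_BITS
    # Integrate the weighted input spikes (spikes are 0/1, so this is a masked column sum).
    v_fx = v_fx + w_fx[:, spikes_in.astype(bool)].sum(axis=1)
    # Fire when the potential crosses the threshold, then reset.
    spikes_out = v_fx >= v_thresh_fx
    v_fx = np.where(spikes_out, 0, v_fx)
    return v_fx, spikes_out

n_in, n_out = 784, 400        # illustrative layer sizes (e.g., MNIST input to excitatory layer)
w_fx = to_fixed(np.random.rand(n_out, n_in) * 0.3)
v_fx = np.zeros(n_out, dtype=np.int32)
decay_fx = int(to_fixed(0.95))
v_thresh_fx = int(to_fixed(20.0))

spikes_in = (np.random.rand(n_in) < 0.05).astype(np.int8)
v_fx, spikes_out = lif_step(v_fx, w_fx, spikes_in, decay_fx, v_thresh_fx)

# Rough footprint of the weight matrix alone: float32 vs. 16-bit fixed-point storage.
print("float32 weights:", n_in * n_out * 4, "bytes; int16 fixed-point:", n_in * n_out * 2, "bytes")
```

Storing the weight matrix in 16-bit fixed-point instead of float32 already halves its footprint; narrower bit-widths, as explored by both frameworks above, shrink it further at the cost of accuracy, which is precisely the memory-accuracy trade-off these works optimize.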