Recent advances in Voice Activity Detection (VAD) are driven by artificial
and Recurrent Neural Networks (RNNs). However, using a VAD system in
battery-operated devices requires greater power efficiency. This can be
achieved by neuromorphic hardware, which enables Spiking Neural Networks (SNNs)
to perform inference at very low energy consumption. Spiking networks are
characterized by their ability to process information efficiently, in a sparse
cascade of binary events in time called spikes. However, a big performance gap
separates artificial from spiking networks, mostly due to a lack of powerful
SNN training algorithms. To overcome this problem, we exploit an SNN model that
can be recast into an RNN-like model and trained with known deep learning
techniques. We describe an SNN training procedure that achieves low spiking
activity, and pruning algorithms that remove 85% of the network connections with
no performance loss. The model achieves state-of-the-art performance at a
fraction of the power consumption of other methods.

Comment: 5 pages, 2 figures, 2 tables
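
As a rough illustration of the two ideas named above, the following minimal sketch shows (a) a spiking layer written as a discrete-time recurrence, which is what makes an RNN-style treatment possible, and (b) pruning to 85% sparsity. It is not the paper's implementation: the leaky integrate-and-fire dynamics, the soft reset, the parameters `beta` and `threshold`, and the use of simple magnitude pruning are all assumptions chosen for illustration, and the function names are hypothetical. Training is omitted; in practice the non-differentiable threshold is commonly handled with a surrogate gradient.

```python
import numpy as np


def lif_forward(inputs, w, beta=0.9, threshold=1.0):
    """Run one layer of leaky integrate-and-fire neurons over time.

    inputs: array of shape (T, n_in) with input spikes or currents.
    w:      array of shape (n_in, n_out) with synaptic weights.
    Returns an array of shape (T, n_out) holding binary spikes.
    All dynamics here are an illustrative assumption, not the paper's model.
    """
    n_steps = inputs.shape[0]
    n_out = w.shape[1]
    v = np.zeros(n_out)                      # membrane potential (the hidden state)
    spikes = np.zeros((n_steps, n_out))
    for t in range(n_steps):
        v = beta * v + inputs[t] @ w         # leaky integration, like an RNN update
        s = (v >= threshold).astype(float)   # binary spike where threshold is reached
        v = v - s * threshold                # soft reset of neurons that spiked
        spikes[t] = s
    return spikes


def magnitude_prune(w, sparsity=0.85):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude.

    Plain magnitude pruning, used here as a stand-in; the source does not
    say which pruning algorithms it uses.
    """
    k = int(round(sparsity * w.size))        # number of weights to remove
    if k == 0:
        return w.copy()
    flat = np.abs(w).ravel()
    drop = np.argpartition(flat, k - 1)[:k]  # indices of the k smallest magnitudes
    mask = np.ones(w.size, dtype=bool)
    mask[drop] = False
    return w * mask.reshape(w.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = rng.normal(size=(10, 10))
    pruned = magnitude_prune(weights, sparsity=0.85)
    x = (rng.random((50, 10)) < 0.2).astype(float)  # sparse random input spikes
    out = lif_forward(x, pruned)
    print("remaining connections:", np.count_nonzero(pruned))
    print("mean output spike rate:", out.mean())
```

The membrane potential plays the role of an RNN hidden state, and the spike rate printed at the end is one simple proxy for the spiking activity that a low-power deployment would aim to keep small.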