The reservoir computing neural network architecture is widely used to test
hardware systems for neuromorphic computing. One of the preferred tasks for
benchmarking such devices is automatic speech recognition. However, this task
requires acoustic transformations that convert sound waveforms of varying amplitude
into frequency-domain maps, which can themselves be seen as feature extraction
techniques. Depending on the conversion method, these transformations may obscure
the contribution of the
neuromorphic hardware to the overall speech recognition performance. Here, we
quantify and separate the contributions of the acoustic transformations and the
neuromorphic hardware to the speech recognition success rate. We show that the
non-linearity in the acoustic transformation plays a critical role in feature
extraction. We compute the gain in word success rate provided by a reservoir
computing device compared to the acoustic transformation only, and show that it
is an appropriate benchmark for comparing different hardware. Finally, we
experimentally and numerically quantify the impact of the different acoustic
transformations for neuromorphic hardware based on magnetic nano-oscillators.

Comment: 13 pages, 5 figures