3 research outputs found
Multi-View Networks For Multi-Channel Audio Classification
In this paper we introduce the idea of multi-view networks for sound
classification with multiple sensors. We show how to build a multi-channel
sound recognition model trained on a fixed number of channels and deploy it in
scenarios with an arbitrary (and potentially dynamically changing) number of
input channels without any degradation in performance. We demonstrate that at
inference time the model can safely be given all available channels, as it
ignores noisy information and leverages new information better than standard
baseline approaches. The model is evaluated both in an anechoic environment and
in rooms generated by a room acoustics simulator. We demonstrate that this
model generalizes to unseen numbers of channels as well as unseen room
geometries.
Comment: 5 pages, 7 figures, Accepted to ICASSP 201
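One common way to make a classifier invariant to the number of input channels (a minimal sketch, not necessarily the paper's exact architecture) is to apply a shared feature extractor to each channel independently and then pool across the channel axis, so the same trained weights accept any channel count. The function names and shapes below are illustrative assumptions:

```python
import numpy as np

def per_channel_features(x, W):
    # Shared feature extractor applied to every channel with the same weights
    # x: (n_channels, n_features_in), W: (n_features_in, n_features_out)
    return np.tanh(x @ W)

def pool_channels(h):
    # Mean-pooling over the channel axis makes the embedding
    # independent of how many channels were observed
    return h.mean(axis=0)

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))      # trained once, for any channel count

x3 = rng.normal(size=(3, 8))     # 3-channel input
x5 = rng.normal(size=(5, 8))     # 5-channel input at deployment

z3 = pool_channels(per_channel_features(x3, W))
z5 = pool_channels(per_channel_features(x5, W))
# Both embeddings have the same shape regardless of channel count
```

Because the pooled embedding has a fixed size, any downstream classifier trained on it works unchanged when channels appear or disappear.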
Attention-based distributed speech enhancement for unconstrained microphone arrays with varying number of nodes
Speech enhancement promises higher efficiency in ad-hoc microphone arrays
than in constrained microphone arrays thanks to the wide spatial coverage of
the devices in the acoustic scene. However, speech enhancement in ad-hoc
microphone arrays still raises many challenges. In particular, the algorithms
should be able to handle a variable number of microphones, as some devices in
the array might appear or disappear. In this paper, we propose a solution that
can efficiently process the spatial information captured by the different
devices of the microphone array, while remaining robust to link failures. To do
this, we use an attention mechanism to put more weight on the relevant
signals sent throughout the array and to neglect redundant or empty
channels.
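The channel-weighting idea described above can be sketched as dot-product attention over a variable number of channels: each channel's features are scored against a query, and a softmax turns the scores into weights, so an empty channel contributes little to the fused signal. The toy signals, query, and projection below are illustrative assumptions, not the paper's model:

```python
import numpy as np

def softmax(a):
    # Numerically stable softmax
    e = np.exp(a - a.max())
    return e / e.sum()

def attend_channels(signals, query, W):
    # signals: (n_channels, dim) per-channel features; works for any n_channels
    keys = signals @ W              # project each channel to a key
    scores = keys @ query           # relevance of each channel to the query
    weights = softmax(scores)       # normalized attention weights
    fused = weights @ signals       # weighted combination across channels
    return fused, weights

# Hypothetical toy scene: channel 0 carries the target signal,
# channel 2 is an empty (silent) device
signals = np.array([[5.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 0.0]])
query = np.array([1.0, 0.0, 0.0])
fused, weights = attend_channels(signals, query, np.eye(3))
```

Because the softmax is taken over however many channels are present, the same mechanism applies when a device joins or drops out of the array.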