Deep Ad-hoc Beamforming
Far-field speech processing is an important and challenging problem. In this
paper, we propose \textit{deep ad-hoc beamforming}, a deep-learning-based
multichannel speech enhancement framework based on ad-hoc microphone arrays, to
address the problem. It contains three novel components. First, it combines
\textit{ad-hoc microphone arrays} with deep-learning-based multichannel speech
enhancement, which significantly reduces the probability that far-field
acoustic environments occur. Second, it groups the microphones around
the speech source into a local microphone array using a supervised channel selection
framework based on deep neural networks. Third, it develops a simple time
synchronization framework to synchronize the channels that have different time
delays. In addition to these novelties, the proposed model is
trained in a single-channel fashion, so that it can easily adopt new
developments in speech processing techniques. Its test stage is also flexible in
incorporating any number of microphones without retraining or modifying the
framework. We have developed many implementations of the proposed framework and
conducted an extensive experiment in scenarios where the speech sources are
in the far field, at random locations unknown to the microphones. Results on
speech enhancement tasks show that our method outperforms its counterpart that
works with linear microphone arrays by a considerable margin in both
diffuse-noise and point-source-noise reverberant environments.
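
As a rough illustration of the second and third components, the sketch below picks the highest-scoring channels and then aligns one channel to a reference. It is not the authors' implementation: the abstract does not say how channels are scored or how delays are estimated. Here the per-channel quality scores are assumed to be given (in the paper they would come from a deep neural network), and the delay is estimated with plain cross-correlation as a stand-in technique. The function names `select_channels`, `estimate_delay`, and `align` are hypothetical.

```python
import numpy as np


def select_channels(scores, k):
    """Return the indices of the k highest-scoring channels, best first.

    `scores` is an assumed per-channel quality estimate (higher is better).
    """
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")
    return order[:k]


def estimate_delay(ref, sig):
    """Estimate the integer delay (in samples) of `sig` relative to `ref`.

    Uses the peak of the full cross-correlation; a positive result means
    `sig` lags behind `ref`. This is a stand-in, not the paper's method.
    """
    corr = np.correlate(sig, ref, mode="full")
    # Index i of the full correlation corresponds to lag i - (len(ref) - 1).
    return int(np.argmax(corr)) - (len(ref) - 1)


def align(sig, delay):
    """Undo an integer delay by shifting `sig`, zero-padding the gap."""
    out = np.zeros_like(sig)
    if delay > 0:
        out[:-delay] = sig[delay:]
    elif delay < 0:
        out[-delay:] = sig[:delay]
    else:
        out[:] = sig
    return out
```

In use, one would score every microphone, keep the top `k` with `select_channels`, take the best channel as the reference, and pass each remaining selected channel through `estimate_delay` and `align` before enhancement. Because selection works on a score vector of any length, the number of microphones can change at test time without altering the code, which mirrors the flexibility the abstract describes.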