Semi-Supervised Sound Source Localization Based on Manifold Regularization
Conventional speaker localization algorithms, based merely on the received
microphone signals, are often sensitive to adverse conditions, such as high
reverberation or low signal-to-noise ratio (SNR). In some scenarios, e.g. in
meeting rooms or cars, it can be assumed that the source position is confined
to a predefined area, and the acoustic parameters of the environment are
approximately fixed. Such scenarios give rise to the assumption that the
acoustic samples from the region of interest have a distinct geometrical
structure. In this paper, we show that the high dimensional acoustic samples
indeed lie on a low dimensional manifold and can be embedded into a low
dimensional space. Motivated by this result, we propose a semi-supervised
source localization algorithm which recovers the inverse mapping between the
acoustic samples and their corresponding locations. The idea is to use an
optimization framework based on manifold regularization that imposes
smoothness constraints on possible solutions with respect to the manifold. The
proposed algorithm, termed Manifold Regularization for Localization (MRL), is
implemented in an adaptive manner. Initialization uses only a
few labelled samples paired with their respective source locations, and then
the system is gradually adapted as new unlabelled samples (with unknown source
locations) are received. Experimental results show superior localization
performance when compared with a recently presented algorithm based on a
manifold learning approach and with the generalized cross-correlation (GCC)
algorithm as a baseline.
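The manifold-regularization idea in the abstract above can be illustrated with a minimal batch sketch. It uses the standard Laplacian-regularized least squares closed form with a Gaussian kernel, which is an assumption made here for illustration: the abstract does not give the MRL update equations, and this sketch is not adaptive. The names `fit_laprls` and `predict` and the parameters `sigma`, `gamma_a` and `gamma_i` are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    # Pairwise Gaussian affinities between the rows of A and the rows of B.
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def fit_laprls(X_lab, y_lab, X_unl, sigma=0.5, gamma_a=1e-3, gamma_i=1e-2):
    """Batch Laplacian-regularized least squares (illustrative stand-in).

    X_lab, y_lab: the few labelled acoustic samples and their positions.
    X_unl: unlabelled samples, used only to shape the manifold penalty.
    """
    X = np.vstack([X_lab, X_unl])
    n, l = len(X), len(X_lab)
    K = gaussian_kernel(X, X, sigma)
    # Graph Laplacian built from the same affinities (an assumption).
    W = K.copy()
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W
    # J selects the labelled samples; y is zero-padded for unlabelled ones.
    J = np.zeros((n, n))
    J[np.arange(l), np.arange(l)] = 1.0
    y = np.zeros(n)
    y[:l] = y_lab
    # Closed form: alpha = (J K + gamma_a l I + gamma_i l / n^2 L K)^-1 y
    A = J @ K + gamma_a * l * np.eye(n) + (gamma_i * l / n**2) * (L @ K)
    alpha = np.linalg.solve(A, y)
    return X, alpha, sigma

def predict(model, X_new):
    # Evaluate the learned mapping from acoustic sample to position.
    X, alpha, sigma = model
    return gaussian_kernel(X_new, X, sigma) @ alpha
```

In this sketch the unlabelled samples enter only through the graph Laplacian `L`, which penalizes solutions that vary quickly between nearby samples.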
Spatial Diffuseness Features for DNN-Based Speech Recognition in Noisy and Reverberant Environments
We propose a spatial diffuseness feature for deep neural network (DNN)-based
automatic speech recognition to improve recognition accuracy in reverberant and
noisy environments. The feature is computed in real-time from multiple
microphone signals without requiring knowledge or estimation of the direction
of arrival, and represents the relative amount of diffuse noise in each time
and frequency bin. It is shown that using the diffuseness feature as an
additional input to a DNN-based acoustic model leads to a reduced word error
rate for the REVERB challenge corpus, compared both to logmelspec features
extracted from noisy signals and to features enhanced by spectral subtraction.
Comment: accepted for ICASSP201
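A per-bin diffuseness value computed causally from two microphone signals, without any direction-of-arrival estimate, might look roughly like the sketch below. The abstract does not spell out the estimator, so this uses a much cruder proxy chosen here for illustration: one minus the magnitude of the recursively averaged spatial coherence. The function names, the frame settings and the `smoothing` constant are assumptions.

```python
import numpy as np

def stft(x, frame_len=512, hop=256):
    # Hann-windowed short-time Fourier transform, shape (n_frames, n_bins).
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def coherence_diffuseness(x1, x2, smoothing=0.9, eps=1e-12):
    """Crude diffuseness proxy in [0, 1] for each time-frequency bin.

    Near 0: the two channels are strongly coherent (directional sound).
    Near 1: the two channels are weakly coherent (diffuse noise).
    """
    X1, X2 = stft(x1), stft(x2)
    n_frames, n_bins = X1.shape
    p11 = np.zeros(n_bins)
    p22 = np.zeros(n_bins)
    p12 = np.zeros(n_bins, dtype=complex)
    out = np.empty((n_frames, n_bins))
    for t in range(n_frames):
        # Causal recursive averaging of auto- and cross-power spectra.
        p11 = smoothing * p11 + (1 - smoothing) * np.abs(X1[t]) ** 2
        p22 = smoothing * p22 + (1 - smoothing) * np.abs(X2[t]) ** 2
        p12 = smoothing * p12 + (1 - smoothing) * X1[t] * np.conj(X2[t])
        coherence = np.abs(p12) / np.sqrt(p11 * p22 + eps)
        out[t] = 1.0 - np.clip(coherence, 0.0, 1.0)
    return out
```

The resulting array has one value per frame and frequency bin, so it could be appended to the acoustic model's input features frame by frame.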
Speech Dereverberation Based on Integrated Deep and Ensemble Learning Algorithm
Reverberation, which is generally caused by sound reflections from walls,
ceilings, and floors, can result in severe performance degradation of acoustic
applications. Due to a complicated combination of attenuation and time-delay
effects, the reverberation property is difficult to characterize, and it
remains a challenging task to effectively retrieve the anechoic speech signals
from reverberant ones. In the present study, we propose a novel integrated
deep and ensemble learning algorithm (IDEA) for speech dereverberation. The
IDEA consists of offline and online phases. In the offline phase, we train
multiple dereverberation models, each aiming to precisely dereverberate speech
signals in a particular acoustic environment; then a unified fusion function is
estimated that aims to integrate the information of multiple dereverberation
models. In the online phase, an input utterance is first processed by each of
the dereverberation models. The outputs of all models are integrated
accordingly to generate the final anechoic signal. We evaluated the IDEA on
designed acoustic environments, including both matched and mismatched
conditions of the training and testing data. Experimental results confirm that
the proposed IDEA outperforms a single deep-neural-network-based dereverberation
model with the same model architecture and training data.
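The offline and online phases described in the abstract above can be sketched as follows. To keep the example small, linear least-squares maps stand in for both the deep dereverberation models and the fusion function. This is a substitution made here for illustration, not the architecture used in the paper, and the names `train_offline` and `dereverb_online` are hypothetical.

```python
import numpy as np

def fit_linear(X, Y):
    # Least-squares map from X to Y (stand-in for a trained network).
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

def train_offline(env_data):
    """Offline phase.

    env_data: list of (reverberant, clean) feature matrices, one pair
    per acoustic environment, each of shape (n_frames, n_features).
    """
    # One dereverberation model per environment.
    models = [fit_linear(rev, clean) for rev, clean in env_data]
    # Fusion function fitted on the pooled outputs of all models.
    rev_all = np.vstack([rev for rev, _ in env_data])
    clean_all = np.vstack([clean for _, clean in env_data])
    stacked = np.hstack([rev_all @ W for W in models])
    fusion = fit_linear(stacked, clean_all)
    return models, fusion

def dereverb_online(rev, models, fusion):
    # Online phase: run every model on the input, then fuse the outputs.
    stacked = np.hstack([rev @ W for W in models])
    return stacked @ fusion
```

Because the fusion step here is itself a least-squares fit over the concatenated model outputs, its error on the pooled training data cannot exceed that of any single environment model; nothing in the sketch says how that carries over to unseen conditions.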