A Bayesian Network View on Acoustic Model-Based Techniques for Robust Speech Recognition
This article provides a unifying Bayesian network view on various approaches
for acoustic model adaptation, missing feature, and uncertainty decoding that
are well-known in the literature of robust automatic speech recognition. The
representatives of these classes can often be deduced from a Bayesian network
that extends the conventional hidden Markov models used in speech recognition.
These extensions, in turn, can in many cases be motivated by an underlying
observation model that relates clean and distorted feature vectors. By
converting the observation models into a Bayesian network representation, we
formulate the corresponding compensation rules leading to a unified view on
known derivations as well as to new formulations for certain approaches. The
generic Bayesian perspective provided in this contribution thus highlights
structural differences and similarities between the analyzed approaches.
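As an illustration of how an observation model can lead to a compensation rule, consider the simplest textbook case. This is only a sketch under stated assumptions, not the article's derivation: the additive model, the Gaussian distortion term, and all symbols ($\mathbf{x}_t$, $\mathbf{y}_t$, $\mathbf{b}_t$, $q_t$) are introduced here for illustration.

```latex
% Assumed observation model: the distorted feature is the clean feature plus
% an independent Gaussian distortion term.
\mathbf{y}_t = \mathbf{x}_t + \mathbf{b}_t, \qquad
\mathbf{b}_t \sim \mathcal{N}(\boldsymbol{\mu}_b, \boldsymbol{\Sigma}_b)

% Assumed clean-speech HMM state likelihood:
p(\mathbf{x}_t \mid q_t) = \mathcal{N}(\mathbf{x}_t;\, \boldsymbol{\mu}_{q_t}, \boldsymbol{\Sigma}_{q_t})

% Marginalizing the hidden clean feature yields the compensated likelihood:
p(\mathbf{y}_t \mid q_t)
  = \int p(\mathbf{y}_t \mid \mathbf{x}_t)\, p(\mathbf{x}_t \mid q_t)\, \mathrm{d}\mathbf{x}_t
  = \mathcal{N}\!\left(\mathbf{y}_t;\, \boldsymbol{\mu}_{q_t} + \boldsymbol{\mu}_b,\,
      \boldsymbol{\Sigma}_{q_t} + \boldsymbol{\Sigma}_b\right)
```

Under these assumptions the compensation amounts to shifting the state mean and inflating the state covariance, which is the kind of rule that a Bayesian network with an added clean-feature node makes explicit.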
PSD Estimation and Source Separation in a Noisy Reverberant Environment using a Spherical Microphone Array
In this paper, we propose an efficient technique for estimating individual
power spectral density (PSD) components, i.e., PSD of each desired sound source
as well as of noise and reverberation, in a multi-source reverberant sound
scene with coherent background noise. We formulate the problem in the spherical
harmonics domain to take advantage of the inherent orthogonality of the
spherical harmonics basis functions and extract the PSD components from the
cross-correlation between the different sound field modes. We also investigate
an implementation issue that occurs at the nulls of the Bessel functions and
offer an engineering solution. The performance evaluation takes place in a
practical environment with a commercial microphone array in order to measure
the robustness of the proposed algorithm against all the deviations incurred in
practice. We also demonstrate an application of the proposed PSD estimator through
a source separation algorithm and compare its performance with a contemporary
method in terms of different objective measures.
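The abstract does not state which term contains the Bessel function or what the engineering solution is. The following is a minimal, generic sketch of why Bessel nulls cause trouble whenever an estimator divides by a Bessel-valued term, with magnitude flooring as one assumed workaround. The function names and the floor value are hypothetical and not taken from the paper.

```python
import math


def spherical_j0(x):
    """Zeroth-order spherical Bessel function, j_0(x) = sin(x) / x."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def regularized_inverse(b, floor=1e-2):
    """Return 1/b with |b| clamped to at least `floor` (sign preserved).

    Assumption: flooring is used here only as an illustrative fix; the
    paper's actual solution is not described in the abstract.
    """
    magnitude = max(abs(b), floor)
    sign = -1.0 if b < 0 else 1.0
    return sign / magnitude


# x = pi is the first null of j_0, so a naive inversion explodes there.
b_null = spherical_j0(math.pi)
naive_gain = float("inf") if b_null == 0.0 else abs(1.0 / b_null)
safe_gain = abs(regularized_inverse(b_null))

# Away from the null the regularized inverse matches the plain inverse.
b_regular = spherical_j0(1.0)
regular_gain = regularized_inverse(b_regular)
```

Flooring trades a small bias near the nulls for bounded noise amplification; the threshold would need tuning against the measurement noise of the actual array.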