An adaptive stereo basis method for convolutive blind audio source separation
NOTICE: this is the author’s version of a work that was accepted for publication in Neurocomputing. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in PUBLICATION, [71, 10-12, June 2008] DOI:neucom.2007.08.02
Reverberant Audio Source Separation via Sparse and Low-Rank Modeling
The performance of audio source separation from underdetermined convolutive
mixtures, assuming known mixing filters, can be significantly improved by using an
analysis sparse prior optimized by a reweighting l1 scheme and a wideband
data-fidelity term, as demonstrated by a recent article. In this letter, we show
that the performance can be improved even more significantly by exploiting a
low-rank prior on the source spectrograms. We present a new algorithm to
estimate the sources based on i) an analysis sparse prior, ii) a reweighting
scheme so as to increase the sparsity, iii) a wideband data-fidelity term in a
constrained form, and iv) a low-rank constraint on the source spectrograms.
Evaluation on reverberant music mixtures shows that the resulting algorithm
improves on state-of-the-art methods by more than 2 dB of signal-to-distortion
ratio.
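
The three regularization ingredients named in this abstract (a sparse prior, a reweighting step, and a low-rank prior on spectrograms) are commonly handled with simple proximal operators. The following is a minimal sketch of those building blocks only, not the authors' algorithm: it assumes real-valued (magnitude) spectrograms, and the function names, `eps`, `tau`, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1; tau may be a scalar or per-entry array."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def reweight(x, eps=1e-3):
    """Reweighted-l1 weights: entries that are currently small get large weights."""
    return 1.0 / (np.abs(x) + eps)

def singular_value_threshold(S, tau):
    """Proximal operator of tau * nuclear norm: shrink the singular values of S."""
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

# Toy data (assumption): a rank-2 "spectrogram" plus a small perturbation.
rng = np.random.default_rng(0)
rank2 = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 10))
noisy = rank2 + 0.01 * rng.standard_normal((8, 10))

# Low-rank step: small singular values caused by the perturbation are removed.
low_rank = singular_value_threshold(noisy, tau=0.5)

# Sparse step with reweighting: thresholds scale with the current weights.
coeffs = np.array([3.0, -0.5, 1.0])
weights = reweight(coeffs)
sparse = soft_threshold(coeffs, 0.1 * weights)
```

In a full solver these operators would be applied repeatedly inside an iterative scheme together with the data-fidelity constraint, which this sketch leaves out.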
Structured Sparsity Models for Multiparty Speech Recovery from Reverberant Recordings
We tackle the multi-party speech recovery problem through modeling the
acoustics of reverberant chambers. Our approach exploits structured sparsity
models to perform room modeling and speech recovery. We propose a scheme for
characterizing the room acoustics from the unknown competing speech sources
relying on localization of the early images of the speakers by sparse
approximation of the spatial spectra of the virtual sources in a free-space
model. The images are then clustered exploiting the low-rank structure of the
spectro-temporal components belonging to each source. This enables us to
identify the early support of the room impulse response function and its unique
map to the room geometry. To further tackle the ambiguity of the reflection
ratios, we propose a novel formulation of the reverberation model and estimate
the absorption coefficients through a convex optimization exploiting a joint
sparsity model formulated upon the spatio-spectral sparsity of the concurrent
speech representation. The acoustic parameters are then incorporated for separating
individual speech signals through either structured sparse recovery or inverse
filtering of the acoustic channels. The experiments conducted on real data
recordings demonstrate the effectiveness of the proposed approach for
multi-party speech recovery and recognition. Comment: 31 pages
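
Joint (group) sparsity of the kind mentioned in this abstract is often enforced with a row-wise shrinkage operator, the proximal map of the mixed l2,1 norm. The sketch below illustrates that generic operator under stated assumptions; it is not the formulation used in the paper, and the function name, `tau`, and the example matrix are hypothetical.

```python
import numpy as np

def group_soft_threshold(X, tau):
    """Prox of tau * sum_i ||X[i, :]||_2: shrink each row's norm, zeroing weak rows."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return X * scale

# Rows stand for candidate source locations, columns for frequency bins (assumption).
X = np.array([[3.0, 4.0],
              [0.1, 0.1],
              [0.0, 0.0]])
Y = group_soft_threshold(X, tau=1.0)
```

Here the first row has norm 5 and is scaled by 0.8, while the weak second row and the empty third row are set to zero, so a location is either kept across all columns or discarded as a whole.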