2,084 research outputs found
Reverberant Audio Source Separation via Sparse and Low-Rank Modeling
The performance of audio source separation from underdetermined convolutive
mixture assuming known mixing filters can be significantly improved by using an
analysis sparse prior optimized by a reweighting l1 scheme and a wideband
datafidelity term, as demonstrated by a recent article. In this letter, we show
that the performance can be improved even more significantly by exploiting a
low-rank prior on the source spectrograms.We present a new algorithm to
estimate the sources based on i) an analysis sparse prior, ii) a reweighting
scheme so as to increase the sparsity, iii) a wideband data-fidelity term in a
constrained form, and iv) a low-rank constraint on the source spectrograms.
Evaluation on reverberant music mixtures shows that the resulting algorithm
improves state-of-the-art methods by more than 2 dB of signal-to-distortion
ratio
Multichannel Speech Separation and Enhancement Using the Convolutive Transfer Function
This paper addresses the problem of speech separation and enhancement from
multichannel convolutive and noisy mixtures, \emph{assuming known mixing
filters}. We propose to perform the speech separation and enhancement task in
the short-time Fourier transform domain, using the convolutive transfer
function (CTF) approximation. Compared to time-domain filters, CTF has much
less taps, consequently it has less near-common zeros among channels and less
computational complexity. The work proposes three speech-source recovery
methods, namely: i) the multichannel inverse filtering method, i.e. the
multiple input/output inverse theorem (MINT), is exploited in the CTF domain,
and for the multi-source case, ii) a beamforming-like multichannel inverse
filtering method applying single source MINT and using power minimization,
which is suitable whenever the source CTFs are not all known, and iii) a
constrained Lasso method, where the sources are recovered by minimizing the
-norm to impose their spectral sparsity, with the constraint that the
-norm fitting cost, between the microphone signals and the mixing model
involving the unknown source signals, is less than a tolerance. The noise can
be reduced by setting a tolerance onto the noise power. Experiments under
various acoustic conditions are carried out to evaluate the three proposed
methods. The comparison between them as well as with the baseline methods is
presented.Comment: Submitted to IEEE/ACM Transactions on Audio, Speech and Language
Processin
Compressive Source Separation: Theory and Methods for Hyperspectral Imaging
With the development of numbers of high resolution data acquisition systems
and the global requirement to lower the energy consumption, the development of
efficient sensing techniques becomes critical. Recently, Compressed Sampling
(CS) techniques, which exploit the sparsity of signals, have allowed to
reconstruct signal and images with less measurements than the traditional
Nyquist sensing approach. However, multichannel signals like Hyperspectral
images (HSI) have additional structures, like inter-channel correlations, that
are not taken into account in the classical CS scheme. In this paper we exploit
the linear mixture of sources model, that is the assumption that the
multichannel signal is composed of a linear combination of sources, each of
them having its own spectral signature, and propose new sampling schemes
exploiting this model to considerably decrease the number of measurements
needed for the acquisition and source separation. Moreover, we give theoretical
lower bounds on the number of measurements required to perform reconstruction
of both the multichannel signal and its sources. We also proposed optimization
algorithms and extensive experimentation on our target application which is
HSI, and show that our approach recovers HSI with far less measurements and
computational effort than traditional CS approaches.Comment: 32 page
A Primal-Dual Proximal Algorithm for Sparse Template-Based Adaptive Filtering: Application to Seismic Multiple Removal
Unveiling meaningful geophysical information from seismic data requires to
deal with both random and structured "noises". As their amplitude may be
greater than signals of interest (primaries), additional prior information is
especially important in performing efficient signal separation. We address here
the problem of multiple reflections, caused by wave-field bouncing between
layers. Since only approximate models of these phenomena are available, we
propose a flexible framework for time-varying adaptive filtering of seismic
signals, using sparse representations, based on inaccurate templates. We recast
the joint estimation of adaptive filters and primaries in a new convex
variational formulation. This approach allows us to incorporate plausible
knowledge about noise statistics, data sparsity and slow filter variation in
parsimony-promoting wavelet frames. The designed primal-dual algorithm solves a
constrained minimization problem that alleviates standard regularization issues
in finding hyperparameters. The approach demonstrates significantly good
performance in low signal-to-noise ratio conditions, both for simulated and
real field seismic data
Structured Sparsity Models for Multiparty Speech Recovery from Reverberant Recordings
We tackle the multi-party speech recovery problem through modeling the
acoustic of the reverberant chambers. Our approach exploits structured sparsity
models to perform room modeling and speech recovery. We propose a scheme for
characterizing the room acoustic from the unknown competing speech sources
relying on localization of the early images of the speakers by sparse
approximation of the spatial spectra of the virtual sources in a free-space
model. The images are then clustered exploiting the low-rank structure of the
spectro-temporal components belonging to each source. This enables us to
identify the early support of the room impulse response function and its unique
map to the room geometry. To further tackle the ambiguity of the reflection
ratios, we propose a novel formulation of the reverberation model and estimate
the absorption coefficients through a convex optimization exploiting joint
sparsity model formulated upon spatio-spectral sparsity of concurrent speech
representation. The acoustic parameters are then incorporated for separating
individual speech signals through either structured sparse recovery or inverse
filtering the acoustic channels. The experiments conducted on real data
recordings demonstrate the effectiveness of the proposed approach for
multi-party speech recovery and recognition.Comment: 31 page
Non-smooth Non-convex Bregman Minimization: Unification and new Algorithms
We propose a unifying algorithm for non-smooth non-convex optimization. The
algorithm approximates the objective function by a convex model function and
finds an approximate (Bregman) proximal point of the convex model. This
approximate minimizer of the model function yields a descent direction, along
which the next iterate is found. Complemented with an Armijo-like line search
strategy, we obtain a flexible algorithm for which we prove (subsequential)
convergence to a stationary point under weak assumptions on the growth of the
model function error. Special instances of the algorithm with a Euclidean
distance function are, for example, Gradient Descent, Forward--Backward
Splitting, ProxDescent, without the common requirement of a "Lipschitz
continuous gradient". In addition, we consider a broad class of Bregman
distance functions (generated by Legendre functions) replacing the Euclidean
distance. The algorithm has a wide range of applications including many linear
and non-linear inverse problems in signal/image processing and machine
learning
- âŠ