2,084 research outputs found

    Reverberant Audio Source Separation via Sparse and Low-Rank Modeling

    Get PDF
    The performance of audio source separation from underdetermined convolutive mixture assuming known mixing filters can be significantly improved by using an analysis sparse prior optimized by a reweighting l1 scheme and a wideband datafidelity term, as demonstrated by a recent article. In this letter, we show that the performance can be improved even more significantly by exploiting a low-rank prior on the source spectrograms.We present a new algorithm to estimate the sources based on i) an analysis sparse prior, ii) a reweighting scheme so as to increase the sparsity, iii) a wideband data-fidelity term in a constrained form, and iv) a low-rank constraint on the source spectrograms. Evaluation on reverberant music mixtures shows that the resulting algorithm improves state-of-the-art methods by more than 2 dB of signal-to-distortion ratio

    Multichannel Speech Separation and Enhancement Using the Convolutive Transfer Function

    Get PDF
    This paper addresses the problem of speech separation and enhancement from multichannel convolutive and noisy mixtures, \emph{assuming known mixing filters}. We propose to perform the speech separation and enhancement task in the short-time Fourier transform domain, using the convolutive transfer function (CTF) approximation. Compared to time-domain filters, CTF has much less taps, consequently it has less near-common zeros among channels and less computational complexity. The work proposes three speech-source recovery methods, namely: i) the multichannel inverse filtering method, i.e. the multiple input/output inverse theorem (MINT), is exploited in the CTF domain, and for the multi-source case, ii) a beamforming-like multichannel inverse filtering method applying single source MINT and using power minimization, which is suitable whenever the source CTFs are not all known, and iii) a constrained Lasso method, where the sources are recovered by minimizing the ℓ1\ell_1-norm to impose their spectral sparsity, with the constraint that the ℓ2\ell_2-norm fitting cost, between the microphone signals and the mixing model involving the unknown source signals, is less than a tolerance. The noise can be reduced by setting a tolerance onto the noise power. Experiments under various acoustic conditions are carried out to evaluate the three proposed methods. The comparison between them as well as with the baseline methods is presented.Comment: Submitted to IEEE/ACM Transactions on Audio, Speech and Language Processin

    Compressive Source Separation: Theory and Methods for Hyperspectral Imaging

    Get PDF
    With the development of numbers of high resolution data acquisition systems and the global requirement to lower the energy consumption, the development of efficient sensing techniques becomes critical. Recently, Compressed Sampling (CS) techniques, which exploit the sparsity of signals, have allowed to reconstruct signal and images with less measurements than the traditional Nyquist sensing approach. However, multichannel signals like Hyperspectral images (HSI) have additional structures, like inter-channel correlations, that are not taken into account in the classical CS scheme. In this paper we exploit the linear mixture of sources model, that is the assumption that the multichannel signal is composed of a linear combination of sources, each of them having its own spectral signature, and propose new sampling schemes exploiting this model to considerably decrease the number of measurements needed for the acquisition and source separation. Moreover, we give theoretical lower bounds on the number of measurements required to perform reconstruction of both the multichannel signal and its sources. We also proposed optimization algorithms and extensive experimentation on our target application which is HSI, and show that our approach recovers HSI with far less measurements and computational effort than traditional CS approaches.Comment: 32 page

    A Primal-Dual Proximal Algorithm for Sparse Template-Based Adaptive Filtering: Application to Seismic Multiple Removal

    Get PDF
    Unveiling meaningful geophysical information from seismic data requires to deal with both random and structured "noises". As their amplitude may be greater than signals of interest (primaries), additional prior information is especially important in performing efficient signal separation. We address here the problem of multiple reflections, caused by wave-field bouncing between layers. Since only approximate models of these phenomena are available, we propose a flexible framework for time-varying adaptive filtering of seismic signals, using sparse representations, based on inaccurate templates. We recast the joint estimation of adaptive filters and primaries in a new convex variational formulation. This approach allows us to incorporate plausible knowledge about noise statistics, data sparsity and slow filter variation in parsimony-promoting wavelet frames. The designed primal-dual algorithm solves a constrained minimization problem that alleviates standard regularization issues in finding hyperparameters. The approach demonstrates significantly good performance in low signal-to-noise ratio conditions, both for simulated and real field seismic data

    Structured Sparsity Models for Multiparty Speech Recovery from Reverberant Recordings

    Get PDF
    We tackle the multi-party speech recovery problem through modeling the acoustic of the reverberant chambers. Our approach exploits structured sparsity models to perform room modeling and speech recovery. We propose a scheme for characterizing the room acoustic from the unknown competing speech sources relying on localization of the early images of the speakers by sparse approximation of the spatial spectra of the virtual sources in a free-space model. The images are then clustered exploiting the low-rank structure of the spectro-temporal components belonging to each source. This enables us to identify the early support of the room impulse response function and its unique map to the room geometry. To further tackle the ambiguity of the reflection ratios, we propose a novel formulation of the reverberation model and estimate the absorption coefficients through a convex optimization exploiting joint sparsity model formulated upon spatio-spectral sparsity of concurrent speech representation. The acoustic parameters are then incorporated for separating individual speech signals through either structured sparse recovery or inverse filtering the acoustic channels. The experiments conducted on real data recordings demonstrate the effectiveness of the proposed approach for multi-party speech recovery and recognition.Comment: 31 page

    Non-smooth Non-convex Bregman Minimization: Unification and new Algorithms

    Full text link
    We propose a unifying algorithm for non-smooth non-convex optimization. The algorithm approximates the objective function by a convex model function and finds an approximate (Bregman) proximal point of the convex model. This approximate minimizer of the model function yields a descent direction, along which the next iterate is found. Complemented with an Armijo-like line search strategy, we obtain a flexible algorithm for which we prove (subsequential) convergence to a stationary point under weak assumptions on the growth of the model function error. Special instances of the algorithm with a Euclidean distance function are, for example, Gradient Descent, Forward--Backward Splitting, ProxDescent, without the common requirement of a "Lipschitz continuous gradient". In addition, we consider a broad class of Bregman distance functions (generated by Legendre functions) replacing the Euclidean distance. The algorithm has a wide range of applications including many linear and non-linear inverse problems in signal/image processing and machine learning
    • 

    corecore