    Structured Sparsity Models for Multiparty Speech Recovery from Reverberant Recordings

    We tackle the multi-party speech recovery problem by modeling the acoustics of reverberant chambers. Our approach exploits structured sparsity models to perform room modeling and speech recovery. We propose a scheme for characterizing the room acoustics from the unknown competing speech sources, relying on localization of the early images of the speakers by sparse approximation of the spatial spectra of the virtual sources in a free-space model. The images are then clustered by exploiting the low-rank structure of the spectro-temporal components belonging to each source. This enables us to identify the early support of the room impulse response function and its unique map to the room geometry. To further tackle the ambiguity of the reflection ratios, we propose a novel formulation of the reverberation model and estimate the absorption coefficients through a convex optimization exploiting a joint sparsity model formulated upon the spatio-spectral sparsity of the concurrent speech representation. The acoustic parameters are then incorporated for separating individual speech signals through either structured sparse recovery or inverse filtering of the acoustic channels. Experiments conducted on real data recordings demonstrate the effectiveness of the proposed approach for multi-party speech recovery and recognition. Comment: 31 pages

    Blind separation of underdetermined mixtures with additive white and pink noises

    This paper presents an approach for underdetermined blind source separation in the presence of additive Gaussian white noise and pink noise. The proposed approach also applies to separating I + 3 sources from I mixtures corrupted by both kinds of additive noise. This situation is more challenging and closer to practical real-world problems. Moreover, unlike some conventional approaches, no sparsity conditions are imposed. First, the mixing matrix is estimated by an algorithm that combines the short-time Fourier transform with rough-fuzzy clustering. Then, the mixed signals are normalized and the source signals are recovered using a modified Gradient Descent Local Hierarchical Alternating Least Squares algorithm, which takes the mixing matrix obtained in the previous step as input and is initialized by a multiplicative algorithm for matrix factorization based on the alpha divergence. The experiments and simulation results show that the proposed approach can separate I + 3 source signals from I mixed signals, and that it outperforms some conventional approaches on the evaluation measures.

    Exploitation of source nonstationarity in underdetermined blind source separation with advanced clustering techniques

    The problem of blind source separation (BSS) is investigated. Following the assumption that the time-frequency (TF) distributions of the input sources do not overlap, a quadratic TF representation is used to exploit the sparsity of the statistically nonstationary sources. However, separation performance is shown to be limited by the selection of a certain threshold in classifying the eigenvectors of the TF matrices drawn from the observation mixtures. Two methods are therefore proposed, based on recently introduced advanced clustering techniques, namely Gap statistics and self-splitting competitive learning (SSCL), to mitigate the problem of eigenvector classification. The novel integration of these two approaches successfully overcomes the problem of artificial sources induced by insufficient knowledge of the threshold and enables automatic determination of the number of active sources over the observation. The separation performance is thereby greatly improved. Practical consequences of violating the TF orthogonality assumption in the current approach are also studied, which motivates the proposal of a new solution that is robust to violation of orthogonality. In this new method, the TF plane is partitioned into appropriate blocks and source separation is carried out block by block. Numerical experiments with linear chirp signals and Gaussian minimum shift keying (GMSK) signals are included, and they support the improved performance of the proposed approaches.

    Image Decomposition and Separation Using Sparse Representations: An Overview

    This paper gives essential insights into the use of sparsity and morphological diversity in image decomposition and source separation by reviewing our recent work in this field. Morphologically decomposing a signal into its building blocks is an important problem in signal processing and has far-reaching applications in science and technology. Starck proposed a novel decomposition method, morphological component analysis (MCA), based on sparse representation of signals. MCA assumes that each (monochannel) signal is the linear mixture of several layers, the so-called morphological components, that are morphologically distinct, e.g., sines and bumps. The success of this method relies on two tenets: sparsity and morphological diversity. That is, each morphological component is sparsely represented in a specific transform domain, and the latter is highly inefficient in representing the other content in the mixture. Once such transforms are identified, MCA is an iterative thresholding algorithm that is capable of decoupling the signal content. Sparsity and morphological diversity have also been used as a novel and effective source of diversity for blind source separation (BSS), hence extending MCA to multichannel data. Building on these ingredients, we provide an overview of the generalized MCA introduced by the authors as a fast and efficient BSS method. We illustrate the application of these algorithms on several real examples. We conclude our tour by briefly describing our software toolboxes, made available for download on the Internet, for sparse signal and image decomposition and separation.
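
    The iterative thresholding idea behind MCA can be illustrated with a minimal sketch. It is not the authors' toolbox code: the choice of dictionaries (an orthonormal DCT for the oscillating layer, the identity basis for spikes), the hard thresholding, the linearly decreasing threshold schedule, and all function names are assumptions made for illustration.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix; each row is one cosine atom."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (i + 0.5) * k / n)
                     for i in range(n)])
    return rows

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def hard_threshold(v, t):
    """Keep entries whose magnitude exceeds t, zero the rest."""
    return [x if abs(x) > t else 0.0 for x in v]

def mca(signal, n_iter=50):
    """Split `signal` into a DCT-sparse layer and a spike-sparse layer.

    Assumed setup (not from the source): two dictionaries, alternating
    hard thresholding, threshold decreasing linearly to zero.
    """
    n = len(signal)
    d = dct_matrix(n)                      # analysis operator
    d_t = [list(col) for col in zip(*d)]   # synthesis operator (transpose)
    smooth = [0.0] * n
    spikes = [0.0] * n
    # Start from the largest coefficient seen in either dictionary.
    t_max = max(max(abs(c) for c in matvec(d, signal)),
                max(abs(x) for x in signal))
    for it in range(n_iter):
        t = t_max * (n_iter - it - 1) / n_iter
        # Update the oscillating layer from the residual without spikes.
        residual = [s - p for s, p in zip(signal, spikes)]
        smooth = matvec(d_t, hard_threshold(matvec(d, residual), t))
        # Update the spike layer from the residual without the smooth part.
        residual = [s - c for s, c in zip(signal, smooth)]
        spikes = hard_threshold(residual, t)
    return smooth, spikes
```

    Calling `mca` on a signal built from one cosine atom plus a few isolated spikes returns the two layers separately. Each layer is sparse in its own dictionary and spread out in the other, which is the morphological diversity the abstract refers to.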

    Weakly Supervised Audio Source Separation via Spectrum Energy Preserved Wasserstein Learning

    Separating audio mixtures into individual instrument tracks has been a long-standing, challenging task. We introduce a novel weakly supervised audio source separation approach based on deep adversarial learning. Specifically, our loss function adopts the Wasserstein distance, which directly measures the distance between the distribution of the separated sources and that of the real sources, for each individual source. Moreover, a global regularization term is added to enforce the spectrum energy preservation property regardless of the separation. Unlike state-of-the-art weakly supervised models, which often involve deliberately devised constraints or careful model selection, our approach needs little prior model specification on the data and can be learned straightforwardly in an end-to-end fashion. We show that the proposed method performs competitively on a public benchmark against state-of-the-art weakly supervised methods.
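
    The abstract does not give the form of the energy preservation term. One plausible reading, sketched here purely as an assumption, is a penalty on the gap between the mixture energy and the summed energy of the separated sources in each time-frequency bin. The function name and the nested-list spectrogram layout are hypothetical.

```python
def energy_preservation_penalty(mixture_mag, source_mags):
    """Mean squared gap between mixture energy and summed source energy.

    mixture_mag: magnitude spectrogram as a list of rows (frequency x time).
    source_mags: list of spectrograms with the same shape, one per source.
    Assumed form for illustration; the paper's actual term may differ.
    """
    total = 0.0
    count = 0
    for f, row in enumerate(mixture_mag):
        for t, m in enumerate(row):
            estimated = sum(src[f][t] ** 2 for src in source_mags)
            total += (m ** 2 - estimated) ** 2
            count += 1
    return total / count
```

    Under this reading the penalty is zero whenever the source energies add up to the mixture energy in every bin, whatever the split between sources, so it constrains the separation globally without favoring a particular assignment.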