
    Sparse BSS in the presence of outliers

    submitted to SPARS15—While real-world data are often grossly corrupted, most blind source separation (BSS) techniques give erroneous results in the presence of outliers. We propose a robust algorithm that jointly estimates the sparse sources and the outliers without requiring any prior knowledge of the outliers. More precisely, it uses an alternating weighting scheme to weaken the influence of the estimated outliers. A preliminary experiment demonstrates the advantage of the proposed algorithm over state-of-the-art BSS methods. I. PROBLEM FORMULATION Suppose we are given m noisy observations {Xi} i=1..m of unknown linear mixtures of n ≤ m sparse sources {Sj} j=1..n with t > m samples. It is generally assumed that these data are corrupted by Gaussian noise, accounting for instrumental or model imperfections. However, in many applications some entries are additionally corrupted by outliers, leading to the following model: X = AS + O + N, with X the observations, A the mixing matrix, S the sources, O the outliers, and N the Gaussian noise. In the presence of outliers, the key difficulty lies in separating the components O and AS. To this end, assuming that the term AS is low-rank, some strategies [4] suggest pre-processing the data to estimate and remove the outliers with RPCA [3]. However, besides the fact that low-rankness is generally restrictive for most BSS problems, the source separation is severely hampered if the outliers are not well estimated. We therefore introduce a method that estimates the sources in the presence of outliers without pre-processing. To the best of our knowledge, this problem has only been studied in [5], using the β-divergence. Unlike [5], we propose to estimate the outliers and the sources jointly by exploiting their sparsity. II. ALGORITHM

    Direction of Arrival with One Microphone, a few LEGOs, and Non-Negative Matrix Factorization

    Conventional approaches to sound source localization require at least two microphones. It is known, however, that people with unilateral hearing loss can also localize sounds. Monaural localization is possible thanks to the scattering by the head, though it hinges on learning the spectra of the various sources. We take inspiration from this human ability to propose algorithms for accurate sound source localization using a single microphone embedded in an arbitrary scattering structure. The structure modifies the frequency response of the microphone in a direction-dependent way, giving each direction a signature. While knowing those signatures is sufficient to localize sources of white noise, localizing speech is much more challenging: it is an ill-posed inverse problem, which we regularize by prior knowledge in the form of learned non-negative dictionaries. We demonstrate a monaural speech localization algorithm based on non-negative matrix factorization that does not depend on sophisticated, designed scatterers. In fact, we show experimental results with ad hoc scatterers made of LEGO bricks. Even with these rudimentary structures we can accurately localize arbitrary speakers; that is, we do not need to learn the dictionary for the particular speaker to be localized. Finally, we discuss multi-source localization and the related limitations of our approach. Comment: This article has been accepted for publication in IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)
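    For the white-noise case the abstract describes, localization reduces to matching the observed average spectrum against the per-direction signatures. The sketch below uses synthetic signatures and a flat-spectrum source; the NMF dictionaries needed for speech are beyond this toy, and all shapes and counts are made-up:

```python
import numpy as np

rng = np.random.default_rng(1)

# Each direction d imposes a known magnitude response ("signature")
# on the single microphone, thanks to the scattering structure.
n_dirs, n_freqs = 8, 64
signatures = np.abs(rng.standard_normal((n_dirs, n_freqs))) + 0.1

true_dir = 3
frames = 50
# A white-noise source has a flat expected spectrum, so the observed
# average magnitude spectrum is (up to noise) the true direction's signature.
noise_spec = np.abs(rng.standard_normal((frames, n_freqs)))
observed = noise_spec.mean(axis=0) * signatures[true_dir]

# Localize by cosine-matching the normalized spectrum against each signature.
obs_n = observed / np.linalg.norm(observed)
sig_n = signatures / np.linalg.norm(signatures, axis=1, keepdims=True)
estimate = int(np.argmax(sig_n @ obs_n))
print(estimate)   # should recover the true direction
```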

    Shift-Invariant Kernel Additive Modelling for Audio Source Separation

    A major goal in blind source separation is to model the inherent characteristics of the sources in order to identify and separate them. While most state-of-the-art approaches are supervised methods trained on large datasets, interest in non-data-driven approaches such as Kernel Additive Modelling (KAM) remains high due to their interpretability and adaptability. KAM performs the separation of a given source by applying robust statistics on the time-frequency bins selected by a source-specific kernel function, commonly the K-NN function. This choice assumes that the source of interest repeats in both time and frequency. In practice, this assumption does not always hold. Therefore, we introduce a shift-invariant kernel function capable of identifying similar spectral content even under frequency shifts. This way, we can considerably increase the amount of suitable sound material available to the robust statistics. While this leads to an increase in separation performance, a basic formulation is computationally expensive. Therefore, we additionally present acceleration techniques that lower the overall computational complexity. Comment: Feedback is welcome
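    The kernel idea can be sketched as a similarity maximized over small frequency shifts; the cosine form and the shift range below are illustrative stand-ins, not the paper's exact kernel:

```python
import numpy as np

def shift_invariant_sim(u, v, max_shift=3):
    """Cosine similarity between two spectra, maximized over small
    frequency shifts of v (illustrative stand-in for the kernel)."""
    best = -1.0
    for s in range(-max_shift, max_shift + 1):
        vs = np.roll(v, s)
        c = np.dot(u, vs) / (np.linalg.norm(u) * np.linalg.norm(vs) + 1e-12)
        best = max(best, c)
    return best

# A spectrum and a copy shifted up by two bins: plain cosine similarity
# misses the repetition, the shift-invariant one recovers it.
rng = np.random.default_rng(0)
u = np.abs(rng.standard_normal(32))
v = np.roll(u, 2)
plain = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
print(plain, shift_invariant_sim(u, v))   # the shift-invariant score is higher
```

    The brute-force loop over shifts is exactly what makes the basic formulation expensive at scale, which motivates the acceleration techniques the abstract mentions.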

    A Semi-Blind Source Separation Method for Differential Optical Absorption Spectroscopy of Atmospheric Gas Mixtures

    Differential optical absorption spectroscopy (DOAS) is a powerful tool for detecting and quantifying trace gases in atmospheric chemistry [Platt and Stutz, 2008]. DOAS spectra consist of a linear combination of complex multi-peak multi-scale structures. Most DOAS analysis routines in use today are based on least squares techniques; for example, the approach developed in the 1970s uses polynomial fits to remove a slowly varying background, and known reference spectra to retrieve the identity and concentrations of reference gases. An open problem is to identify unknown gases in the fitting residuals for complex atmospheric mixtures. In this work, we develop a novel three-step semi-blind source separation method. The first step uses a multi-resolution analysis to remove the slow-varying and fast-varying components in the DOAS spectral data matrix X. The second step decomposes the preprocessed data X̂ from the first step into a linear combination of the reference spectra plus a remainder, X̂ = A S + R, where the columns of the matrix A are known reference spectra, and the matrix S contains the unknown non-negative coefficients that are proportional to concentration. The second step is realized by the convex minimization problem S = arg min ‖X̂ − A S‖, where the norm is a hybrid ℓ1/ℓ2 norm (Huber estimator) that helps to maintain the non-negativity of S. The third step performs a blind independent component analysis of the remainder matrix R to extract remnant gas components. We first illustrate the proposed method by processing a set of DOAS experimental data, with a satisfactory blind extraction of an a priori unknown trace gas (ozone) from the remainder matrix. Numerical results also show that the method can identify multiple trace gases from the residuals. Comment: submitted to Journal of Scientific Computing
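    The second step can be sketched as a projected-gradient solve of the non-negative robust fit; the step size, iteration count, Huber scale, and problem sizes below are illustrative choices, not the paper's exact solver:

```python
import numpy as np

def huber_grad(r, delta=1.0):
    # Gradient of the Huber loss: quadratic near zero, linear in the tails.
    return np.where(np.abs(r) <= delta, r, delta * np.sign(r))

def fit_nonneg_huber(X, A, n_iter=500, delta=1.0):
    """Projected gradient for S = argmin sum huber(X - A S) s.t. S >= 0
    (a simple stand-in for the abstract's second step)."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    S = np.zeros((A.shape[1], X.shape[1]))
    for _ in range(n_iter):
        R = X - A @ S
        S = S + step * (A.T @ huber_grad(R, delta))
        S = np.maximum(S, 0.0)        # keep concentrations non-negative
    return S

rng = np.random.default_rng(0)
A = np.abs(rng.standard_normal((40, 3)))      # "reference spectra" (columns)
S_true = np.abs(rng.standard_normal((3, 5)))  # non-negative concentrations
X = A @ S_true + 0.01 * rng.standard_normal((40, 5))
S_hat = fit_nonneg_huber(X, A)
print(np.max(np.abs(S_hat - S_true)))         # should be small
```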

    Joint Tensor Factorization and Outlying Slab Suppression with Applications

    We consider factoring low-rank tensors in the presence of outlying slabs. This problem is important in practice, because data collected in many real-world applications, such as speech, fluorescence, and some social network data, fit this paradigm. Prior work tackles this problem by iteratively selecting a fixed number of slabs and fitting, a procedure which may not converge. We formulate this problem from a group-sparsity promoting point of view, and propose an alternating optimization framework to handle the corresponding ℓp (0 < p ≤ 1) minimization-based low-rank tensor factorization problem. The proposed algorithm has a per-iteration complexity similar to that of the plain trilinear alternating least squares (TALS) algorithm. Convergence of the proposed algorithm is also easy to analyze under the framework of alternating optimization and its variants. In addition, regularization and constraints can be easily incorporated to make use of a priori information on the latent loading factors. Simulations and real data experiments on blind speech separation, fluorescence data analysis, and social network mining are used to showcase the effectiveness of the proposed algorithm.
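    The group-sparsity mechanism can be illustrated by the IRLS-style weights an ℓp group penalty induces over slabs; this sketch shows the weighting alone, not the full alternating tensor factorization, and the sizes are made-up:

```python
import numpy as np

def slab_weights(R, p=0.5, eps=1e-8):
    """IRLS-style weights for an l_p (0 < p <= 1) group penalty over slabs:
    slabs with large residual norms get small weights and are suppressed."""
    norms = np.linalg.norm(R, axis=1)
    return (norms ** 2 + eps) ** (p / 2 - 1)

rng = np.random.default_rng(0)
R = 0.1 * rng.standard_normal((6, 20))   # per-slab residuals
R[2] += 5.0                              # slab 2 is an outlier
w = slab_weights(R)
print(np.argmin(w))                      # the outlying slab gets the smallest weight
```

    In an alternating scheme, these weights multiply each slab's contribution to the next least-squares subproblem, so outlying slabs are suppressed rather than discarded by a hard selection step.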

    Acoustic Space Learning for Sound Source Separation and Localization on Binaural Manifolds

    In this paper we address the problems of modeling the acoustic space generated by a full-spectrum sound source and of using the learned model for the localization and separation of multiple sources that simultaneously emit sparse-spectrum sounds. We lay theoretical and methodological grounds in order to introduce the binaural manifold paradigm. We perform an in-depth study of the latent low-dimensional structure of the high-dimensional interaural spectral data, based on a corpus recorded with a human-like audiomotor robot head. A non-linear dimensionality reduction technique is used to show that these data lie on a two-dimensional (2D) smooth manifold parameterized by the motor states of the listener, or equivalently, the sound source directions. We propose a probabilistic piecewise affine mapping model (PPAM) specifically designed to deal with high-dimensional data exhibiting an intrinsic piecewise linear structure. We derive a closed-form expectation-maximization (EM) procedure for estimating the model parameters, followed by Bayes inversion for obtaining the full posterior density function of a sound source direction. We extend this solution to deal with missing data and redundancy in real world spectrograms, and hence for 2D localization of natural sound sources such as speech. We further generalize the model to the challenging case of multiple sound sources and we propose a variational EM framework. The associated algorithm, referred to as variational EM for source separation and localization (VESSL), yields a Bayesian estimation of the 2D locations and time-frequency masks of all the sources. Comparisons of the proposed approach with several existing methods reveal that the combination of acoustic-space learning with Bayesian inference enables our method to outperform state-of-the-art methods. Comment: 19 pages, 9 figures, 3 tables
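    As a toy illustration of the Bayes-inversion step alone (not the PPAM model or the variational EM machinery), the sketch below inverts a known direction-to-feature mapping into a posterior over a grid of candidate directions; the mapping, noise level, and grid are all made-up stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)

# A known forward mapping from direction index to an interaural feature
# vector, plus a noisy observation from one direction.
n_dirs, dim = 12, 16
mapping = rng.standard_normal((n_dirs, dim))  # one feature vector per direction
sigma = 0.3

true_dir = 7
obs = mapping[true_dir] + sigma * rng.standard_normal(dim)

# Gaussian likelihood with a flat prior -> posterior over the direction grid.
log_lik = -0.5 * np.sum((obs - mapping) ** 2, axis=1) / sigma ** 2
post = np.exp(log_lik - log_lik.max())
post /= post.sum()
print(int(np.argmax(post)))   # MAP direction, should match true_dir
```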

    Dictionary Learning of Convolved Signals

    Assuming that a set of source signals is sparsely representable in a given dictionary, we show how their sparse recovery fails whenever we can only measure a convolved observation of them. Starting from this motivation, we develop a block coordinate descent method which aims to learn a convolved dictionary and provide a sparse representation of the observed signals with small residual norm. We compare the proposed approach to the K-SVD dictionary learning algorithm and show through numerical experiments on synthetic signals that, under some conditions on the problem data, our technique converges in a fixed number of iterations to a sparse representation with smaller residual norm.
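    The motivating failure can be reproduced in a few lines: a signal that is exactly 3-sparse in a dictionary D stops being sparsely representable once we only observe a convolved version of it. The dictionary, kernel, and sizes here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# A signal that is 3-sparse in a normalized random dictionary D.
n, k = 64, 128
D = rng.standard_normal((n, k))
D /= np.linalg.norm(D, axis=0)
x = np.zeros(k)
x[[5, 40, 90]] = [1.0, -2.0, 1.5]        # 3-sparse code
s = D @ x                                 # source signal
h = np.array([0.5, 1.0, 0.5])             # unknown convolution kernel
y = np.convolve(s, h, mode="same")        # what we actually measure

# The best-fitting coefficients for y in D are dense, not 3-sparse:
x_y = np.linalg.lstsq(D, y, rcond=None)[0]
frac_large = np.mean(np.abs(x_y) > 0.1 * np.abs(x_y).max())
print(frac_large)   # far more than 3/128 of the coefficients are significant
```

    This is why the method learns the convolved dictionary jointly with the codes instead of coding against the original D.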