933 research outputs found

    Probabilistic Modeling Paradigms for Audio Source Separation

    Get PDF
    This is the author's final version of the article, first published as E. Vincent, M. G. Jafari, S. A. Abdallah, M. D. Plumbley, M. E. Davies. Probabilistic Modeling Paradigms for Audio Source Separation. In W. Wang (Ed), Machine Audition: Principles, Algorithms and Systems. Chapter 7, pp. 162-185. IGI Global, 2011. ISBN 978-1-61520-919-4. DOI: 10.4018/978-1-61520-919-4.ch007file: VincentJafariAbdallahPD11-probabilistic.pdf:v\VincentJafariAbdallahPD11-probabilistic.pdf:PDF owner: markp timestamp: 2011.02.04file: VincentJafariAbdallahPD11-probabilistic.pdf:v\VincentJafariAbdallahPD11-probabilistic.pdf:PDF owner: markp timestamp: 2011.02.04Most sound scenes result from the superposition of several sources, which can be separately perceived and analyzed by human listeners. Source separation aims to provide machine listeners with similar skills by extracting the sounds of individual sources from a given scene. Existing separation systems operate either by emulating the human auditory system or by inferring the parameters of probabilistic sound models. In this chapter, the authors focus on the latter approach and provide a joint overview of established and recent models, including independent component analysis, local time-frequency models and spectral template-based models. They show that most models are instances of one of the following two general paradigms: linear modeling or variance modeling. They compare the merits of either paradigm and report objective performance figures. They also,conclude by discussing promising combinations of probabilistic priors and inference algorithms that could form the basis of future state-of-the-art systems

    Technical report on Separation methods for nonlinear mixtures

    Get PDF

    Acoustic Space Learning for Sound Source Separation and Localization on Binaural Manifolds

    Get PDF
    In this paper we address the problems of modeling the acoustic space generated by a full-spectrum sound source and of using the learned model for the localization and separation of multiple sources that simultaneously emit sparse-spectrum sounds. We lay theoretical and methodological grounds in order to introduce the binaural manifold paradigm. We perform an in-depth study of the latent low-dimensional structure of the high-dimensional interaural spectral data, based on a corpus recorded with a human-like audiomotor robot head. A non-linear dimensionality reduction technique is used to show that these data lie on a two-dimensional (2D) smooth manifold parameterized by the motor states of the listener, or equivalently, the sound source directions. We propose a probabilistic piecewise affine mapping model (PPAM) specifically designed to deal with high-dimensional data exhibiting an intrinsic piecewise linear structure. We derive a closed-form expectation-maximization (EM) procedure for estimating the model parameters, followed by Bayes inversion for obtaining the full posterior density function of a sound source direction. We extend this solution to deal with missing data and redundancy in real world spectrograms, and hence for 2D localization of natural sound sources such as speech. We further generalize the model to the challenging case of multiple sound sources and we propose a variational EM framework. The associated algorithm, referred to as variational EM for source separation and localization (VESSL) yields a Bayesian estimation of the 2D locations and time-frequency masks of all the sources. Comparisons of the proposed approach with several existing methods reveal that the combination of acoustic-space learning with Bayesian inference enables our method to outperform state-of-the-art methods.Comment: 19 pages, 9 figures, 3 table

    Decorrelation of Neutral Vector Variables: Theory and Applications

    Full text link
    In this paper, we propose novel strategies for neutral vector variable decorrelation. Two fundamental invertible transformations, namely serial nonlinear transformation and parallel nonlinear transformation, are proposed to carry out the decorrelation. For a neutral vector variable, which is not multivariate Gaussian distributed, the conventional principal component analysis (PCA) cannot yield mutually independent scalar variables. With the two proposed transformations, a highly negatively correlated neutral vector can be transformed to a set of mutually independent scalar variables with the same degrees of freedom. We also evaluate the decorrelation performances for the vectors generated from a single Dirichlet distribution and a mixture of Dirichlet distributions. The mutual independence is verified with the distance correlation measurement. The advantages of the proposed decorrelation strategies are intensively studied and demonstrated with synthesized data and practical application evaluations

    Blind separation of underdetermined mixtures with additive white and pink noises

    Get PDF
    This paper presents an approach for underdetermined blind source separation in the case of additive Gaussian white noise and pink noise. Likewise, the proposed approach is applicable in the case of separating I + 3 sources from I mixtures with additive two kinds of noises. This situation is more challenging and suitable to practical real world problems. Moreover, unlike to some conventional approaches, the sparsity conditions are not imposed. Firstly, the mixing matrix is estimated based on an algorithm that combines short time Fourier transform and rough-fuzzy clustering. Then, the mixed signals are normalized and the source signals are recovered using modified Gradient descent Local Hierarchical Alternating Least Squares Algorithm exploiting the mixing matrix obtained from the previous step as an input and initialized by multiplicative algorithm for matrix factorization based on alpha divergence. The experiments and simulation results show that the proposed approach can separate I + 3 source signals from I mixed signals, and it has superior evaluation performance compared to some conventional approaches
    • …
    corecore