10,898 research outputs found

    Probabilistic Modeling Paradigms for Audio Source Separation

    This is the author's final version of the article, first published as E. Vincent, M. G. Jafari, S. A. Abdallah, M. D. Plumbley, M. E. Davies. Probabilistic Modeling Paradigms for Audio Source Separation. In W. Wang (Ed), Machine Audition: Principles, Algorithms and Systems. Chapter 7, pp. 162-185. IGI Global, 2011. ISBN 978-1-61520-919-4. DOI: 10.4018/978-1-61520-919-4.ch007. Most sound scenes result from the superposition of several sources, which can be separately perceived and analyzed by human listeners. Source separation aims to provide machine listeners with similar skills by extracting the sounds of individual sources from a given scene. Existing separation systems operate either by emulating the human auditory system or by inferring the parameters of probabilistic sound models. In this chapter, the authors focus on the latter approach and provide a joint overview of established and recent models, including independent component analysis, local time-frequency models and spectral template-based models. They show that most models are instances of one of the following two general paradigms: linear modeling or variance modeling. They compare the merits of either paradigm and report objective performance figures. They conclude by discussing promising combinations of probabilistic priors and inference algorithms that could form the basis of future state-of-the-art systems.

    Differential fast fixed-point algorithms for underdetermined instantaneous and convolutive partial blind source separation

    This paper concerns underdetermined linear instantaneous and convolutive blind source separation (BSS), i.e., the case when the number of observed mixed signals is lower than the number of sources. We propose partial BSS methods, which separate supposedly nonstationary sources of interest (while keeping residual components for the other, supposedly stationary, "noise" sources). These methods are based on the general differential BSS concept that we introduced before. In the instantaneous case, the approach proposed in this paper consists of a differential extension of the FastICA method (which does not apply to underdetermined mixtures). In the convolutive case, we extend our recent time-domain fast fixed-point C-FICA algorithm to underdetermined mixtures. Both proposed approaches thus keep the attractive features of the FastICA and C-FICA methods. Our approaches are based on differential sphering processes, followed by the optimization of the differential nonnormalized kurtosis that we introduce in this paper. Experimental tests show that these differential algorithms are much more robust to noise sources than the standard FastICA and C-FICA algorithms. Comment: this paper describes our differential FastICA-like algorithms for linear instantaneous and convolutive underdetermined mixture
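
    The "differential nonnormalized kurtosis" mentioned in this abstract can be illustrated with a minimal sketch. It assumes, as our reading of the abstract rather than the paper's exact definition, that "differential" means the difference of a statistic estimated over two time windows, and that the nonnormalized kurtosis of a zero-mean signal y is E[y^4] - 3 (E[y^2])^2. Because this quantity is a fourth-order cumulant, it is additive over independent sources, so a source with the same statistics in both windows would cancel in the difference. The function names are hypothetical.

```python
def nonnormalized_kurtosis(samples):
    """Return E[y^4] - 3 * (E[y^2])^2 for the mean-removed samples."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [y - mean for y in samples]
    m2 = sum(y ** 2 for y in centered) / n
    m4 = sum(y ** 4 for y in centered) / n
    return m4 - 3.0 * m2 ** 2


def differential_kurtosis(window_a, window_b):
    """Difference of the statistic between two windows (assumed form)."""
    return nonnormalized_kurtosis(window_b) - nonnormalized_kurtosis(window_a)
```

    Under these assumptions, a signal whose two windows are identical gives a differential value of zero, while a signal whose amplitude changes between the windows does not.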

    Audio Source Separation Using Sparse Representations

    This is the author's final version of the article, first published as A. Nesbit, M. G. Jafari, E. Vincent and M. D. Plumbley. Audio Source Separation Using Sparse Representations. In W. Wang (Ed), Machine Audition: Principles, Algorithms and Systems. Chapter 10, pp. 246-264. IGI Global, 2011. ISBN 978-1-61520-919-4. DOI: 10.4018/978-1-61520-919-4.ch010. The authors address the problem of audio source separation, namely, the recovery of audio signals from recordings of mixtures of those signals. The sparse component analysis framework is a powerful method for achieving this. Sparse orthogonal transforms, in which only a few transform coefficients differ significantly from zero, are developed; once the signal has been transformed, energy is apportioned from each transform coefficient to each estimated source, and, finally, the signal is reconstructed using the inverse transform. The overriding aim of this chapter is to demonstrate how this framework, as exemplified here by two different decomposition methods which adapt to the signal to represent it sparsely, can be used to solve different problems in different mixing scenarios. To address the instantaneous (neither delays nor echoes) and underdetermined (more sources than mixtures) mixing model, a lapped orthogonal transform is adapted to the signal by selecting a basis from a library of predetermined bases. This method is highly related to the windowing methods used in the MPEG audio coding framework. In considering the anechoic (delays but no echoes) and determined (equal number of sources and mixtures) mixing case, a greedy adaptive transform is used based on orthogonal basis functions that are learned from the observed data, instead of being selected from a predetermined library of bases. This is found to encode the signal characteristics, by introducing a feedback system between the bases and the observed data. Experiments on mixtures of speech and music signals demonstrate that these methods give good signal approximations and separation performance, and indicate promising directions for future research.
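
    The three-stage pipeline described in this abstract (transform, apportion each coefficient to a source, inverse transform) can be sketched on a toy instantaneous mixture. This is an illustration under stated assumptions, not the chapter's method: a one-level Haar transform stands in for the adaptive lapped orthogonal transform, the mixing directions (`angles`) are assumed known, and each coefficient is given entirely to one source. All names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(x):
    """One-level orthonormal Haar transform of an even-length signal."""
    half = len(x) // 2
    approx = [(x[2 * i] + x[2 * i + 1]) / SQRT2 for i in range(half)]
    detail = [(x[2 * i] - x[2 * i + 1]) / SQRT2 for i in range(half)]
    return approx + detail


def haar_inverse(c):
    """Inverse of haar_forward."""
    half = len(c) // 2
    x = []
    for i in range(half):
        a, d = c[i], c[half + i]
        x.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return x


def separate(mix1, mix2, angles):
    """Give each coefficient pair to the best-aligned mixing direction."""
    c1, c2 = haar_forward(mix1), haar_forward(mix2)
    coeffs = [[0.0] * len(c1) for _ in angles]
    for k, (u, v) in enumerate(zip(c1, c2)):
        # Projection of the coefficient pair onto each unit mixing direction.
        proj = [u * math.cos(t) + v * math.sin(t) for t in angles]
        best = max(range(len(angles)), key=lambda j: abs(proj[j]))
        coeffs[best][k] = proj[best]
    return [haar_inverse(c) for c in coeffs]


# Toy case: three sources, two mixtures (underdetermined). Each source
# occupies a different Haar coefficient, so the sources never overlap.
angles = [0.3, 1.2, 2.2]
sources = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, -1.0],
    [1.0, -1.0, 0.0, 0.0],
]
mix1 = [sum(math.cos(t) * s[n] for t, s in zip(angles, sources)) for n in range(4)]
mix2 = [sum(math.sin(t) * s[n] for t, s in zip(angles, sources)) for n in range(4)]
estimates = separate(mix1, mix2, angles)
```

    Recovery is exact here only because the toy sources have disjoint transform coefficients; with real audio the sources overlap in some coefficients and the hard assignment introduces errors.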

    Underdetermined source separation using a sparse STFT framework and weighted laplacian directional modelling

    The instantaneous underdetermined audio source separation problem for a K-sensor, L-source mixing scenario (where K < L) has been addressed by many different approaches, provided the sources remain quite distinct in the virtual positioning space spanned by the sensors. This problem can be tackled as a directional clustering problem along the source position angles in the mixture. The use of Generalised Directional Laplacian Densities (DLD) in the MDCT domain for underdetermined source separation has been proposed before. Here, we derive weighted mixtures of DLDs in a sparser representation of the data in the STFT domain to perform separation. The proposed approach yields improved results compared to our previous offering and compares favourably with the state-of-the-art. Comment: EUSIPCO 2016, Budapest, Hungary
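
    The idea of clustering "along the source position angles" can be sketched for K = 2 sensors. The sketch below swaps the paper's weighted DLD mixture for a plain nearest-angle assignment and assumes the source angles are already known; it only shows how a two-sensor coefficient maps to a direction on a half-circle and is then labelled. Function names are hypothetical.

```python
import math


def direction_angle(x1, x2):
    """Angle of a two-sensor coefficient, folded into [0, pi)."""
    return math.atan2(x2, x1) % math.pi


def angular_distance(a, b):
    """Distance between two directions on a half-circle of period pi."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)


def assign_to_sources(points, source_angles):
    """Label each (x1, x2) point with the index of the nearest source angle."""
    labels = []
    for x1, x2 in points:
        theta = direction_angle(x1, x2)
        nearest = min(
            range(len(source_angles)),
            key=lambda j: angular_distance(theta, source_angles[j]),
        )
        labels.append(nearest)
    return labels
```

    Folding the angle into [0, pi) treats a coefficient and its negation as the same direction, which matters because the sign of a source coefficient does not change the line it lies on.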

    Of `Cocktail Parties' and Exoplanets

    The characterisation of ever smaller and fainter extrasolar planets requires an intricate understanding of one's data and the analysis techniques used. Correcting the raw data at the 10^-4 level of accuracy in flux is one of the central challenges. This can be difficult for instruments that do not feature a calibration plan for such high precision measurements. Here, it is not always obvious how to de-correlate the data using auxiliary information of the instrument and it becomes paramount to know how well one can disentangle instrument systematics from one's data, given nothing but the data itself. We propose a non-parametric machine learning algorithm, based on the concept of independent component analysis, to de-convolve the systematic noise and all non-Gaussian signals from the desired astrophysical signal. Such a `blind' signal de-mixing is commonly known as the `Cocktail Party problem' in signal-processing. Given multiple simultaneous observations of the same exoplanetary eclipse, as in the case of spectrophotometry, we show that we can often disentangle systematic noise from the original light curve signal without the use of any complementary information of the instrument. In this paper, we explore these signal extraction techniques using simulated data and two data sets observed with the Hubble-NICMOS instrument. Another important application is the de-correlation of the exoplanetary signal from time-correlated stellar variability. Using data obtained by the Kepler mission we show that the desired signal can be de-convolved from the stellar noise using a single time series spanning several eclipse events. Such non-parametric techniques can provide important confirmations of the existing parametric corrections reported in the literature, and their associated results. Additionally they can substantially improve the precision of exoplanetary light curve analysis in the future. Comment: ApJ accepted