827 research outputs found

    Non-negative mixtures

    Get PDF
    This is the author's accepted pre-print of the article, first published as M. D. Plumbley, A. Cichocki and R. Bro. Non-negative mixtures. In P. Comon and C. Jutten (Ed), Handbook of Blind Source Separation: Independent Component Analysis and Applications. Chapter 13, pp. 515-547. Academic Press, Feb 2010. ISBN 978-0-12-374726-6 DOI: 10.1016/B978-0-12-374726-6.00018-7file: Proof:p\PlumbleyCichockiBro10-non-negative.pdf:PDF owner: markp timestamp: 2011.04.26file: Proof:p\PlumbleyCichockiBro10-non-negative.pdf:PDF owner: markp timestamp: 2011.04.2

    Audio Source Separation Using Sparse Representations

    Get PDF
    This is the author's final version of the article, first published as A. Nesbit, M. G. Jafari, E. Vincent and M. D. Plumbley. Audio Source Separation Using Sparse Representations. In W. Wang (Ed), Machine Audition: Principles, Algorithms and Systems. Chapter 10, pp. 246-264. IGI Global, 2011. ISBN 978-1-61520-919-4. DOI: 10.4018/978-1-61520-919-4.ch010file: NesbitJafariVincentP11-audio.pdf:n\NesbitJafariVincentP11-audio.pdf:PDF owner: markp timestamp: 2011.02.04file: NesbitJafariVincentP11-audio.pdf:n\NesbitJafariVincentP11-audio.pdf:PDF owner: markp timestamp: 2011.02.04The authors address the problem of audio source separation, namely, the recovery of audio signals from recordings of mixtures of those signals. The sparse component analysis framework is a powerful method for achieving this. Sparse orthogonal transforms, in which only few transform coefficients differ significantly from zero, are developed; once the signal has been transformed, energy is apportioned from each transform coefficient to each estimated source, and, finally, the signal is reconstructed using the inverse transform. The overriding aim of this chapter is to demonstrate how this framework, as exemplified here by two different decomposition methods which adapt to the signal to represent it sparsely, can be used to solve different problems in different mixing scenarios. To address the instantaneous (neither delays nor echoes) and underdetermined (more sources than mixtures) mixing model, a lapped orthogonal transform is adapted to the signal by selecting a basis from a library of predetermined bases. This method is highly related to the windowing methods used in the MPEG audio coding framework. In considering the anechoic (delays but no echoes) and determined (equal number of sources and mixtures) mixing case, a greedy adaptive transform is used based on orthogonal basis functions that are learned from the observed data, instead of being selected from a predetermined library of bases. This is found to encode the signal characteristics, by introducing a feedback system between the bases and the observed data. Experiments on mixtures of speech and music signals demonstrate that these methods give good signal approximations and separation performance, and indicate promising directions for future research

    Multimodal methods for blind source separation of audio sources

    Get PDF
    The enhancement of the performance of frequency domain convolutive blind source separation (FDCBSS) techniques when applied to the problem of separating audio sources recorded in a room environment is the focus of this thesis. This challenging application is termed the cocktail party problem and the ultimate aim would be to build a machine which matches the ability of a human being to solve this task. Human beings exploit both their eyes and their ears in solving this task and hence they adopt a multimodal approach, i.e. they exploit both audio and video modalities. New multimodal methods for blind source separation of audio sources are therefore proposed in this work as a step towards realizing such a machine. The geometry of the room environment is initially exploited to improve the separation performance of a FDCBSS algorithm. The positions of the human speakers are monitored by video cameras and this information is incorporated within the FDCBSS algorithm in the form of constraints added to the underlying cross-power spectral density matrix-based cost function which measures separation performance. [Continues.

    Sparsity and adaptivity for the blind separation of partially correlated sources

    Get PDF
    Blind source separation (BSS) is a very popular technique to analyze multichannel data. In this context, the data are modeled as the linear combination of sources to be retrieved. For that purpose, standard BSS methods all rely on some discrimination principle, whether it is statistical independence or morphological diversity, to distinguish between the sources. However, dealing with real-world data reveals that such assumptions are rarely valid in practice: the signals of interest are more likely partially correlated, which generally hampers the performances of standard BSS methods. In this article, we introduce a novel sparsity-enforcing BSS method coined Adaptive Morphological Component Analysis (AMCA), which is designed to retrieve sparse and partially correlated sources. More precisely, it makes profit of an adaptive re-weighting scheme to favor/penalize samples based on their level of correlation. Extensive numerical experiments have been carried out which show that the proposed method is robust to the partial correlation of sources while standard BSS techniques fail. The AMCA algorithm is evaluated in the field of astrophysics for the separation of physical components from microwave data.Comment: submitted to IEEE Transactions on signal processin

    Overcomplete Blind Source Separation by Combining ICA and Binary Time-Frequency Masking

    Get PDF
    A limitation in many source separation tasks is that the number of source signals has to be known in advance. Further, in order to achieve good performance, the number of sources cannot exceed the number of sensors. In many real-world applications these limitations are too strict. We propose a novel method for over-complete blind source separation. Two powerful source separation techniques have been combined, independent component analysis and binary time-frequency masking. Hereby, it is possible to iteratively extract each speech signal from the mixture. By using merely two microphones we can separate up to six mixed speech signals under anechoic conditions. The number of source signals is not assumed to be known in advance. It is also possible to maintain the extracted signals as stereo signal
    • …
    corecore