18,706 research outputs found
Probabilistic Modeling Paradigms for Audio Source Separation
This is the author's final version of the article, first published as E. Vincent, M. G. Jafari, S. A. Abdallah, M. D. Plumbley, M. E. Davies. Probabilistic Modeling Paradigms for Audio Source Separation. In W. Wang (Ed), Machine Audition: Principles, Algorithms and Systems. Chapter 7, pp. 162-185. IGI Global, 2011. ISBN 978-1-61520-919-4. DOI: 10.4018/978-1-61520-919-4.ch007file: VincentJafariAbdallahPD11-probabilistic.pdf:v\VincentJafariAbdallahPD11-probabilistic.pdf:PDF owner: markp timestamp: 2011.02.04file: VincentJafariAbdallahPD11-probabilistic.pdf:v\VincentJafariAbdallahPD11-probabilistic.pdf:PDF owner: markp timestamp: 2011.02.04Most sound scenes result from the superposition of several sources, which can be separately perceived and analyzed by human listeners. Source separation aims to provide machine listeners with similar skills by extracting the sounds of individual sources from a given scene. Existing separation systems operate either by emulating the human auditory system or by inferring the parameters of probabilistic sound models. In this chapter, the authors focus on the latter approach and provide a joint overview of established and recent models, including independent component analysis, local time-frequency models and spectral template-based models. They show that most models are instances of one of the following two general paradigms: linear modeling or variance modeling. They compare the merits of either paradigm and report objective performance figures. They also,conclude by discussing promising combinations of probabilistic priors and inference algorithms that could form the basis of future state-of-the-art systems
Image Decomposition and Separation Using Sparse Representations: An Overview
This paper gives essential insights into the use of sparsity and morphological diversity in image decomposition and source separation by reviewing our recent work in this field. The idea to morphologically decompose a signal into its building blocks is an important problem in signal processing and has far-reaching applications in science and technology. Starck , proposed a novel decomposition method—morphological component analysis (MCA)—based on sparse representation of signals. MCA assumes that each (monochannel) signal is the linear mixture of several layers, the so-called morphological components, that are morphologically distinct, e.g., sines and bumps. The success of this method relies on two tenets: sparsity and morphological diversity. That is, each morphological component is sparsely represented in a specific transform domain, and the latter is highly inefficient in representing the other content in the mixture. Once such transforms are identified, MCA is an iterative thresholding algorithm that is capable of decoupling the signal content. Sparsity and morphological diversity have also been used as a novel and effective source of diversity for blind source separation (BSS), hence extending the MCA to multichannel data. Building on these ingredients, we will provide an overview the generalized MCA introduced by the authors in and as a fast and efficient BSS method. We will illustrate the application of these algorithms on several real examples. We conclude our tour by briefly describing our software toolboxes made available for download on the Internet for sparse signal and image decomposition and separation
Sparsity and adaptivity for the blind separation of partially correlated sources
Blind source separation (BSS) is a very popular technique to analyze
multichannel data. In this context, the data are modeled as the linear
combination of sources to be retrieved. For that purpose, standard BSS methods
all rely on some discrimination principle, whether it is statistical
independence or morphological diversity, to distinguish between the sources.
However, dealing with real-world data reveals that such assumptions are rarely
valid in practice: the signals of interest are more likely partially
correlated, which generally hampers the performances of standard BSS methods.
In this article, we introduce a novel sparsity-enforcing BSS method coined
Adaptive Morphological Component Analysis (AMCA), which is designed to retrieve
sparse and partially correlated sources. More precisely, it makes profit of an
adaptive re-weighting scheme to favor/penalize samples based on their level of
correlation. Extensive numerical experiments have been carried out which show
that the proposed method is robust to the partial correlation of sources while
standard BSS techniques fail. The AMCA algorithm is evaluated in the field of
astrophysics for the separation of physical components from microwave data.Comment: submitted to IEEE Transactions on signal processin
Multi-modal dictionary learning for image separation with application in art investigation
In support of art investigation, we propose a new source separation method
that unmixes a single X-ray scan acquired from double-sided paintings. In this
problem, the X-ray signals to be separated have similar morphological
characteristics, which brings previous source separation methods to their
limits. Our solution is to use photographs taken from the front and back-side
of the panel to drive the separation process. The crux of our approach relies
on the coupling of the two imaging modalities (photographs and X-rays) using a
novel coupled dictionary learning framework able to capture both common and
disparate features across the modalities using parsimonious representations;
the common component models features shared by the multi-modal images, whereas
the innovation component captures modality-specific information. As such, our
model enables the formulation of appropriately regularized convex optimization
procedures that lead to the accurate separation of the X-rays. Our dictionary
learning framework can be tailored both to a single- and a multi-scale
framework, with the latter leading to a significant performance improvement.
Moreover, to improve further on the visual quality of the separated images, we
propose to train coupled dictionaries that ignore certain parts of the painting
corresponding to craquelure. Experimentation on synthetic and real data - taken
from digital acquisition of the Ghent Altarpiece (1432) - confirms the
superiority of our method against the state-of-the-art morphological component
analysis technique that uses either fixed or trained dictionaries to perform
image separation.Comment: submitted to IEEE Transactions on Images Processin
Structured Sparsity Models for Multiparty Speech Recovery from Reverberant Recordings
We tackle the multi-party speech recovery problem through modeling the
acoustic of the reverberant chambers. Our approach exploits structured sparsity
models to perform room modeling and speech recovery. We propose a scheme for
characterizing the room acoustic from the unknown competing speech sources
relying on localization of the early images of the speakers by sparse
approximation of the spatial spectra of the virtual sources in a free-space
model. The images are then clustered exploiting the low-rank structure of the
spectro-temporal components belonging to each source. This enables us to
identify the early support of the room impulse response function and its unique
map to the room geometry. To further tackle the ambiguity of the reflection
ratios, we propose a novel formulation of the reverberation model and estimate
the absorption coefficients through a convex optimization exploiting joint
sparsity model formulated upon spatio-spectral sparsity of concurrent speech
representation. The acoustic parameters are then incorporated for separating
individual speech signals through either structured sparse recovery or inverse
filtering the acoustic channels. The experiments conducted on real data
recordings demonstrate the effectiveness of the proposed approach for
multi-party speech recovery and recognition.Comment: 31 page
Audio Source Separation Using Sparse Representations
This is the author's final version of the article, first published as A. Nesbit, M. G. Jafari, E. Vincent and M. D. Plumbley. Audio Source Separation Using Sparse Representations. In W. Wang (Ed), Machine Audition: Principles, Algorithms and Systems. Chapter 10, pp. 246-264. IGI Global, 2011. ISBN 978-1-61520-919-4. DOI: 10.4018/978-1-61520-919-4.ch010file: NesbitJafariVincentP11-audio.pdf:n\NesbitJafariVincentP11-audio.pdf:PDF owner: markp timestamp: 2011.02.04file: NesbitJafariVincentP11-audio.pdf:n\NesbitJafariVincentP11-audio.pdf:PDF owner: markp timestamp: 2011.02.04The authors address the problem of audio source separation, namely, the recovery of audio signals from recordings of mixtures of those signals. The sparse component analysis framework is a powerful method for achieving this. Sparse orthogonal transforms, in which only few transform coefficients differ significantly from zero, are developed; once the signal has been transformed, energy is apportioned from each transform coefficient to each estimated source, and, finally, the signal is reconstructed using the inverse transform. The overriding aim of this chapter is to demonstrate how this framework, as exemplified here by two different decomposition methods which adapt to the signal to represent it sparsely, can be used to solve different problems in different mixing scenarios. To address the instantaneous (neither delays nor echoes) and underdetermined (more sources than mixtures) mixing model, a lapped orthogonal transform is adapted to the signal by selecting a basis from a library of predetermined bases. This method is highly related to the windowing methods used in the MPEG audio coding framework. In considering the anechoic (delays but no echoes) and determined (equal number of sources and mixtures) mixing case, a greedy adaptive transform is used based on orthogonal basis functions that are learned from the observed data, instead of being selected from a predetermined library of bases. This is found to encode the signal characteristics, by introducing a feedback system between the bases and the observed data. Experiments on mixtures of speech and music signals demonstrate that these methods give good signal approximations and separation performance, and indicate promising directions for future research
- …