504 research outputs found
Audio Source Separation Using Sparse Representations
This is the author's final version of the article, first published as A. Nesbit, M. G. Jafari, E. Vincent and M. D. Plumbley. Audio Source Separation Using Sparse Representations. In W. Wang (Ed), Machine Audition: Principles, Algorithms and Systems. Chapter 10, pp. 246-264. IGI Global, 2011. ISBN 978-1-61520-919-4. DOI: 10.4018/978-1-61520-919-4.ch010file: NesbitJafariVincentP11-audio.pdf:n\NesbitJafariVincentP11-audio.pdf:PDF owner: markp timestamp: 2011.02.04file: NesbitJafariVincentP11-audio.pdf:n\NesbitJafariVincentP11-audio.pdf:PDF owner: markp timestamp: 2011.02.04The authors address the problem of audio source separation, namely, the recovery of audio signals from recordings of mixtures of those signals. The sparse component analysis framework is a powerful method for achieving this. Sparse orthogonal transforms, in which only few transform coefficients differ significantly from zero, are developed; once the signal has been transformed, energy is apportioned from each transform coefficient to each estimated source, and, finally, the signal is reconstructed using the inverse transform. The overriding aim of this chapter is to demonstrate how this framework, as exemplified here by two different decomposition methods which adapt to the signal to represent it sparsely, can be used to solve different problems in different mixing scenarios. To address the instantaneous (neither delays nor echoes) and underdetermined (more sources than mixtures) mixing model, a lapped orthogonal transform is adapted to the signal by selecting a basis from a library of predetermined bases. This method is highly related to the windowing methods used in the MPEG audio coding framework. In considering the anechoic (delays but no echoes) and determined (equal number of sources and mixtures) mixing case, a greedy adaptive transform is used based on orthogonal basis functions that are learned from the observed data, instead of being selected from a predetermined library of bases. This is found to encode the signal characteristics, by introducing a feedback system between the bases and the observed data. Experiments on mixtures of speech and music signals demonstrate that these methods give good signal approximations and separation performance, and indicate promising directions for future research
A fast approach for overcomplete sparse decomposition based on smoothed L0 norm
In this paper, a fast algorithm for overcomplete sparse decomposition, called
SL0, is proposed. The algorithm is essentially a method for obtaining sparse
solutions of underdetermined systems of linear equations, and its applications
include underdetermined Sparse Component Analysis (SCA), atomic decomposition
on overcomplete dictionaries, compressed sensing, and decoding real field
codes. Contrary to previous methods, which usually solve this problem by
minimizing the L1 norm using Linear Programming (LP) techniques, our algorithm
tries to directly minimize the L0 norm. It is experimentally shown that the
proposed algorithm is about two to three orders of magnitude faster than the
state-of-the-art interior-point LP solvers, while providing the same (or
better) accuracy.Comment: Accepted in IEEE Transactions on Signal Processing. For MATLAB codes,
see (http://ee.sharif.ir/~SLzero). File replaced, because Fig. 5 was missing
erroneousl
On the stable recovery of the sparsest overcomplete representations in presence of noise
Let x be a signal to be sparsely decomposed over a redundant dictionary A,
i.e., a sparse coefficient vector s has to be found such that x=As. It is known
that this problem is inherently unstable against noise, and to overcome this
instability, the authors of [Stable Recovery; Donoho et.al., 2006] have
proposed to use an "approximate" decomposition, that is, a decomposition
satisfying ||x - A s|| < \delta, rather than satisfying the exact equality x =
As. Then, they have shown that if there is a decomposition with ||s||_0 <
(1+M^{-1})/2, where M denotes the coherence of the dictionary, this
decomposition would be stable against noise. On the other hand, it is known
that a sparse decomposition with ||s||_0 < spark(A)/2 is unique. In other
words, although a decomposition with ||s||_0 < spark(A)/2 is unique, its
stability against noise has been proved only for highly more restrictive
decompositions satisfying ||s||_0 < (1+M^{-1})/2, because usually (1+M^{-1})/2
<< spark(A)/2.
This limitation maybe had not been very important before, because ||s||_0 <
(1+M^{-1})/2 is also the bound which guaranties that the sparse decomposition
can be found via minimizing the L1 norm, a classic approach for sparse
decomposition. However, with the availability of new algorithms for sparse
decomposition, namely SL0 and Robust-SL0, it would be important to know whether
or not unique sparse decompositions with (1+M^{-1})/2 < ||s||_0 < spark(A)/2
are stable. In this paper, we show that such decompositions are indeed stable.
In other words, we extend the stability bound from ||s||_0 < (1+M^{-1})/2 to
the whole uniqueness range ||s||_0 < spark(A)/2. In summary, we show that "all
unique sparse decompositions are stably recoverable". Moreover, we see that
sparser decompositions are "more stable".Comment: Accepted in IEEE Trans on SP on 4 May 2010. (c) 2010 IEEE. Personal
use of this material is permitted. Permission from IEEE must be obtained for
all other users, including reprinting/republishing this material for
advertising or promotional purposes, creating new collective works for resale
or redistribution to servers or lists, or reuse of any copyrighted components
of this work in other work
Blind Source Separation with Compressively Sensed Linear Mixtures
This work studies the problem of simultaneously separating and reconstructing
signals from compressively sensed linear mixtures. We assume that all source
signals share a common sparse representation basis. The approach combines
classical Compressive Sensing (CS) theory with a linear mixing model. It allows
the mixtures to be sampled independently of each other. If samples are acquired
in the time domain, this means that the sensors need not be synchronized. Since
Blind Source Separation (BSS) from a linear mixture is only possible up to
permutation and scaling, factoring out these ambiguities leads to a
minimization problem on the so-called oblique manifold. We develop a geometric
conjugate subgradient method that scales to large systems for solving the
problem. Numerical results demonstrate the promising performance of the
proposed algorithm compared to several state of the art methods.Comment: 9 pages, 2 figure
Image Decomposition and Separation Using Sparse Representations: An Overview
This paper gives essential insights into the use of sparsity and morphological diversity in image decomposition and source separation by reviewing our recent work in this field. The idea to morphologically decompose a signal into its building blocks is an important problem in signal processing and has far-reaching applications in science and technology. Starck , proposed a novel decomposition method—morphological component analysis (MCA)—based on sparse representation of signals. MCA assumes that each (monochannel) signal is the linear mixture of several layers, the so-called morphological components, that are morphologically distinct, e.g., sines and bumps. The success of this method relies on two tenets: sparsity and morphological diversity. That is, each morphological component is sparsely represented in a specific transform domain, and the latter is highly inefficient in representing the other content in the mixture. Once such transforms are identified, MCA is an iterative thresholding algorithm that is capable of decoupling the signal content. Sparsity and morphological diversity have also been used as a novel and effective source of diversity for blind source separation (BSS), hence extending the MCA to multichannel data. Building on these ingredients, we will provide an overview the generalized MCA introduced by the authors in and as a fast and efficient BSS method. We will illustrate the application of these algorithms on several real examples. We conclude our tour by briefly describing our software toolboxes made available for download on the Internet for sparse signal and image decomposition and separation
Covariance-domain Dictionary Learning for Overcomplete EEG Source Identification
We propose an algorithm targeting the identification of more sources than
channels for electroencephalography (EEG). Our overcomplete source
identification algorithm, Cov-DL, leverages dictionary learning methods applied
in the covariance-domain. Assuming that EEG sources are uncorrelated within
moving time-windows and the scalp mixing is linear, the forward problem can be
transferred to the covariance domain which has higher dimensionality than the
original EEG channel domain. This allows for learning the overcomplete mixing
matrix that generates the scalp EEG even when there may be more sources than
sensors active at any time segment, i.e. when there are non-sparse sources.
This is contrary to straight-forward dictionary learning methods that are based
on the assumption of sparsity, which is not a satisfied condition in the case
of low-density EEG systems. We present two different learning strategies for
Cov-DL, determined by the size of the target mixing matrix. We demonstrate that
Cov-DL outperforms existing overcomplete ICA algorithms under various scenarios
of EEG simulations and real EEG experiments
Multi-modal dictionary learning for image separation with application in art investigation
In support of art investigation, we propose a new source separation method
that unmixes a single X-ray scan acquired from double-sided paintings. In this
problem, the X-ray signals to be separated have similar morphological
characteristics, which brings previous source separation methods to their
limits. Our solution is to use photographs taken from the front and back-side
of the panel to drive the separation process. The crux of our approach relies
on the coupling of the two imaging modalities (photographs and X-rays) using a
novel coupled dictionary learning framework able to capture both common and
disparate features across the modalities using parsimonious representations;
the common component models features shared by the multi-modal images, whereas
the innovation component captures modality-specific information. As such, our
model enables the formulation of appropriately regularized convex optimization
procedures that lead to the accurate separation of the X-rays. Our dictionary
learning framework can be tailored both to a single- and a multi-scale
framework, with the latter leading to a significant performance improvement.
Moreover, to improve further on the visual quality of the separated images, we
propose to train coupled dictionaries that ignore certain parts of the painting
corresponding to craquelure. Experimentation on synthetic and real data - taken
from digital acquisition of the Ghent Altarpiece (1432) - confirms the
superiority of our method against the state-of-the-art morphological component
analysis technique that uses either fixed or trained dictionaries to perform
image separation.Comment: submitted to IEEE Transactions on Images Processin
- …