Sparse BSS in the presence of outliers
Submitted to SPARS15. While real-world data are often grossly corrupted, most blind source separation (BSS) techniques give erroneous results in the presence of outliers. We propose a robust algorithm that jointly estimates the sparse sources and the outliers without requiring any prior knowledge of the outliers. More precisely, it uses an alternating weighted scheme to weaken the influence of the estimated outliers. A preliminary experiment demonstrates the advantage of the proposed algorithm over state-of-the-art BSS methods.
I. PROBLEM FORMULATION
Suppose we are given m noisy observations {Xi} i=1..m of unknown linear mixtures of n ≤ m sparse sources {Sj} j=1..n with t > m samples. It is generally assumed that these data are corrupted by Gaussian noise, accounting for instrumental or model imperfections. In many applications, however, some entries are additionally corrupted by outliers, leading to the following model: X = AS + O + N, with X the observations, A the mixing matrix, S the sources, O the outliers, and N the Gaussian noise. In the presence of outliers, the key difficulty lies in separating the components O and AS. To this end, assuming that the term AS is low-rank, some strategies [4] suggest pre-processing the data to estimate and remove the outliers with RPCA [3]. However, besides the fact that low-rankness is generally restrictive for most BSS problems, the source separation is severely hampered if the outliers are not well estimated. We therefore introduce a method that estimates the sources in the presence of outliers without pre-processing. To the best of our knowledge, this setting has only been studied in [5], using the β-divergence. Unlike [5], we propose to estimate the outliers and the sources jointly by exploiting their sparsity.
II. ALGORITHM
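The model X = AS + O + N and the alternating reweighting idea can be illustrated with a toy sketch (this is not the authors' exact algorithm; the threshold `lam` and the simple least-squares source update are illustrative assumptions):

```python
import numpy as np

def robust_bss(X, A, n_iter=50, lam=1.0):
    """Toy alternating scheme for X = A S + O + N with sparse outliers O:
    soft-threshold the residual to estimate O, then re-fit the sources S
    by least squares on the outlier-corrected data."""
    def soft(Z, t):
        return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

    O = np.zeros_like(X)
    pinv_A = np.linalg.pinv(A)
    for _ in range(n_iter):
        S = pinv_A @ (X - O)      # least-squares source update
        O = soft(X - A @ S, lam)  # sparse outlier update (soft-thresholding)
    return S, O

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))
S_true = rng.standard_normal((3, 100)) * (rng.random((3, 100)) < 0.2)  # sparse sources
O_true = np.zeros((8, 100))
O_true[2, 10] = 20.0                                                   # one gross outlier
X = A @ S_true + O_true + 0.01 * rng.standard_normal((8, 100))
S_est, O_est = robust_bss(X, A)
# O_est should concentrate its energy on the corrupted entry (2, 10)
```

Because the Gaussian noise stays well below `lam`, the estimated outlier matrix remains sparse, which is the behavior the joint estimation relies on.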
Direction of Arrival with One Microphone, a few LEGOs, and Non-Negative Matrix Factorization
Conventional approaches to sound source localization require at least two
microphones. It is known, however, that people with unilateral hearing loss can
also localize sounds. Monaural localization is possible thanks to the
scattering by the head, though it hinges on learning the spectra of the various
sources. We take inspiration from this human ability to propose algorithms for
accurate sound source localization using a single microphone embedded in an
arbitrary scattering structure. The structure modifies the frequency response
of the microphone in a direction-dependent way, giving each direction a
signature. While knowing those signatures is sufficient to localize sources of
white noise, localizing speech is much more challenging: it is an ill-posed
inverse problem which we regularize by prior knowledge in the form of learned
non-negative dictionaries. We demonstrate a monaural speech localization
algorithm based on non-negative matrix factorization that does not depend on
sophisticated, designed scatterers. In fact, we show experimental results with
ad hoc scatterers made of LEGO bricks. Even with these rudimentary structures
we can accurately localize arbitrary speakers; that is, we do not need to learn
the dictionary for the particular speaker to be localized. Finally, we discuss
multi-source localization and the related limitations of our approach.
Comment: This article has been accepted for publication in IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP).
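As a concrete reference for the learned non-negative dictionaries mentioned above, here is a minimal NMF via Lee-Seung multiplicative updates (generic textbook updates under a Euclidean cost, not the paper's specific training pipeline; the toy spectrogram is an assumption):

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9):
    """Factor V ~ W H with W, H >= 0 via multiplicative updates.
    W plays the role of a learned spectral dictionary."""
    rng = np.random.default_rng(0)
    F, T = V.shape
    W = rng.random((F, rank)) + eps
    H = rng.random((rank, T)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)  # update activations
        W *= (V @ H.T) / (W @ H @ H.T + eps)  # update dictionary atoms
    return W, H

V = np.abs(np.random.default_rng(1).standard_normal((64, 40)))  # toy magnitude spectrogram
W, H = nmf(V, rank=5)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

The multiplicative form guarantees non-negativity of both factors as long as the initialization is non-negative.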
Shift-Invariant Kernel Additive Modelling for Audio Source Separation
A major goal in blind source separation is to model the inherent
characteristics of the sources in order to identify and separate them. While
most state-of-the-art approaches
are supervised methods trained on large datasets, interest in non-data-driven
approaches such as Kernel Additive Modelling (KAM) remains high due to their
interpretability and adaptability. KAM performs the separation of a given
source applying robust statistics on the time-frequency bins selected by a
source-specific kernel function, commonly the K-NN function. This choice
assumes that the source of interest repeats in both time and frequency. In
practice, this assumption does not always hold. Therefore, we introduce a
shift-invariant kernel function capable of identifying similar spectral content
even under frequency shifts. This way, we can considerably increase the amount
of suitable sound material available to the robust statistics. While this leads
to an increase in separation performance, a basic formulation is
computationally expensive. We therefore additionally present acceleration
techniques that lower the overall computational complexity.
Comment: Feedback is welcome.
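The shift-invariant idea, scoring two spectra by their best correlation over a range of frequency shifts, can be sketched as follows (a simplified stand-in for the paper's kernel function; the Gaussian test spectra and shift range are illustrative):

```python
import numpy as np

def shift_invariant_similarity(a, b, max_shift=8):
    """Best normalized correlation between spectra a and b over all
    frequency shifts up to max_shift bins."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    best = -np.inf
    for s in range(-max_shift, max_shift + 1):
        best = max(best, float(a @ np.roll(b, s)) / len(a))
    return best

f = np.arange(128)
spec = np.exp(-0.5 * ((f - 40) / 3.0) ** 2)  # spectral peak at bin 40
spec_shifted = np.roll(spec, 5)              # same content, shifted by 5 bins
plain = float(((spec - spec.mean()) @ (spec_shifted - spec_shifted.mean()))
              / (spec.std() * spec_shifted.std() * len(spec)))
si = shift_invariant_similarity(spec, spec_shifted)
```

A plain correlation penalizes the frequency shift, while the shift-invariant score recognizes the two spectra as essentially identical, which is what lets the kernel gather more sound material for the robust statistics.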
A Semi-Blind Source Separation Method for Differential Optical Absorption Spectroscopy of Atmospheric Gas Mixtures
Differential optical absorption spectroscopy (DOAS) is a powerful tool for
detecting and quantifying trace gases in atmospheric chemistry
[Platt and Stutz, 2008]. DOAS spectra consist of a linear combination of complex
multi-peak multi-scale structures. Most DOAS analysis routines in use today are
based on least squares techniques, for example, the approach developed in the
1970s uses polynomial fits to remove a slowly varying background, and known
reference spectra to retrieve the identity and concentrations of reference
gases. An open problem is to identify unknown gases in the fitting residuals
for complex atmospheric mixtures.
In this work, we develop a novel three-step semi-blind source separation
method. The first step uses a multi-resolution analysis to remove the
slowly varying and fast-varying components of the DOAS spectral data matrix X.
The second step decomposes the preprocessed data into a linear combination of
the reference spectra plus a remainder, X = A S + R, where the columns of the
matrix A are known reference spectra, and the matrix S contains the unknown
non-negative coefficients that are proportional to concentration. This step is
realized by solving a convex minimization problem over S, in which the data
misfit is measured by a hybrid l1/l2 norm (Huber estimator) that helps to
maintain the non-negativity of S. The third step performs a blind independent
component analysis of the remainder matrix R to extract remnant gas
components. We first illustrate the proposed method by processing a set of
DOAS experimental data, with a satisfactory blind extraction of an a priori
unknown trace gas (ozone) from the remainder matrix. Numerical results also
show that the method can identify multiple trace gases from the residuals.
Comment: submitted to Journal of Scientific Computing.
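The robust non-negative fit in the second step can be sketched with projected gradient descent on a Huber loss (the solver, symbols, and data are illustrative assumptions, not the authors' exact scheme):

```python
import numpy as np

def huber_nnls(X, A, delta=0.5, n_iter=500):
    """Fit X ~ A S with S >= 0 under a Huber loss via projected gradient
    descent: the loss is quadratic for small residuals and linear (robust)
    for large ones, limiting the influence of outlying entries."""
    S = np.zeros((A.shape[1], X.shape[1]))
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    for _ in range(n_iter):
        R = A @ S - X
        G = np.where(np.abs(R) <= delta, R, delta * np.sign(R))  # Huber gradient
        S = np.maximum(S - step * (A.T @ G), 0.0)                # step + projection
    return S

rng = np.random.default_rng(0)
A = rng.random((50, 2))            # two known "reference spectra"
S_true = np.array([[1.5], [0.5]])  # non-negative concentrations
X = A @ S_true
X[3, 0] += 10.0                    # one grossly corrupted sample
S_est = huber_nnls(X, A)
```

Despite the corrupted sample, the bounded Huber gradient keeps the recovered concentrations close to the true values, whereas a plain least-squares fit would be pulled substantially toward the outlier.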
Joint Tensor Factorization and Outlying Slab Suppression with Applications
We consider factoring low-rank tensors in the presence of outlying slabs.
This problem is important in practice, because data collected in many
real-world applications, such as speech, fluorescence, and some social network
data, fit this paradigm. Prior work tackles this problem by iteratively
selecting a fixed number of slabs and fitting, a procedure which may not
converge. We formulate this problem from a group-sparsity promoting point of
view, and propose an alternating optimization framework to handle the
corresponding l_p (0 &lt; p &lt;= 1) minimization-based low-rank tensor
factorization problem. The proposed algorithm features a similar per-iteration
complexity as the plain trilinear alternating least squares (TALS) algorithm.
Convergence of the proposed algorithm is also easy to analyze under the
framework of alternating optimization and its variants. In addition,
regularization and constraints can be easily incorporated to make use of
\emph{a priori} information on the latent loading factors. Simulations and real
data experiments on blind speech separation, fluorescence data analysis, and
social network mining are used to showcase the effectiveness of the proposed
algorithm.
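In a simplified two-way (matrix) setting, the group-sparsity idea reduces to iteratively reweighted least squares over slab residual norms; the following sketch conveys only that reduction (the paper's actual algorithm factors three-way tensors with TALS-like updates, and the weights and constants here are illustrative):

```python
import numpy as np

def robust_factor(X, rank, p=1.0, n_iter=30, eps=1e-6, seed=0):
    """Fit X ~ B C while suppressing outlying columns ("slabs") of X.
    The IRLS weights w_k = (||x_k - B c_k||^2 + eps)^(p/2 - 1) are the
    standard surrogate for an l_{2,p} (0 < p <= 1) column-group penalty."""
    B = np.random.default_rng(seed).standard_normal((X.shape[0], rank))
    w = np.ones(X.shape[1])
    for _ in range(n_iter):
        C = np.linalg.lstsq(B, X, rcond=None)[0]           # per-column least squares
        res = np.linalg.norm(X - B @ C, axis=0) ** 2
        w = (res + eps) ** (p / 2 - 1)                     # small weight <=> outlying slab
        B = (X * w) @ C.T @ np.linalg.pinv((C * w) @ C.T)  # weighted LS update of B
    return B, C, w

rng = np.random.default_rng(1)
X = rng.standard_normal((10, 2)) @ rng.standard_normal((2, 50))  # clean rank-2 data
bad = [5, 17]
X[:, bad] += 5.0 * rng.standard_normal((10, 2))                  # two outlying slabs
B, C, w = robust_factor(X, rank=2)
good = [k for k in range(50) if k not in bad]
```

As the clean slabs are fit ever more tightly, their weights grow while the corrupted slabs are driven toward zero influence on the factor update.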
Acoustic Space Learning for Sound Source Separation and Localization on Binaural Manifolds
In this paper we address the problems of modeling the acoustic space
generated by a full-spectrum sound source and of using the learned model for
the localization and separation of multiple sources that simultaneously emit
sparse-spectrum sounds. We lay theoretical and methodological grounds in order
to introduce the binaural manifold paradigm. We perform an in-depth study of
the latent low-dimensional structure of the high-dimensional interaural
spectral data, based on a corpus recorded with a human-like audiomotor robot
head. A non-linear dimensionality reduction technique is used to show that
these data lie on a two-dimensional (2D) smooth manifold parameterized by the
motor states of the listener, or equivalently, the sound source directions. We
propose a probabilistic piecewise affine mapping model (PPAM) specifically
designed to deal with high-dimensional data exhibiting an intrinsic piecewise
linear structure. We derive a closed-form expectation-maximization (EM)
procedure for estimating the model parameters, followed by Bayes inversion for
obtaining the full posterior density function of a sound source direction. We
extend this solution to deal with missing data and redundancy in real world
spectrograms, and hence for 2D localization of natural sound sources such as
speech. We further generalize the model to the challenging case of multiple
sound sources and we propose a variational EM framework. The associated
algorithm, referred to as variational EM for source separation and localization
(VESSL) yields a Bayesian estimation of the 2D locations and time-frequency
masks of all the sources. Comparisons of the proposed approach with several
existing methods reveal that the combination of acoustic-space learning with
Bayesian inference enables our method to outperform state-of-the-art methods.
Comment: 19 pages, 9 figures, 3 tables.
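The piecewise affine intuition behind PPAM can be conveyed by a hard-assignment toy (PPAM itself fits a probabilistic model with a closed-form EM; the number of pieces, the data, and the alternating scheme here are illustrative):

```python
import numpy as np

def fit_piecewise_affine(X, Y, K=2, n_iter=20, seed=0):
    """Alternate between assigning each sample to its best local affine
    model y ~ A_k x + b_k and refitting the K models by least squares."""
    n, d = X.shape
    z = np.random.default_rng(seed).integers(0, K, n)  # random initial labels
    Xh = np.hstack([X, np.ones((n, 1))])               # homogeneous coordinates
    models = np.zeros((K, d + 1, Y.shape[1]))
    for _ in range(n_iter):
        for k in range(K):
            idx = np.flatnonzero(z == k)
            if len(idx) > d:
                models[k] = np.linalg.lstsq(Xh[idx], Y[idx], rcond=None)[0]
        errs = np.stack([np.linalg.norm(Xh @ models[k] - Y, axis=1)
                         for k in range(K)], axis=1)
        z = errs.argmin(axis=1)                        # reassign samples
    return models, z

rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, (200, 1))
Y = np.abs(X)                      # a map with two affine pieces
models, z = fit_piecewise_affine(X, Y, K=2)
```

Replacing the hard assignments with posterior responsibilities is what gives PPAM its closed-form EM and a full posterior over source directions.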
Dictionary Learning of Convolved Signals
Assuming that a set of source signals is sparsely representable in a given dictionary, we show how their sparse recovery fails whenever we can only measure a convolved observation of them. Starting from this motivation, we develop a block coordinate descent method which aims to learn a convolved dictionary and provide a sparse representation of the observed signals with small residual norm. We compare the proposed approach to the K-SVD dictionary learning algorithm and show through numerical experiments on synthetic signals that, under some conditions on the problem data, our technique converges in a fixed number of iterations to a sparse representation with smaller residual norm.
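A generic block coordinate descent backbone for such dictionary learning, with ISTA for the sparse codes and least squares plus renormalization for the dictionary, might look as follows (this omits the convolution structure that is the paper's actual contribution; the penalty `lam` and problem sizes are illustrative):

```python
import numpy as np

def dict_learn(Y, n_atoms, lam=0.1, n_outer=30, n_inner=50, seed=0):
    """Block coordinate descent for Y ~ D X with sparse codes X."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((Y.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    X = np.zeros((n_atoms, Y.shape[1]))
    for _ in range(n_outer):
        L = np.linalg.norm(D, 2) ** 2             # Lipschitz constant of the gradient
        for _ in range(n_inner):                  # ISTA sparse-coding block
            Z = X - D.T @ (D @ X - Y) / L
            X = np.sign(Z) * np.maximum(np.abs(Z) - lam / L, 0.0)
        Dn = Y @ np.linalg.pinv(X)                # dictionary block (least squares)
        norms = np.linalg.norm(Dn, axis=0) + 1e-12
        D, X = Dn / norms, X * norms[:, None]     # renormalize atoms, rescale codes
    return D, X

rng = np.random.default_rng(2)
D0 = rng.standard_normal((20, 8))
D0 /= np.linalg.norm(D0, axis=0)
X0 = rng.standard_normal((8, 100)) * (rng.random((8, 100)) < 0.15)  # sparse codes
Y = D0 @ X0
D, X = dict_learn(Y, n_atoms=8)
rel = np.linalg.norm(Y - D @ X) / np.linalg.norm(Y)
```

Rescaling the codes after normalizing the atoms keeps the product D X unchanged, so the two blocks can be alternated without undoing each other's progress.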