Structured Sparsity Models for Multiparty Speech Recovery from Reverberant Recordings
We tackle the multi-party speech recovery problem by modeling the
acoustics of the reverberant chambers. Our approach exploits structured sparsity
models to perform room modeling and speech recovery. We propose a scheme for
characterizing the room acoustics from the unknown competing speech sources,
relying on localization of the early images of the speakers by sparse
approximation of the spatial spectra of the virtual sources in a free-space
model. The images are then clustered exploiting the low-rank structure of the
spectro-temporal components belonging to each source. This enables us to
identify the early support of the room impulse response function and its unique
map to the room geometry. To further tackle the ambiguity of the reflection
ratios, we propose a novel formulation of the reverberation model and estimate
the absorption coefficients through a convex optimization exploiting a joint
sparsity model formulated upon the spatio-spectral sparsity of the concurrent speech
representation. The acoustic parameters are then incorporated for separating
individual speech signals through either structured sparse recovery or inverse
filtering of the acoustic channels. The experiments conducted on real data
recordings demonstrate the effectiveness of the proposed approach for
multi-party speech recovery and recognition. Comment: 31 page
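The "early images of the speakers" mentioned in the abstract above can be illustrated with the classical image-source idea: each wall reflection is modeled as a virtual source radiating in free space. The following is a minimal sketch, not the authors' method. It assumes a rectangular (shoebox) room with one corner at the origin, a sound speed of 343 m/s, and only first-order reflections; the function names `first_order_images` and `arrival_delays` are hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, an assumed value for air at room temperature


def first_order_images(source, room_dims):
    """Mirror a source across the six walls of a shoebox room.

    The room is assumed to span [0, Lx] x [0, Ly] x [0, Lz].
    Returns an array of shape (6, 3) with one virtual source per wall.
    """
    source = np.asarray(source, dtype=float)
    images = []
    for axis, length in enumerate(room_dims):
        for wall in (0.0, float(length)):
            image = source.copy()
            # Reflection of a coordinate about the plane x_axis = wall.
            image[axis] = 2.0 * wall - source[axis]
            images.append(image)
    return np.array(images)


def arrival_delays(points, mic):
    """Free-space propagation delay (seconds) from each point to the microphone."""
    distances = np.linalg.norm(points - np.asarray(mic, dtype=float), axis=1)
    return distances / SPEED_OF_SOUND


room = (4.0, 5.0, 3.0)   # hypothetical room dimensions in metres
src = (1.0, 2.0, 1.5)    # hypothetical speaker position
mic = (3.0, 3.0, 1.5)    # hypothetical microphone position

images = first_order_images(src, room)
delays = arrival_delays(images, mic)
direct = arrival_delays(np.array([src]), mic)[0]
```

The delays of the virtual sources mark where the early taps of the room impulse response would sit in this simplified model; every reflected path is longer than the direct path when the microphone is inside the room.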
Blind separation of underdetermined mixtures with additive white and pink noises
This paper presents an approach for underdetermined blind source separation in the presence of additive Gaussian white noise and pink noise. The approach is applicable to separating I + 3 sources from I mixtures corrupted by both kinds of additive noise, a situation that is more challenging and closer to practical real-world problems. Moreover, unlike some conventional approaches, it imposes no sparsity conditions. First, the mixing matrix is estimated by an algorithm that combines the short-time Fourier transform with rough-fuzzy clustering. Then the mixed signals are normalized and the source signals are recovered using a modified Gradient descent Local Hierarchical Alternating Least Squares algorithm, which takes the mixing matrix obtained in the previous step as input and is initialized by a multiplicative algorithm for matrix factorization based on the alpha divergence. The experiments and simulation results show that the proposed approach can separate I + 3 source signals from I mixed signals and that it achieves better evaluation performance than some conventional approaches.
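To make the second stage more concrete, here is a minimal sketch of a plain hierarchical alternating least squares (HALS) update for nonnegative sources when the mixing matrix is held fixed. It is the textbook HALS row update, not the modified gradient-descent variant the abstract describes, and it uses random initialization instead of the alpha-divergence multiplicative initialization. The name `hals_sources` and all demo values are assumptions.

```python
import numpy as np


def hals_sources(X, A, n_iter=50, seed=0):
    """Estimate nonnegative S such that X is close to A @ S, with A fixed.

    X: (n_mixtures, n_samples) observations.
    A: (n_mixtures, n_sources) mixing matrix, assumed already estimated.
    Returns the estimate S and the residual norm after each sweep.
    """
    rng = np.random.default_rng(seed)
    n_sources = A.shape[1]
    S = rng.random((n_sources, X.shape[1]))
    col_norms = np.sum(A * A, axis=0)  # squared norm of each column of A
    errors = []
    for _ in range(n_iter):
        for j in range(n_sources):
            residual = X - A @ S
            # Exact nonnegative least-squares update of row j, other rows fixed.
            S[j] = np.maximum(0.0, S[j] + A[:, j] @ residual / col_norms[j])
        errors.append(np.linalg.norm(X - A @ S))
    return S, errors


# Hypothetical demo: I = 2 mixtures, I + 3 = 5 nonnegative sources.
rng = np.random.default_rng(1)
n_mix, n_src, n_samples = 2, 5, 200
A = rng.random((n_mix, n_src)) + 0.1
S_true = rng.random((n_src, n_samples)) * (rng.random((n_src, n_samples)) < 0.3)
X = A @ S_true
S_est, errors = hals_sources(X, A)
```

Each row update is an exact coordinate minimization, so the residual norm never increases from sweep to sweep. Because the problem is underdetermined, a small residual alone does not imply that the true sources were recovered.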
Exploitation of source nonstationarity in underdetermined blind source separation with advanced clustering techniques
The problem of blind source separation (BSS) is
investigated. Following the assumption that the time-frequency
(TF) distributions of the input sources do not overlap, quadratic
TF representation is used to exploit the sparsity of the statistically
nonstationary sources. However, separation performance is shown
to be limited by the selection of a certain threshold in classifying
the eigenvectors of the TF matrices drawn from the observation
mixtures. Two methods are, therefore, proposed based on recently
introduced advanced clustering techniques, namely Gap statistics
and self-splitting competitive learning (SSCL), to mitigate the
problem of eigenvector classification. The novel integration of
these two approaches successfully overcomes the problem of artificial
sources induced by insufficient knowledge of the threshold and
enables automatic determination of the number of active sources
over the observation. The separation performance is thereby
greatly improved. Practical consequences of violating the TF orthogonality
assumption in the current approach are also studied,
which motivates the proposal of a new solution robust to violation
of orthogonality. In this new method, the TF plane is partitioned
into appropriate blocks and source separation is thereby carried
out in a block-by-block manner. Numerical experiments with
linear chirp signals and Gaussian minimum shift keying (GMSK)
signals are included which support the improved performance of
the proposed approaches.
Image Decomposition and Separation Using Sparse Representations: An Overview
This paper gives essential insights into the use of sparsity and morphological diversity in image decomposition and source separation by reviewing our recent work in this field. Morphologically decomposing a signal into its building blocks is an important problem in signal processing and has far-reaching applications in science and technology. Starck et al. proposed a novel decomposition method, morphological component analysis (MCA), based on sparse representation of signals. MCA assumes that each (monochannel) signal is the linear mixture of several layers, the so-called morphological components, that are morphologically distinct, e.g., sines and bumps. The success of this method relies on two tenets: sparsity and morphological diversity. That is, each morphological component is sparsely represented in a specific transform domain, and that transform is highly inefficient in representing the other content in the mixture. Once such transforms are identified, MCA is an iterative thresholding algorithm that is capable of decoupling the signal content. Sparsity and morphological diversity have also been used as a novel and effective source of diversity for blind source separation (BSS), hence extending MCA to multichannel data. Building on these ingredients, we provide an overview of the generalized MCA introduced by the authors as a fast and efficient BSS method. We illustrate the application of these algorithms on several real examples. We conclude our tour by briefly describing our software toolboxes, made available for download on the Internet, for sparse signal and image decomposition and separation.
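The "sines and bumps" example in the abstract above lends itself to a small illustration of iterative thresholding. The sketch below assumes a 1-D signal, an orthonormal DCT as the transform for the oscillatory layer, the identity (Dirac) basis for the bump layer, hard thresholding, and a linearly decreasing threshold. These choices and the names `dct_matrix`, `hard_threshold`, and `mca_two_layers` are assumptions for illustration, not the toolbox implementation.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix; rows are the cosine atoms."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    C[0] /= np.sqrt(2.0)
    return C


def hard_threshold(a, lam):
    """Keep coefficients whose magnitude exceeds lam, zero the rest."""
    return np.where(np.abs(a) > lam, a, 0.0)


def mca_two_layers(y, n_iter=100, lam_min=0.0):
    """Split y into a DCT-sparse layer and a spike-sparse layer."""
    n = y.size
    C = dct_matrix(n)
    smooth = np.zeros(n)
    spikes = np.zeros(n)
    lam_max = max(np.abs(C @ y).max(), np.abs(y).max())
    for lam in np.linspace(lam_max, lam_min, n_iter):
        # Update each layer from the residual left by the other layer.
        smooth = C.T @ hard_threshold(C @ (y - spikes), lam)
        spikes = hard_threshold(y - smooth, lam)
    return smooth, spikes


# Hypothetical test signal: one cosine atom plus three isolated spikes.
n = 256
oscillation = 10.0 * dct_matrix(n)[10]
bumps = np.zeros(n)
bumps[[50, 120, 200]] = [3.0, 3.0, -2.5]
y = oscillation + bumps

smooth_est, spikes_est = mca_two_layers(y)
```

The separation works here because a cosine atom needs many Dirac coefficients and a spike needs many DCT coefficients, which is the morphological diversity the abstract refers to. Starting with a high threshold lets each layer claim only its most salient coefficients first.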
Weakly Supervised Audio Source Separation via Spectrum Energy Preserved Wasserstein Learning
Separating audio mixtures into individual instrument tracks has been a
long-standing challenge. We introduce a novel weakly supervised audio source
separation approach based on deep adversarial learning. Specifically, our loss
function adopts the Wasserstein distance which directly measures the
distribution distance between the separated sources and the real sources for
each individual source. Moreover, a global regularization term is added to
enforce the spectrum energy preservation property regardless of the separation. Unlike
state-of-the-art weakly supervised models, which often involve deliberately
devised constraints or careful model selection, our approach needs little prior
model specification on the data and can be learned straightforwardly in an
end-to-end fashion. We show that the proposed method performs competitively on
a public benchmark against state-of-the-art weakly supervised methods.
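As a small aid to intuition about the distance named in the abstract above, the Wasserstein-1 distance between two one-dimensional empirical distributions with equally many samples has a closed form: sort both samples and average the absolute differences. The sketch below shows only that special case. The paper's adversarial training, where a learned critic estimates the distance in high dimensions, is not reproduced here, and the name `wasserstein_1d` is hypothetical.

```python
import numpy as np


def wasserstein_1d(x, y):
    """Wasserstein-1 distance between two equal-size 1-D samples.

    In one dimension the optimal transport plan matches sorted samples,
    so the distance is the mean absolute difference of the order statistics.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.size != y.size:
        raise ValueError("this sketch assumes equal sample counts")
    return float(np.mean(np.abs(x - y)))


shifted = wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])   # every sample moves by 1
permuted = wasserstein_1d([2.0, 0.0, 1.0], [0.0, 1.0, 2.0])  # same distribution
```

Unlike a pointwise loss, the distance depends only on the two distributions, not on any pairing between individual samples, which is why reordering the samples leaves it at zero.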