409 research outputs found
Differential fast fixed-point algorithms for underdetermined instantaneous and convolutive partial blind source separation
This paper concerns underdetermined linear instantaneous and convolutive
blind source separation (BSS), i.e., the case when the number of observed mixed
signals is lower than the number of sources. We propose partial BSS methods,
which separate supposedly nonstationary sources of interest (while keeping
residual components for the other, supposedly stationary, "noise" sources).
These methods are based on the general differential BSS concept that we
introduced before. In the instantaneous case, the approach proposed in this
paper consists of a differential extension of the FastICA method (which does
not apply to underdetermined mixtures). In the convolutive case, we extend our
recent time-domain fast fixed-point C-FICA algorithm to underdetermined
mixtures. Both proposed approaches thus keep the attractive features of the
FastICA and C-FICA methods. Our approaches are based on differential sphering
processes, followed by the optimization of the differential nonnormalized
kurtosis that we introduce in this paper. Experimental tests show that these
differential algorithms are much more robust to noise sources than the standard
FastICA and C-FICA algorithms.
Comment: this paper describes our differential FastICA-like algorithms for
linear instantaneous and convolutive underdetermined mixture
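The abstract above names a "differential nonnormalized kurtosis" without defining it. The sketch below is one plausible reading, not the authors' definition: the nonnormalized kurtosis (fourth-order cumulant of a zero-mean signal) is computed over two time windows and the two values are subtracted. Because fourth-order cumulants of independent signals add, the contribution of a stationary source cancels in the difference, which is why such a criterion can ignore stationary "noise" sources. The function names, window choice, and test signals are assumptions made for illustration.

```python
import numpy as np

def nonnormalized_kurtosis(y):
    """Fourth-order cumulant E[y^4] - 3 E[y^2]^2 of the centered signal."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    return np.mean(y**4) - 3.0 * np.mean(y**2) ** 2

def differential_kurtosis(y, window_a, window_b):
    """Assumed definition: kurtosis over window_b minus kurtosis over window_a."""
    return nonnormalized_kurtosis(y[window_b]) - nonnormalized_kurtosis(y[window_a])

rng = np.random.default_rng(0)
n = 200_000
win_a, win_b = slice(0, n), slice(n, 2 * n)

# Stationary, non-Gaussian "noise" source: same distribution in both windows.
noise = rng.uniform(-1.0, 1.0, size=2 * n)
# Nonstationary source of interest: its scale changes between the windows.
source = np.concatenate([rng.laplace(scale=0.2, size=n),
                         rng.laplace(scale=1.0, size=n)])

dk_noise = differential_kurtosis(noise, win_a, win_b)
dk_source = differential_kurtosis(source, win_a, win_b)
dk_mix = differential_kurtosis(source + noise, win_a, win_b)

print(f"noise alone:    {dk_noise:+.3f}")
print(f"source alone:   {dk_source:+.3f}")
print(f"source + noise: {dk_mix:+.3f}")
```

In this toy setting the value for the noise alone is close to zero, and the value for the sum stays close to that of the nonstationary source alone, up to sampling error.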
Underdetermined blind source separation based on Fuzzy C-Means and Semi-Nonnegative Matrix Factorization
Conventional blind source separation assumes the over-determined case, with more sensors than sources, but the underdetermined case is more challenging and closer to practical situations. Non-negative Matrix Factorization (NMF) has been widely applied to Blind Source Separation (BSS) problems. However, the separation results are sensitive to the initialization of the NMF parameters. To avoid subjective parameter choices, we use the Fuzzy C-Means (FCM) clustering technique to estimate the mixing matrix and to relax the sparsity requirement. We also relax the constraints by using Semi-NMF. We propose a new two-step algorithm for underdetermined blind source separation, and we show how to combine the FCM clustering technique with gradient-based NMF and the multi-layer technique. The simulation results show that the proposed algorithm separates the source signals with a high signal-to-noise ratio and quite low computation time compared with some algorithms
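The first step described above, estimating the mixing matrix by Fuzzy C-Means clustering, can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes two mixtures of three sources, strictly one active source per sample, a hand-written FCM loop with fuzzifier m = 2, and a deterministic initialization of the cluster centers at evenly spaced directions. The Semi-NMF recovery step is not shown. All names and parameter values are hypothetical.

```python
import numpy as np

def fuzzy_c_means(points, centers, m=2.0, n_iter=100, eps=1e-12):
    """Plain FCM: alternate membership and center updates for n_iter rounds."""
    for _ in range(n_iter):
        # Squared distances between every point and every center: (N, c).
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + eps
        inv = d2 ** (-1.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)      # memberships, rows sum to 1
        w = u ** m
        centers = (w.T @ points) / w.sum(axis=0)[:, None]
    return centers, u

rng = np.random.default_rng(1)
angles = np.deg2rad([20.0, 80.0, 140.0])
A = np.vstack([np.cos(angles), np.sin(angles)])       # true 2x3 mixing matrix

n = 3000
active = rng.integers(0, 3, size=n)                   # one active source per sample
S = np.zeros((3, n))
S[active, np.arange(n)] = rng.laplace(size=n)
X = A @ S + 0.01 * rng.standard_normal((2, n))        # noisy mixtures

# Keep energetic samples, project them onto the upper unit half-circle.
norms = np.linalg.norm(X, axis=0)
keep = norms > 0.2
P = (X[:, keep] / norms[keep]).T
P[P[:, 1] < 0] *= -1.0

# Deterministic initial centers at evenly spaced directions (an assumption).
init_angles = (np.arange(3) + 0.5) * np.pi / 3
centers0 = np.column_stack([np.cos(init_angles), np.sin(init_angles)])

centers, memberships = fuzzy_c_means(P, centers0)
A_hat = centers.T / np.linalg.norm(centers, axis=1)   # unit-norm column estimates

# Cosine between each true column and its best-matching estimate.
print(np.abs(A.T @ A_hat).max(axis=1))
```

The estimated columns are recovered only up to order and sign, which is the usual indeterminacy of blind source separation.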
Blind separation of underdetermined mixtures with additive white and pink noises
This paper presents an approach for underdetermined blind source separation in the presence of additive Gaussian white noise and pink noise. The proposed approach also applies to separating I + 3 sources from I mixtures corrupted by both kinds of additive noise. This situation is more challenging and closer to practical real-world problems. Moreover, unlike some conventional approaches, no sparsity conditions are imposed. First, the mixing matrix is estimated by an algorithm that combines the short-time Fourier transform and rough-fuzzy clustering. Then, the mixed signals are normalized and the source signals are recovered using a modified Gradient descent Local Hierarchical Alternating Least Squares algorithm, which takes the mixing matrix obtained in the previous step as an input and is initialized by a multiplicative algorithm for matrix factorization based on alpha divergence. The experiments and simulation results show that the proposed approach can separate I + 3 source signals from I mixed signals, and that it achieves better evaluation performance than some conventional approaches
Audio Source Separation Using Sparse Representations
This is the author's final version of the article, first published as A. Nesbit, M. G. Jafari, E. Vincent and M. D. Plumbley. Audio Source Separation Using Sparse Representations. In W. Wang (Ed), Machine Audition: Principles, Algorithms and Systems. Chapter 10, pp. 246-264. IGI Global, 2011. ISBN 978-1-61520-919-4. DOI: 10.4018/978-1-61520-919-4.ch010
The authors address the problem of audio source separation, namely, the recovery of audio signals from recordings of mixtures of those signals. The sparse component analysis framework is a powerful method for achieving this. Sparse orthogonal transforms, in which only few transform coefficients differ significantly from zero, are developed; once the signal has been transformed, energy is apportioned from each transform coefficient to each estimated source, and, finally, the signal is reconstructed using the inverse transform. The overriding aim of this chapter is to demonstrate how this framework, as exemplified here by two different decomposition methods which adapt to the signal to represent it sparsely, can be used to solve different problems in different mixing scenarios. To address the instantaneous (neither delays nor echoes) and underdetermined (more sources than mixtures) mixing model, a lapped orthogonal transform is adapted to the signal by selecting a basis from a library of predetermined bases. This method is highly related to the windowing methods used in the MPEG audio coding framework. In considering the anechoic (delays but no echoes) and determined (equal number of sources and mixtures) mixing case, a greedy adaptive transform is used based on orthogonal basis functions that are learned from the observed data, instead of being selected from a predetermined library of bases. This is found to encode the signal characteristics, by introducing a feedback system between the bases and the observed data. Experiments on mixtures of speech and music signals demonstrate that these methods give good signal approximations and separation performance, and indicate promising directions for future research
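The transform, apportion, and inverse-transform pipeline described above can be sketched in a few lines. This is a simplified stand-in, not the chapter's method: a fixed orthonormal DCT-II replaces the adaptive lapped orthogonal transform, the mixing matrix is assumed known, and each coefficient is assigned entirely to the single source whose mixing direction matches it best (binary masking). The function names and the synthetic sources, built to have disjoint supports in the DCT domain, are assumptions for illustration.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that C @ C.T is the identity."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    C[0] /= np.sqrt(2.0)
    return C

def separate_by_masking(X, A, C):
    """Assign each transform coefficient to one source, then invert the transform."""
    coeffs = X @ C.T                                  # transform every channel
    col_norms = np.linalg.norm(A, axis=0)
    A_unit = A / col_norms
    score = np.abs(A_unit.T @ coeffs)                 # match to each mixing direction
    owner = score.argmax(axis=0)                      # winning source per coefficient
    idx = np.arange(coeffs.shape[1])
    S_coeffs = np.zeros((A.shape[1], coeffs.shape[1]))
    # Project the coefficient vector onto the winning mixing direction.
    S_coeffs[owner, idx] = (A_unit[:, owner] * coeffs).sum(axis=0) / col_norms[owner]
    return S_coeffs @ C                               # inverse orthonormal transform

rng = np.random.default_rng(2)
n = 256
C = dct_matrix(n)

angles = np.deg2rad([20.0, 80.0, 140.0])
A = np.vstack([np.cos(angles), np.sin(angles)])       # 2 mixtures, 3 sources

# Three sources that are sparse in the DCT domain, with disjoint supports.
support = rng.permutation(n)[:30].reshape(3, 10)
S_true_coeffs = np.zeros((3, n))
for j in range(3):
    S_true_coeffs[j, support[j]] = rng.standard_normal(10)
S_true = S_true_coeffs @ C

X = A @ S_true                                        # instantaneous mixtures
S_hat = separate_by_masking(X, A, C)
print(np.max(np.abs(S_hat - S_true)))
```

With exactly disjoint supports the recovery is exact up to rounding error; with real audio the supports overlap, which is one reason a signal-adapted transform that concentrates energy in few coefficients matters.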
Structured Sparsity Models for Multiparty Speech Recovery from Reverberant Recordings
We tackle the multi-party speech recovery problem through modeling the
acoustics of reverberant chambers. Our approach exploits structured sparsity
models to perform room modeling and speech recovery. We propose a scheme for
characterizing the room acoustics from the unknown competing speech sources
relying on localization of the early images of the speakers by sparse
approximation of the spatial spectra of the virtual sources in a free-space
model. The images are then clustered exploiting the low-rank structure of the
spectro-temporal components belonging to each source. This enables us to
identify the early support of the room impulse response function and its unique
map to the room geometry. To further tackle the ambiguity of the reflection
ratios, we propose a novel formulation of the reverberation model and estimate
the absorption coefficients through a convex optimization exploiting a joint
sparsity model formulated upon the spatio-spectral sparsity of concurrent speech
representation. The acoustic parameters are then incorporated for separating
individual speech signals through either structured sparse recovery or inverse
filtering the acoustic channels. The experiments conducted on real data
recordings demonstrate the effectiveness of the proposed approach for
multi-party speech recovery and recognition.
Comment: 31 page
Jointly Tracking and Separating Speech Sources Using Multiple Features and the generalized labeled multi-Bernoulli Framework
This paper proposes a novel joint multi-speaker tracking-and-separation
method based on the generalized labeled multi-Bernoulli (GLMB) multi-target
tracking filter, using sound mixtures recorded by microphones. Standard
multi-speaker tracking algorithms usually only track speaker locations, and
ambiguity occurs when speakers are spatially close. The proposed multi-feature
GLMB tracking filter treats the set of vectors of associated speaker features
(location, pitch and sound) as the multi-target multi-feature observation,
characterizes transitioning features with corresponding transition models and
overall likelihood function, thus jointly tracks and separates each
multi-feature speaker, and addresses the spatial ambiguity problem. Numerical
evaluation verifies that the proposed method can correctly track the locations of
multiple speakers while separating their speech signals