168 research outputs found
The Application of Blind Source Separation to Feature Decorrelation and Normalizations
We apply a Blind Source Separation BSS algorithm to the decorrelation of Mel-warped cepstra. The observed cepstra are modeled as a convolutive mixture of independent source cepstra. The algorithm aims to minimize a cross-spectral correlation at different lags to reconstruct the source cepstra. Results show that using "independent" cepstra as features leads to a reduction in the WER.Finally, we present three different enhancements to the BSS algorithm. We also present some results of these deviations of the original algorithm
A harmonic excitation state-space approach to blind separation of speech
We discuss an identification framework for noisy speech mixtures. A block-based generative model is formulated that explicitly incorporates the time-varying harmonic plus noise (H+N) model for a number of latent sources observed through noisy convolutive mixtures. All parameters including the pitches of the source signals, the amplitudes and phases of the sources, the mixing filters and the noise statistics are estimated by maximum likelihood, using an EM-algorithm. Exact averaging over the hidden sources is obtained using the Kalman smoother. We show that pitch estimation and source separation can be performed simultaneously. The pitch estimates are compared to laryngograph (EGG) measurements. Artificial and real room mixtures are used to demonstrate the viability of the approach. Intelligible speech signals are re-synthesized from the estimated H+N models
Euclid in a Taxicab: Sparse Blind Deconvolution with Smoothed l1/l2 Regularization
The l1/l2 ratio regularization function has shown good performance for
retrieving sparse signals in a number of recent works, in the context of blind
deconvolution. Indeed, it benefits from a scale invariance property much
desirable in the blind context. However, the l1/l2 function raises some
difficulties when solving the nonconvex and nonsmooth minimization problems
resulting from the use of such a penalty term in current restoration methods.
In this paper, we propose a new penalty based on a smooth approximation to the
l1/l2 function. In addition, we develop a proximal-based algorithm to solve
variational problems involving this function and we derive theoretical
convergence results. We demonstrate the effectiveness of our method through a
comparison with a recent alternating optimization strategy dealing with the
exact l1/l2 term, on an application to seismic data blind deconvolution.Comment: 5 page
Efficient Multiband Algorithms for Blind Source Separation
The problem of blind separation refers to recovering original signals, called source signals, from the mixed signals, called observation signals, in a reverberant environment. The mixture is a function of a sequence of original speech signals mixed in a reverberant room. The objective is to separate mixed signals to obtain the original signals without degradation and without prior information of the features of the sources. The strategy used to achieve this objective is to use multiple bands that work at a lower rate, have less computational cost and a quicker convergence than the conventional scheme. Our motivation is the competitive results of unequal-passbands scheme applications, in terms of the convergence speed. The objective of this research is to improve unequal-passbands schemes by improving the speed of convergence and reducing the computational cost. The first proposed work is a novel maximally decimated unequal-passbands scheme.This scheme uses multiple bands that make it work at a reduced sampling rate, and low computational cost. An adaptation approach is derived with an adaptation step that improved the convergence speed. The performance of the proposed scheme was measured in different ways. First, the mean square errors of various bands are measured and the results are compared to a maximally decimated equal-passbands scheme, which is currently the best performing method. The results show that the proposed scheme has a faster convergence rate than the maximally decimated equal-passbands scheme. Second, when the scheme is tested for white and coloured inputs using a low number of bands, it does not yield good results; but when the number of bands is increased, the speed of convergence is enhanced. Third, the scheme is tested for quick changes. It is shown that the performance of the proposed scheme is similar to that of the equal-passbands scheme. Fourth, the scheme is also tested in a stationary state. The experimental results confirm the theoretical work. For more challenging scenarios, an unequal-passbands scheme with over-sampled decimation is proposed; the greater number of bands, the more efficient the separation. The results are compared to the currently best performing method. Second, an experimental comparison is made between the proposed multiband scheme and the conventional scheme. The results show that the convergence speed and the signal-to-interference ratio of the proposed scheme are higher than that of the conventional scheme, and the computation cost is lower than that of the conventional scheme
Dirichlet latent variable model : a dynamic model based on Dirichlet prior for audio processing
We propose a dynamic latent variable model for learning latent bases from time varying, non-negative data. We take a probabilistic approach to modeling the temporal dependence in data by introducing a dynamic Dirichlet prior – a Dirichlet distribution with dynamic parameters. This new distribution allows us to assure non-negativity and avoid intractability when sequential updates are performed (otherwise encountered in using Dirichlet prior). We refer to the proposed model as the Dirichlet latent variable model (DLVM). We develop an expectation maximization algorithm for the proposed model, and also derive a maximum a posteriori estimate of the parameters. Furthermore, we connect the proposed DLVM to two popular latent basis learning methods - probabilistic latent component analysis (PLCA) and non-negative matrix factorization (NMF).We show that (i) PLCA is a special case of our DLVM, and (ii) DLVM can be interpreted as a dynamic version of NMF. The usefulness of DLVM is demonstrated for three audio processing applications - speaker source separation, denoising, and bandwidth expansion. To this end, a new algorithm for source separation is also proposed. Through extensive experiments on benchmark databases, we show that the proposed model out performs several relevant existing methods in all three applications
- …