Robust variational Bayesian clustering for underdetermined speech separation
The main focus of this thesis is the enhancement of the statistical framework employed for underdetermined time-frequency (T-F) masking blind separation of speech. While humans can extract a speech signal of interest in the presence of other interference and noise, current speech recognition systems and hearing aids cannot match this psychoacoustic ability: they perform well in noise-free, non-reverberant environments but degrade in realistic ones.
Time-frequency masking algorithms based on computational auditory scene analysis attempt to separate multiple sound sources from only two reverberant stereo mixtures. They rely on the sparsity that binaural cues exhibit in the time-frequency domain to generate masks that extract individual sources from their corresponding spectrogram points, thereby addressing the problem of underdetermined convolutive speech separation. Statistically, this can be interpreted as a classical clustering problem. For analytical simplicity, a finite mixture of Gaussian distributions is commonly used in T-F masking algorithms to model the interaural cues.
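The clustering view described above can be illustrated with a minimal sketch (not the thesis's implementation): interaural phase differences (IPDs) are computed at every T-F point of a toy stereo mixture and clustered with a Gaussian mixture, whose posterior responsibilities then serve as soft masks. All signal sizes and phase offsets below are invented for illustration.

```python
# Illustrative sketch of T-F masking as clustering: IPDs of a synthetic
# stereo mixture are clustered with a GMM; posteriors act as soft masks.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Toy stereo spectrograms: each T-F point belongs to one of two sources,
# each source imposing a distinct interaural phase offset (invented values).
n_frames, n_bins = 50, 64
phase = rng.uniform(-np.pi, np.pi, (n_frames, n_bins))
left = np.exp(1j * phase)
true_ipd = np.where(rng.random((n_frames, n_bins)) < 0.5, 0.5, -0.8)
right = np.exp(1j * (phase + true_ipd + 0.05 * rng.normal(size=(n_frames, n_bins))))

# IPD at each T-F point: angle of the cross-spectrum.
ipd = np.angle(left * np.conj(right)).reshape(-1, 1)

# Cluster the IPDs; posterior responsibilities form one soft mask per source.
gmm = GaussianMixture(n_components=2, random_state=0).fit(ipd)
masks = gmm.predict_proba(ipd).reshape(n_frames, n_bins, 2)
print(masks.shape)  # (50, 64, 2): a soft mask per source
```

Multiplying each mask with the mixture spectrogram would then extract one source, which is the essence of the T-F masking approach discussed here.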
Such a model is, however, sensitive to outliers; a robust probabilistic model based on the Student's t-distribution is therefore first proposed to improve the robustness of the statistical framework. Compared with the Gaussian distribution, this heavy-tailed distribution can better capture outlier values and thereby lead to more accurate probabilistic masks for source separation. This non-Gaussian approach is applied to the state-of-the-art MESSL algorithm, and comparative studies confirm the improved separation quality.
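The robustness argument can be made concrete with a small numerical check: the Student's t-distribution assigns an outlier orders of magnitude more likelihood than a Gaussian of the same scale, so a fitted t-mixture is pulled far less by stray T-F points. The specific numbers below are just one illustrative comparison.

```python
# Why heavy tails help: likelihood of a 6-sigma outlier under a standard
# Gaussian vs. a Student's t with 3 degrees of freedom (illustrative values).
from scipy import stats

x_outlier = 6.0                          # a point six scale units from the mean
p_gauss = stats.norm.pdf(x_outlier)      # ~6e-9: essentially impossible
p_t = stats.t.pdf(x_outlier, df=3)       # ~2e-3: rare but plausible
print(p_gauss, p_t)
```

Because the Gaussian deems the outlier essentially impossible, an MLE fit distorts its parameters to accommodate it, whereas the t-distribution absorbs it in its tails.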
A Bayesian clustering framework that better models the uncertainties in reverberant environments is then exploited to replace the conventional expectation-maximization (EM) algorithm within a maximum likelihood estimation (MLE) framework. A variational Bayesian (VB) approach is applied to the MESSL algorithm to cluster interaural phase differences, thereby avoiding the drawbacks of MLE, notably the possible presence of singularities; experimental results confirm an improvement in separation performance.
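The VB-versus-MLE point can be sketched with an off-the-shelf variational mixture (this is scikit-learn's generic implementation, not the thesis's VB-MESSL): the priors over component parameters prevent the likelihood singularities that occur in EM/MLE when a Gaussian component collapses onto a single point. The data below are invented IPD-like features.

```python
# Hedged sketch: variational Bayesian clustering replaces MLE point
# estimates with approximate posteriors, avoiding collapsed components.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(1)
# Two clusters of IPD-like values plus one isolated point that, under MLE,
# could capture its own component (variance -> 0, likelihood -> infinity).
data = np.concatenate([rng.normal(-0.8, 0.1, 200),
                       rng.normal(0.5, 0.1, 200),
                       [2.5]]).reshape(-1, 1)

vb = BayesianGaussianMixture(n_components=3,
                             weight_concentration_prior=0.1,
                             random_state=0).fit(data)
resp = vb.predict_proba(data)     # soft assignments, usable as masks
print(vb.weights_.round(3))       # surplus components get near-zero weight
```

The Dirichlet prior on the weights also lets the model switch off unneeded components, a form of automatic model selection that EM/MLE lacks.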
Finally, the joint modelling of the interaural phase and level differences, together with the integration of their non-Gaussian modelling within a variational Bayesian framework, is proposed. This approach combines the robust estimation provided by the Student's t-distribution with the robust clustering inherent in the Bayesian approach. In other words, this general framework avoids the difficulties associated with MLE and exploits the heavy-tailed Student's t-distribution to improve the estimation of the soft probabilistic masks at various reverberation times, particularly for sources in close proximity. An extensive set of simulation studies comparing the proposed approach with other T-F masking algorithms under different scenarios demonstrates a significant improvement in both objective and subjective performance measures.
Dictionary Learning for Sparse Representations With Applications to Blind Source Separation
During the past decade, sparse representation has attracted much attention in the signal processing community. It aims to represent a signal as a linear combination of a small number of elementary signals called atoms. These atoms constitute a dictionary, so that a signal can be expressed as the product of the dictionary and a sparse coefficient vector. This leads to the two main challenges studied in the literature: sparse coding (finding the coding coefficients given a dictionary) and dictionary design (finding an appropriate dictionary to fit the data). Dictionary design is the focus of this thesis.

Traditionally, signals are decomposed by a predefined mathematical transform, such as the discrete cosine transform (DCT), forming the so-called analytical approach. In recent years, learning-based methods have been introduced to adapt the dictionary to a set of training data, leading to the technique of dictionary learning. Although this may incur a higher computational complexity, learned dictionaries have the potential to offer improved performance compared with predefined ones.

Dictionary learning is often achieved by iteratively executing two operations: sparse approximation and dictionary update. We focus on the dictionary update step, where the dictionary is optimized for a given sparsity pattern. A novel framework is proposed to generalize benchmark mechanisms such as the method of optimal directions (MOD) and K-SVD, in which an arbitrary set of codewords and the corresponding sparse coefficients are simultaneously updated, hence the term simultaneous codeword optimization (SimCO). Moreover, its extended formulation, regularized SimCO, mitigates the major bottleneck of the dictionary update caused by singular points. First- and second-order optimization procedures are designed to solve both the primitive and regularized SimCO formulations.
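For orientation, the dictionary update step that SimCO generalizes can be sketched in the style of MOD: with the sparse codes A held fixed, the least-squares-optimal dictionary has the closed form D = X Aᵀ(AAᵀ)⁻¹, followed by renormalizing each atom. This is a generic MOD-style step with toy sizes, not the SimCO algorithm itself.

```python
# Illustrative MOD-style dictionary update: closed-form least squares over
# the dictionary with the sparse codes fixed, then unit-norm atoms.
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 16, 32, 200            # signal dim, number of atoms, training signals
X = rng.normal(size=(n, k))      # training data (columns are signals)
A = rng.normal(size=(m, k)) * (rng.random((m, k)) < 0.1)  # sparse codes

# D = X A^T (A A^T)^{-1}; pinv guards against rank deficiency.
D = X @ A.T @ np.linalg.pinv(A @ A.T)
D /= np.linalg.norm(D, axis=0, keepdims=True)   # renormalize each atom

print(np.linalg.norm(X - D @ A))                # residual after the update
```

K-SVD instead updates one atom at a time together with its coefficients; SimCO, as described above, updates an arbitrary set of codewords and their coefficients simultaneously.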
In addition, a tree-structured multi-level representation of the dictionary, based on clustering, is used to speed up the optimization in the sparse coding stage. This novel dictionary learning algorithm is also applied to the underdetermined blind speech separation problem, leading to a multi-stage method in which the separation problem is reformulated as a sparse coding problem, with the dictionary learned by an adaptive algorithm. Using mutual coherence and a sparsity index, the performance of a variety of dictionaries for underdetermined speech separation is compared and analyzed, including dictionaries learned from speech mixtures and from ground-truth speech sources, as well as those predefined by mathematical transforms. Finally, we propose a new method for joint dictionary learning and source separation. Unlike the multi-stage method, the proposed method simultaneously estimates the mixing matrix, the dictionary, and the sources in an alternating and blind manner. The advantages of all the proposed methods are demonstrated over state-of-the-art methods through extensive numerical tests.
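One of the dictionary-quality measures mentioned above, mutual coherence, is simple enough to state in a few lines: it is the largest absolute inner product between distinct unit-norm atoms, with lower coherence generally favouring sparse recovery. The helper below is a generic sketch, not the thesis's evaluation code.

```python
# Mutual coherence of a dictionary: max |<d_i, d_j>| over distinct
# unit-norm atoms (columns). Lower values favour sparse recovery.
import numpy as np

def mutual_coherence(D):
    Dn = D / np.linalg.norm(D, axis=0, keepdims=True)   # unit-norm atoms
    G = np.abs(Dn.T @ Dn)                               # absolute Gram matrix
    np.fill_diagonal(G, 0.0)                            # ignore self-products
    return G.max()

rng = np.random.default_rng(0)
print(mutual_coherence(np.eye(8)))                  # 0.0: orthonormal basis
print(mutual_coherence(rng.normal(size=(32, 64))))  # overcomplete: > 0
```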
Exploiting spatial sparsity for multi-wavelength imaging in optical interferometry
Optical interferometers provide multiple wavelength measurements. In order to
fully exploit the spectral and spatial resolution of these instruments, new
algorithms for image reconstruction have to be developed. Early attempts to
deal with multi-chromatic interferometric data consisted of recovering a
gray image of the object or independent monochromatic images in selected
spectral bands. The main challenge is now to recover the full 3-D (spatio-spectral)
brightness distribution of the astronomical target given all the available
data. We describe a new approach to implement multi-wavelength image
reconstruction in the case where the observed scene is a collection of
point-like sources. We show the gain in image quality (both spatially and
spectrally) achieved by globally taking into account all the data instead of
dealing with independent spectral slices. This is achieved thanks to a
regularization which favors spatial sparsity and spectral grouping of the
sources. Since the objective function is not differentiable, we had to develop
a specialized optimization algorithm which also accounts for non-negativity of
the brightness distribution. (This version has been accepted for publication in J. Opt. Soc. Am.)
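The regularizer described above, which favours spatial sparsity with spectral grouping, is commonly written as an l2,1 norm: a sum over pixels of the l2 norm of each pixel's spectrum. Its non-differentiability is handled through its proximal operator, group soft-thresholding, sketched below on invented numbers (this is the generic operator, not the paper's full algorithm).

```python
# Group soft-thresholding, the proximal operator of tau * sum_p ||x_p||_2:
# a pixel's whole spectrum is either shrunk or zeroed, which enforces
# spatial sparsity while keeping the wavelengths of a source grouped.
import numpy as np

def prox_l21(X, tau):
    """Rows of X are the spectra of individual pixels."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return scale * X

X = np.array([[3.0, 4.0],    # bright pixel (spectrum norm 5): kept, shrunk
              [0.1, 0.1]])   # faint pixel: zeroed across all wavelengths
print(prox_l21(X, tau=1.0))  # first row becomes [2.4, 3.2], second [0, 0]
```

A non-negativity constraint, as mentioned in the abstract, can be composed with this step inside a proximal splitting scheme.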
Model updating in structural dynamics: advanced parametrization, optimal regularization, and symmetry considerations
Numerical models are pervasive tools in science and engineering for simulation, design, and assessment of physical systems. In structural engineering, finite element (FE) models are extensively used to predict responses and estimate risk for built structures. While FE models attempt to exactly replicate the physics of their corresponding structures, discrepancies always exist between measured and model output responses. Discrepancies are related to aleatoric uncertainties, such as measurement noise, and epistemic uncertainties, such as modeling errors. Epistemic uncertainties indicate that the FE model may not fully represent the built structure, greatly limiting its utility for simulation and structural assessment. Model updating is used to reduce error between measurement and model-output responses through adjustment of uncertain FE model parameters, typically using data from structural vibration studies. However, the model updating problem is often ill-posed with more unknown parameters than available data, such that parameters cannot be uniquely inferred from the data.
This dissertation focuses on two approaches to remedy ill-posedness in FE model updating: parametrization and regularization. Parametrization produces a reduced set of updating parameters to estimate, thereby improving posedness. An ideal parametrization should incorporate model uncertainties, effectively reduce errors, and use as few parameters as possible. This is a challenging task, since a large number of candidate parametrizations is available in any model updating problem. To address this, three new parametrization techniques are proposed: improved parameter clustering with residual-based weighting, singular vector decomposition-based parametrization, and incremental reparametrization. All of these methods utilize local system sensitivity information, providing effective reduced-order parametrizations that incorporate FE model uncertainties.
The other focus of this dissertation is regularization, which improves posedness by providing additional constraints on the updating problem, such as a minimum-norm parameter solution constraint. Optimal regularization is proposed for model updating to provide an optimal balance between residual reduction and parameter change minimization. This approach links computationally efficient deterministic model updating with asymptotic Bayesian inference to provide regularization based on maximal model evidence. Estimates are also provided for uncertainties and model evidence, along with an interesting measure of parameter efficiency.
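The trade-off between residual reduction and parameter change can be illustrated with a single Tikhonov-regularized Gauss-Newton step; choosing the regularization weight is exactly what the optimal-regularization approach above addresses, whereas the sketch below simply fixes it. The sensitivity matrix and residual are random stand-ins for a real FE model's.

```python
# Regularized model-updating step: with sensitivity matrix J (response
# derivatives w.r.t. parameters) and residual r, solve
#   dtheta = (J^T J + lam*I)^{-1} J^T r,
# trading residual reduction against the size of the parameter change.
import numpy as np

rng = np.random.default_rng(0)
J = rng.normal(size=(6, 10))   # ill-posed: more parameters than measurements
r = rng.normal(size=6)
lam = 0.5                      # fixed here; chosen optimally in the text

dtheta = np.linalg.solve(J.T @ J + lam * np.eye(10), J.T @ r)
print(np.linalg.norm(dtheta))  # a bounded step despite rank-deficient J^T J
```

In the Bayesian reading linked above, lam plays the role of a prior precision on the parameter changes, and maximizing model evidence selects it from the data.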