Search CORE

1,226 research outputs found

Initialization method for speech separation algorithms that work in the time-frequency domain

Author: Cruces Álvarez Sergio Antonio
Durán Díaz Iván
Sarmiento Vega María Auxiliadora
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/01/2010
Field of study

This article addresses the problem of the unsupervised separa tion of speech signals in realistic scenarios. An initialization procedure is proposed for independent component analysis (ICA) algorithms that work in the time-frequency domain and require the prewhitening of the observations. It is shown that the proposed method drastically reduces the permuted solu tions in that domain and helps to reduce the execution time of the algorithms. Simulations confirm these advantages for several ICA instantaneous algo rithms and the effectiveness of the proposed technique in emulated reverber ant environments.Ministerio de Ciencia y tecnología (España) TEC2008-0625

idUS. Depósito de Investigación Universidad de Sevilla

A Study of Methods for Initialization and Permutation Alignment for Time-Frequency Domain Blind Source Separation

Author: Auxiliadora Sarmiento
Iván Durán
Pablo Aguilera
Sergio Cruces
Publication venue: 'IntechOpen'
Publication date: 10/10/2012
Field of study

IntechOpen

Multimodal methods for blind source separation of audio sources

Author: Syed M.R. Naqvi (7200659)
Publication venue
Publication date: 01/01/2009
Field of study

The enhancement of the performance of frequency domain convolutive blind source separation (FDCBSS) techniques when applied to the problem of separating audio sources recorded in a room environment is the focus of this thesis. This challenging application is termed the cocktail party problem and the ultimate aim would be to build a machine which matches the ability of a human being to solve this task. Human beings exploit both their eyes and their ears in solving this task and hence they adopt a multimodal approach, i.e. they exploit both audio and video modalities. New multimodal methods for blind source separation of audio sources are therefore proposed in this work as a step towards realizing such a machine. The geometry of the room environment is initially exploited to improve the separation performance of a FDCBSS algorithm. The positions of the human speakers are monitored by video cameras and this information is incorporated within the FDCBSS algorithm in the form of constraints added to the underlying cross-power spectral density matrix-based cost function which measures separation performance. [Continues.

Loughborough University Institutional Repository

Blind separation of underdetermined mixtures with additive white and pink noises

Author: Alshabrawy Ossama
Awad W. A.
Hassanien Aboul ella
Salama A. A.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 10/10/2014
Field of study

This paper presents an approach for underdetermined blind source separation in the case of additive Gaussian white noise and pink noise. Likewise, the proposed approach is applicable in the case of separating I + 3 sources from I mixtures with additive two kinds of noises. This situation is more challenging and suitable to practical real world problems. Moreover, unlike to some conventional approaches, the sparsity conditions are not imposed. Firstly, the mixing matrix is estimated based on an algorithm that combines short time Fourier transform and rough-fuzzy clustering. Then, the mixed signals are normalized and the source signals are recovered using modified Gradient descent Local Hierarchical Alternating Least Squares Algorithm exploiting the mixing matrix obtained from the previous step as an input and initialized by multiplicative algorithm for matrix factorization based on alpha divergence. The experiments and simulation results show that the proposed approach can separate I + 3 source signals from I mixed signals, and it has superior evaluation performance compared to some conventional approaches

Northumbria Research Link

Crossref

A Blind Source Separation Framework for Ego-Noise Reduction on Multi-Rotor Drones

Author: Cavallaro A
Wang L
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020
Field of study

Crossref

Queen Mary Research Online

Enhanced independent vector analysis for audio separation in a room environment

Author: Yanfeng Liang (7202699)
Publication venue
Publication date: 01/01/2013
Field of study

Independent vector analysis (IVA) is studied as a frequency domain blind source separation method, which can theoretically avoid the permutation problem by retaining the dependency between different frequency bins of the same source vector while removing the dependency between different source vectors. This thesis focuses upon improving the performance of independent vector analysis when it is used to solve the audio separation problem in a room environment. A specific stability problem of IVA, i.e. the block permutation problem, is identified and analyzed. Then a robust IVA method is proposed to solve this problem by exploiting the phase continuity of the unmixing matrix. Moreover, an auxiliary function based IVA algorithm with an overlapped chain type source prior is proposed as well to mitigate this problem. Then an informed IVA scheme is proposed which combines the geometric information of the sources from video to solve the problem by providing an intelligent initialization for optimal convergence. The proposed informed IVA algorithm can also achieve a faster convergence in terms of iteration numbers and better separation performance. A pitch based evaluation method is defined to judge the separation performance objectively when the information describing the mixing matrix and sources is missing. In order to improve the separation performance of IVA, an appropriate multivariate source prior is needed to better preserve the dependency structure within the source vectors. A particular multivariate generalized Gaussian distribution is adopted as the source prior. The nonlinear score function derived from this proposed source prior contains the fourth order relationships between different frequency bins, which provides a more informative and stronger dependency structure compared with the original IVA algorithm and thereby improves the separation performance. Copula theory is a central tool to model the nonlinear dependency structure. The t copula is proposed to describe the dependency structure within the frequency domain speech signals due to its tail dependency property, which means if one variable has an extreme value, other variables are expected to have extreme values. A multivariate student's t distribution constructed by using a t copula with the univariate student's t marginal distribution is proposed as the source prior. Then the IVA algorithm with the proposed source prior is derived. The proposed algorithms are tested with real speech signals in different reverberant room environments both using modelled room impulse response and real room recordings. State-of-the-art criteria are used to evaluate the separation performance, and the experimental results confirm the advantage of the proposed algorithms

Loughborough University Institutional Repository