127 research outputs found

Multimodal methods for blind source separation of audio sources

Author: Syed M.R. Naqvi (7200659)
Publication date: 01/01/2009

This thesis focuses on enhancing the performance of frequency domain convolutive blind source separation (FDCBSS) techniques when applied to the problem of separating audio sources recorded in a room environment. This challenging application is termed the cocktail party problem, and the ultimate aim would be to build a machine which matches the ability of a human being to solve this task. Human beings exploit both their eyes and their ears in solving this task and hence adopt a multimodal approach, i.e. they exploit both audio and video modalities. New multimodal methods for blind source separation of audio sources are therefore proposed in this work as a step towards realizing such a machine. The geometry of the room environment is initially exploited to improve the separation performance of an FDCBSS algorithm. The positions of the human speakers are monitored by video cameras and this information is incorporated within the FDCBSS algorithm in the form of constraints added to the underlying cross-power spectral density matrix-based cost function which measures separation performance. [Continues.]

Loughborough University Institutional Repository

A multimodal approach to blind source separation of moving sources

Author: Jonathon Chambers (1251609)
Miao Yu (1252284)
Mohsen Naqvi (1252812)
Publication date: 01/01/2010

A novel multimodal approach is proposed to solve the problem of blind source separation (BSS) of moving sources. The challenge of BSS for moving sources is that the mixing filters are time varying; thus, the unmixing filters should also be time varying, and these are difficult to calculate in real time. In the proposed approach, the visual modality is utilized to facilitate the separation for both stationary and moving sources. The movement of the sources is detected by a 3-D tracker based on video cameras. Positions and velocities of the sources are obtained from the 3-D tracker based on a Markov Chain Monte Carlo particle filter (MCMC-PF), which results in high sampling efficiency. The full BSS solution is formed by integrating a frequency domain blind source separation algorithm and beamforming: if the sources are identified as stationary for a certain minimum period, a frequency domain BSS algorithm is implemented with an initialization derived from the positions of the source signals. Once the sources are moving, a beamforming algorithm which requires no prior statistical knowledge is used to perform real-time speech enhancement and provide separation of the sources. Experimental results confirm that by utilizing the visual modality, the proposed algorithm not only improves the performance of the BSS algorithm and mitigates the permutation problem for stationary sources, but also provides good BSS performance for moving sources in a low reverberant environment.
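
As a rough illustration of how a source position supplied by a tracker could steer a beamformer without any signal statistics, the following is a minimal delay-and-sum sketch. It is not the algorithm from the paper: the free-field propagation model, the speed of sound, and all function names (`steering_vector`, `delay_and_sum_weights`, `beamformer_gain`) are assumptions made for this example.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def steering_vector(mic_positions, source_position, freq_hz):
    """Phase-only free-field steering vector, relative to the first microphone.

    Assumes direct-path propagation only (no reverberation).
    """
    mic_positions = np.asarray(mic_positions, dtype=float)
    source_position = np.asarray(source_position, dtype=float)
    distances = np.linalg.norm(mic_positions - source_position, axis=1)
    delays = (distances - distances[0]) / SPEED_OF_SOUND
    return np.exp(-2j * np.pi * freq_hz * delays)


def delay_and_sum_weights(mic_positions, source_position, freq_hz):
    """Delay-and-sum weights: the steering vector divided by the number of mics."""
    d = steering_vector(mic_positions, source_position, freq_hz)
    return d / len(d)


def beamformer_gain(weights, steering):
    """Complex response w^H d of the beamformer to a source with steering vector d."""
    return np.vdot(weights, steering)  # np.vdot conjugates its first argument


# Hypothetical three-microphone line array and two source positions (metres).
mics = [[0.0, 0.0, 0.0], [0.05, 0.0, 0.0], [0.10, 0.0, 0.0]]
target = [1.0, 2.0, 0.0]
interferer = [-2.0, 1.0, 0.0]

w = delay_and_sum_weights(mics, target, 1000.0)
gain_target = abs(beamformer_gain(w, steering_vector(mics, target, 1000.0)))
gain_interferer = abs(beamformer_gain(w, steering_vector(mics, interferer, 1000.0)))
```

Under these assumptions the response toward the tracked position has unit magnitude, while a source elsewhere is attenuated. In a moving-source setting the weights would simply be recomputed whenever the tracked position changes.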

Loughborough University Institutional Repository

Crossref

Surrey Research Insight

Convolutive Blind Source Separation Methods

Author: Kjems Ulrik
Larsen Jan
Parra Lucas C.
Pedersen Michael Syskind
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2008

In this chapter, we provide an overview of existing algorithms for blind source separation of convolutive audio mixtures. We provide a taxonomy, wherein many of the existing algorithms can be organized, and we present published results from those algorithms that have been applied to real-world audio separation tasks.

CiteSeerX

Online Research Database In Technology

Source Separation for Hearing Aid Applications

Author: Pedersen Michael Syskind
Publication venue: Technical University of Denmark
Publication date: 01/11/2006

Online Research Database In Technology

Video-aided model-based source separation in real reverberant rooms

Author: Ata ur-Rehman (7200926)
Jonathon Chambers (1251609)
Mohsen Naqvi (1252812)
Muhammad Salman Khan (7202543)
Wenwu Wang (4352767)
Publication date: 01/01/2013

Source separation algorithms that utilize only audio data can perform poorly if multiple sources or reverberation are present. In this paper we therefore propose a video-aided model-based source separation algorithm for a two-channel reverberant recording in which the sources are assumed static. By exploiting cues from video, we first localize individual speech sources in the enclosure and then estimate their directions. The interaural spatial cues, the interaural phase difference and the interaural level difference, as well as the mixing vectors are probabilistically modeled. The models make use of the source direction information and are evaluated at discrete time-frequency points. The model parameters are refined with the well-known expectation-maximization (EM) algorithm. The algorithm outputs time-frequency masks that are used to reconstruct the individual sources. Simulation results show that by utilizing the visual modality the proposed algorithm can produce better time-frequency masks, thereby giving improved source estimates. We provide experimental results to test the proposed algorithm in different scenarios and provide comparisons with both other audio-only and audio-visual algorithms and achieve improved performance both on synthetic and real data. We also include dereverberation-based pre-processing in our algorithm in order to suppress the late reverberant components from the observed stereo mixture and further enhance the overall output of the algorithm. This advantage makes our algorithm a suitable candidate for use in under-determined highly reverberant settings where the performance of other audio-only and audio-visual methods is limited.
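
To make the interaural cues and the masking step concrete, here is a minimal sketch that computes the interaural level difference (ILD) and interaural phase difference (IPD) at each time-frequency point of a two-channel recording and applies a mask to a mixture. The probabilistic modelling and EM refinement described in the abstract are not reproduced; the function names and the `eps` regulariser are assumptions for this example.

```python
import numpy as np


def interaural_cues(left_tf, right_tf, eps=1e-12):
    """Return (ILD in dB, IPD in radians) for two complex time-frequency arrays.

    left_tf and right_tf are assumed to be STFTs of the two channels with
    identical shapes. eps guards against division by zero in silent bins.
    """
    left_tf = np.asarray(left_tf, dtype=complex)
    right_tf = np.asarray(right_tf, dtype=complex)
    ild_db = 20.0 * np.log10((np.abs(left_tf) + eps) / (np.abs(right_tf) + eps))
    ipd_rad = np.angle(left_tf * np.conj(right_tf))  # wrapped to (-pi, pi]
    return ild_db, ipd_rad


def apply_mask(mixture_tf, mask):
    """Scale each time-frequency point of the mixture by a mask value in [0, 1]."""
    return np.asarray(mask, dtype=float) * np.asarray(mixture_tf, dtype=complex)


# Toy single-bin example: left channel is twice as loud and leads by 0.5 rad.
left = np.array([[2.0 * np.exp(0.5j)]])
right = np.array([[1.0 + 0.0j]])
ild, ipd = interaural_cues(left, right)
masked = apply_mask(left, np.array([[0.25]]))
```

In a full system the mask values would come from the fitted model (for example, posterior source probabilities), and the masked STFT would be inverted to obtain a time-domain source estimate.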

Loughborough University Institutional Repository

Surrey Research Insight

Perceptually motivated blind source separation of convolutive audio mixtures

Author: Guddeti Ram Mohana Reddy
Publication venue: The University of Edinburgh
Publication date: 01/01/2005

Edinburgh Research Archive

Use of Bimodal Coherence to Resolve Spectral Indeterminacy in Convolutive BSS

Author: A. Hyvärinen
B. Rivet
C. Jutten
D. Sodoyer
J. Thomas
J.F. Cardoso
P. Comon
Publication date: 01/01/2010

Recent studies show that visual information contained in visual speech can be helpful for the performance enhancement of audio-only blind source separation (BSS) algorithms. Such information is exploited through the statistical characterisation of the coherence between the audio and visual speech using, e.g. a Gaussian mixture model (GMM). In this paper, we present two new contributions. An adapted expectation maximization (AEM) algorithm is proposed in the training process to model the audio-visual coherence upon the extracted features. The coherence is exploited to solve the permutation problem in the frequency domain using a new sorting scheme. We test our algorithm on the XM2VTS multimodal database. The experimental results show that our proposed algorithm outperforms traditional audio-only BSS.
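
The permutation problem mentioned above arises because frequency-domain BSS separates each frequency bin independently, so the output order can differ from bin to bin. The sketch below illustrates one simple way to sort the outputs: in each bin, pick the ordering whose magnitude envelopes correlate best with a per-source reference activity. This correlation-based scheme is a stand-in chosen for illustration, not the GMM-based audio-visual coherence sorting of the paper; the function name and the notion of a `reference` (which could be derived from video) are assumptions.

```python
from itertools import permutations

import numpy as np


def align_permutations(envelopes, reference):
    """Reorder separated outputs in each frequency bin to match a reference.

    envelopes: array of shape (n_bins, n_sources, n_frames) holding magnitude
        envelopes of the separated outputs in each bin.
    reference: array of shape (n_sources, n_frames) holding a per-source
        activity pattern (hypothetical, e.g. derived from visual features).
    Returns the reordered envelopes and the permutation chosen for each bin.
    """
    envelopes = np.asarray(envelopes, dtype=float)
    reference = np.asarray(reference, dtype=float)
    n_bins, n_src, _ = envelopes.shape
    aligned = np.empty_like(envelopes)
    chosen = []
    for f in range(n_bins):
        best_perm = tuple(range(n_src))  # fall back to identity ordering
        best_score = -np.inf
        for perm in permutations(range(n_src)):
            # Sum of correlations between reordered outputs and the references.
            score = sum(
                np.corrcoef(envelopes[f, perm[i]], reference[i])[0, 1]
                for i in range(n_src)
            )
            if score > best_score:
                best_perm, best_score = perm, score
        aligned[f] = envelopes[f, list(best_perm)]
        chosen.append(best_perm)
    return aligned, chosen


# Toy example: bin 0 is already ordered, bin 1 has its two outputs swapped.
rng = np.random.default_rng(0)
ref = rng.random((2, 200))
env = np.stack([ref, ref[::-1]])
aligned, chosen = align_permutations(env, ref)
```

The exhaustive search over orderings grows factorially with the number of sources, so this form is only practical for small source counts.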

CiteSeerX

Crossref

University of Surrey

Surrey Research Insight

Real Time Blind Source Separation in Reverberant Environments

Author: Sherry Timothy
Publication venue: 'Victoria University of Wellington Library'
Publication date: 01/01/2014

An online convolutive blind source separation solution has been developed for use in reverberant environments with stationary sources. Results are presented for simulation and real-world data. The system achieves a separation SINR of 16.8 dB when operating on a two-source mixture, with a total acoustic delay of 270 ms. This is on par with, and in many respects outperforms, various published algorithms [1],[2]. A number of instantaneous blind source separation algorithms have been developed, including a block-wise and recursive ICA algorithm, and a clustering-based algorithm, able to obtain up to 110 dB SIR performance. The system has been realised in both Matlab and C, and is modular, allowing for easy update of the ICA algorithm that is the core of the unmixing process.

Victoria University of Wellington

ResearchArchive at Victoria University of Wellington

Analysis of dual-channel ICA-based blocking matrix for improved noise estimation

Author: A Hjørungnes
A Hyvärinen
A Krueger
A Lombard
A Ozerov
B Cornelis
BD Van Veen
C Knapp
E Warsitz
E Weinstein
EAP Habets
H Buchner
H Kuttruff
H Sawada
HL Van Trees
I Cohen
I McCowan
K Jeon
K Kim
K Reindl
L Parra
LJ Griffiths
M Taseska
N Ito
N Moritz
NK Duong
O Hoshuyama
OL Frost
P Smaragdis
P Vary
PP Vaidyanathan
R Aichner
R Maas
R Martin
R Talmon
R Zelinski
S Araki
S Gannot
S Golan
S Jeong
S Makino
SV Gerven
T Gerkmann
T R Hendriks
T Van den Bogaert
W Herbordt
W Kellermann
Y Takahashi
Y Zheng
Ö Yilmaz
Publication venue: 'Springer Science and Business Media LLC'

Crossref