Use of Bimodal Coherence to Resolve Spectral Indeterminacy in Convolutive BSS

Authors: A. Hyvärinen, B. Rivet, C. Jutten, D. Sodoyer, J. Thomas, J.F. Cardoso, P. Comon
Publication date: 01/01/2010

Recent studies show that the visual information in visual speech can help improve the performance of audio-only blind source separation (BSS) algorithms. Such information is exploited through statistical characterisation of the coherence between audio and visual speech using, e.g., a Gaussian mixture model (GMM). In this paper, we present two new contributions. First, an adapted expectation maximization (AEM) algorithm is proposed for the training process, to model the audio-visual coherence on the extracted features. Second, the coherence is exploited to solve the permutation problem in the frequency domain using a new sorting scheme. We test our algorithm on the XM2VTS multimodal database. The experimental results show that our proposed algorithm outperforms traditional audio-only BSS.
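
The idea of using audio-visual coherence to resolve the permutation ambiguity can be illustrated with a minimal sketch. It is not the sorting scheme or the AEM training from the paper, whose details are not given in this abstract. It assumes two separated outputs per frequency bin, one scalar audio feature per output, one scalar visual feature for the target speaker, and an already trained diagonal-covariance GMM over the joint (audio, visual) vector. The names `gmm_log_likelihood` and `align_permutations` and all parameter values are hypothetical.

```python
import math

def gmm_log_likelihood(x, weights, means, variances):
    """Log-density of vector x under a diagonal-covariance GMM."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_terms.append(log_p)
    # Log-sum-exp over components for numerical stability.
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))

def align_permutations(audio_feats, visual_feat, gmm):
    """Decide, per frequency bin, whether the two outputs should be swapped.

    audio_feats[k] holds the (assumed) audio features of output 0 and
    output 1 at bin k. A bin is marked for swapping when output 1 is more
    coherent with the visual feature than output 0 under the GMM.
    """
    swaps = []
    for a0, a1 in audio_feats:
        score0 = gmm_log_likelihood([a0, visual_feat], *gmm)
        score1 = gmm_log_likelihood([a1, visual_feat], *gmm)
        swaps.append(score1 > score0)
    return swaps

# Hypothetical two-component GMM: (weights, means, variances).
gmm = ([0.5, 0.5], [[0.0, 0.0], [3.0, 3.0]], [[1.0, 1.0], [1.0, 1.0]])
bins = [(2.9, 0.1), (0.2, 3.1), (3.2, -0.1)]
print(align_permutations(bins, 3.0, gmm))  # [False, True, False]
```

In this toy setup, only the middle bin has its outputs in the wrong order, so it is the only one flagged for swapping. A real system would use multi-dimensional features per frame and a GMM fitted to training data.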