Detection and handling of overlapping speech for speaker diarization

Hernando Pericás, Francisco Javier; Zelenak, Martin

Authors: Francisco Javier Hernando Pericás
Martin Zelenak
Publication date: 1 January 2012

Abstract

This thesis concerns the detection of overlapping speech segments and its application to improving speaker diarization performance. We propose three spatial cross-correlation-based parameters for overlap detection on distant-microphone channel data. Spatial features from different microphone pairs are fused either by principal component analysis or by an approach involving a multilayer perceptron. In addition, we investigate the possibility of employing long-term prosodic information. The most suitable subset of candidate prosodic features is determined by a two-step mRMR (minimum redundancy, maximum relevance) feature selection algorithm. For segments in which overlapping speech is detected, the speaker diarization system picks a second speaker label, and such segments are also discarded from model training. The proposed overlap labeling technique is integrated in the Viterbi-decoding part of the diarization algorithm.

Peer reviewed. Postprint (published version).
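
The overlap handling described in the abstract (a second speaker label on overlapped segments, and exclusion of those segments from model training) can be illustrated with a minimal sketch. This is not the thesis implementation: the thesis integrates overlap labeling into Viterbi decoding, whereas the sketch below is a simplified per-frame post-processing step. The function names `label_with_overlap` and `training_frames`, the per-frame score dictionaries, and the toy values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def label_with_overlap(
    scores: List[Dict[str, float]],
    overlap: List[bool],
) -> List[Tuple[str, ...]]:
    """Return one tuple of speaker labels per frame.

    scores[t] maps each speaker to a hypothetical per-frame log-likelihood
    (assumed non-empty). overlap[t] is True where an overlap detector
    flagged simultaneous speech.
    """
    if len(scores) != len(overlap):
        raise ValueError("scores and overlap must have the same length")
    labels = []
    for frame_scores, is_overlap in zip(scores, overlap):
        # Rank speakers from most to least likely for this frame.
        ranked = sorted(frame_scores, key=frame_scores.get, reverse=True)
        if is_overlap and len(ranked) >= 2:
            # Overlap detected: keep the best and the second-best speaker.
            labels.append((ranked[0], ranked[1]))
        else:
            labels.append((ranked[0],))
    return labels


def training_frames(overlap: List[bool]) -> List[int]:
    """Indices of frames kept for speaker-model training (overlap excluded)."""
    return [t for t, is_overlap in enumerate(overlap) if not is_overlap]


# Toy data (assumed): three frames, two speakers, overlap flagged on frame 1.
scores = [
    {"A": -1.0, "B": -5.0},
    {"A": -2.0, "B": -2.5},
    {"A": -6.0, "B": -1.5},
]
overlap = [False, True, False]
labels = label_with_overlap(scores, overlap)
kept = training_frames(overlap)
```

Here `labels` holds a single speaker for non-overlapped frames and two speakers for the flagged frame, while `kept` lists the frames that would still contribute to speaker-model training under this simplification.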

Available Versions

UPCommons

oai:upcommons.upc.edu:2117/180...

Last time updated on 17/04/2020

UPCommons. Portal del coneixement obert de la UPC

oai:upcommons.upc.edu:2117/180...

Last time updated on 16/06/2016