Cross likelihood ratio based speaker clustering using eigenvoice models

Authors: David Dean, Sridha Sridharan, Robert Vogt, David Wang
Publication venue: 'International Speech Communication Association'
Publication date: 01/01/2011

This paper proposes the use of eigenvoice modeling techniques with the Cross Likelihood Ratio (CLR) as a criterion for speaker clustering within a speaker diarization system. The CLR has previously been shown to be a robust decision criterion for speaker clustering using Gaussian Mixture Models. Recently, eigenvoice modeling techniques have become increasingly popular due to their ability to adequately represent a speaker based on sparse training data, as well as their improved capture of differences in speaker characteristics. This paper hence proposes that it would be beneficial to capitalize on the advantages of eigenvoice modeling in a CLR framework. Results obtained on the 2002 Rich Transcription (RT-02) Evaluation dataset show improved clustering performance, resulting in a 35.1% relative improvement in the overall Diarization Error Rate (DER) compared to the baseline system.
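
The abstract does not spell out the CLR itself, so the following is only a minimal sketch of the form commonly used in the speaker diarization literature: for two clusters i and j, the frame-averaged log-likelihood ratio of each cluster's data under the other cluster's model versus a background model, summed over both directions. The function name, the argument names, and the use of a universal background model as the reference are assumptions made for illustration; the paper's exact formulation, and how the eigenvoice-adapted models produce the likelihoods, may differ.

```python
import numpy as np


def cross_likelihood_ratio(ll_i_under_j, ll_i_under_bg, ll_j_under_i, ll_j_under_bg):
    """Symmetric CLR between two clusters (hypothetical helper, not from the paper).

    Each argument is a 1-D sequence of per-frame log-likelihoods:
      ll_i_under_j  : frames of cluster i scored against the model of cluster j
      ll_i_under_bg : the same frames scored against the background model
      ll_j_under_i  : frames of cluster j scored against the model of cluster i
      ll_j_under_bg : the same frames scored against the background model
    """
    ll_i_under_j = np.asarray(ll_i_under_j, dtype=float)
    ll_i_under_bg = np.asarray(ll_i_under_bg, dtype=float)
    ll_j_under_i = np.asarray(ll_j_under_i, dtype=float)
    ll_j_under_bg = np.asarray(ll_j_under_bg, dtype=float)

    # Averaging over frames normalizes for segment length in each direction.
    term_i = np.mean(ll_i_under_j - ll_i_under_bg)
    term_j = np.mean(ll_j_under_i - ll_j_under_bg)
    return float(term_i + term_j)
```

In an agglomerative clustering loop, the pair of clusters with the highest CLR would typically be merged first, with merging stopped once the best score falls below a chosen threshold; that threshold is a tuning parameter and is not given in the abstract.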

CiteSeerX

Queensland University of Technology ePrints Archive