COMPARING MAXIMUM A POSTERIORI VECTOR QUANTIZATION AND GAUSSIAN MIXTURE MODELS IN SPEAKER VERIFICATION

Juhani Saastamoinen; Mikko Vinni; Pasi Fränti; Tomi Kinnunen; Ville Hautamäki

Publication date: 31 August 2009

Abstract

The Gaussian mixture model-universal background model (GMM-UBM) is a standard reference classifier in speaker verification. We have recently proposed a simplified model using vector quantization (VQ-UBM). In this study, we extensively compare these two classifiers on the NIST 2005, 2006 and 2008 SRE corpora, with a standard discriminative classifier (GLDS-SVM) as a reference point. We focus on parameter settings for N-top scoring, model order, and performance with different amounts of training data. The most interesting result, contrary to common belief, is that GMM-UBM yields better results for short segments, whereas VQ-UBM performs well on long utterances. The results also suggest that maximum likelihood training of the UBM is sub-optimal, and hence that alternative ways to train the UBM should be considered.
