Age and gender recognition for telephone applications based on GMM supervectors and support vector machines

Abstract

This paper compares two approaches to automatic age and gender classification with 7 classes. The first approach uses Gaussian Mixture Models (GMMs) with Universal Background Models (UBMs), a technique well known from speaker identification and verification. Training is performed by the EM algorithm or by MAP adaptation, respectively. In the second approach, a GMM is trained for each speaker in the test and training sets. The means of each model are extracted and concatenated, which yields one GMM supervector per speaker. These supervectors are then used as input to a support vector machine (SVM). Three different kernels were employed for the SVM approach: a polynomial kernel (with different polynomial degrees), an RBF kernel, and a linear GMM distance kernel based on the KL divergence. With the SVM approach we improved the recognition rate to 74 % (p < 0.001), which is in the same range as human performance.

Index Terms: Acoustic signal analysis, speaker classification, age, gender, Gaussian mixture models (GMM), support vector machine (SVM)
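The supervector construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each speaker model is obtained by relevance-MAP adaptation of the UBM means with diagonal covariances, and that the linear KL-based kernel takes the commonly used form in which each mean is scaled by the square root of its mixture weight and by the inverse standard deviation. The function names, the relevance factor of 16, and the toy UBM values are all hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Relevance-MAP adaptation of the UBM means to one speaker's frames.

    Only the means are adapted; weights and variances stay those of the UBM
    (an assumption of this sketch).
    """
    n_comp, dim = len(means), len(means[0])
    occ = [0.0] * n_comp                           # soft counts per component
    first = [[0.0] * dim for _ in range(n_comp)]   # posterior-weighted sums
    for x in frames:
        logp = [math.log(w) + log_gauss_diag(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        peak = max(logp)                           # for numerical stability
        post = [math.exp(lp - peak) for lp in logp]
        total = sum(post)
        for k in range(n_comp):
            gamma = post[k] / total
            occ[k] += gamma
            for d in range(dim):
                first[k][d] += gamma * x[d]
    # Interpolate between the data mean and the UBM mean.
    return [
        [(first[k][d] + relevance * means[k][d]) / (occ[k] + relevance)
         for d in range(dim)]
        for k in range(n_comp)
    ]


def supervector(adapted_means):
    """Concatenate all component means into one flat vector."""
    return [value for mean in adapted_means for value in mean]


def kl_linear_kernel(sv_a, sv_b, weights, variances):
    """Linear kernel between two supervectors, weighted by w_k / sigma_kd^2."""
    scale = [w / v for w, var in zip(weights, variances) for v in var]
    return sum(s * a * b for s, a, b in zip(scale, sv_a, sv_b))


# Toy UBM (hypothetical values): 2 components, 1-dimensional features.
ubm_weights = [0.5, 0.5]
ubm_means = [[-1.0], [1.0]]
ubm_vars = [[1.0], [1.0]]

speaker_a = [[3.0]] * 10     # frames of a hypothetical speaker A
speaker_b = [[-2.5]] * 10    # frames of a hypothetical speaker B
sv_a = supervector(map_adapt_means(speaker_a, ubm_weights, ubm_means, ubm_vars))
sv_b = supervector(map_adapt_means(speaker_b, ubm_weights, ubm_means, ubm_vars))
k_ab = kl_linear_kernel(sv_a, sv_b, ubm_weights, ubm_vars)
```

In a full system, kernel values such as `k_ab` would be computed for all speaker pairs and passed to an SVM as a precomputed kernel matrix; the polynomial and RBF kernels mentioned in the abstract would instead operate on the supervectors directly.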
