Gaussian mixture models design and applications
Ankara: Department of Electrical and Electronics Engineering and the Institute of Engineering and Sciences of Bilkent Univ., 2000.
Thesis (Master's) -- Bilkent University, 2000.
Includes bibliographical references leaves 49-52.
Two new design algorithms for estimating the parameters of Gaussian Mixture
Models (GMM) are developed. These algorithms are based on fitting a
GMM on the histogram of the data. The first method uses Least Squares Error
(LSE) estimation with Gauss-Newton optimization technique to provide more
accurate GMM parameter estimates than the commonly used Expectation-Maximization
(EM) algorithm based estimates. The second method employs
the matching pursuit algorithm which is based on finding the Gaussian functions
that best match the individual components of a GMM from an overcomplete
set. This algorithm provides a fast method for obtaining GMM parameter
estimates.
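The idea of fitting a GMM to a histogram by least squares can be illustrated with a minimal Python sketch. This is not the thesis's algorithm, whose details are not given in the abstract; it is a plain Gauss-Newton iteration on a one-dimensional histogram, with a step-halving safeguard added here as an assumption to keep the iteration stable. The function names (`components`, `fit_gmm_to_histogram`), the synthetic data, and the initial guesses are all hypothetical, and the weights are not forced to sum to one.

```python
import numpy as np

def components(x, w, mu, sigma):
    """Weighted Gaussian densities of each component, shape (len(x), K)."""
    z = (x[:, None] - mu) / sigma
    return w * np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))

def fit_gmm_to_histogram(centers, heights, w0, mu0, sigma0, n_iter=100):
    """Least-squares fit of a 1-D GMM to histogram heights via Gauss-Newton."""
    w, mu, sigma = (np.array(a, dtype=float) for a in (w0, mu0, sigma0))
    k = len(w)
    for _ in range(n_iter):
        comp = components(centers, w, mu, sigma)
        r = heights - comp.sum(axis=1)            # residuals of the current fit
        z = (centers[:, None] - mu) / sigma
        # Jacobian of the model with respect to (weights, means, std devs).
        jac = np.hstack([
            comp / w,
            comp * z / sigma,
            comp * (z**2 - 1.0) / sigma,
        ])
        # Gauss-Newton direction: least-squares solution of jac @ delta = r.
        delta, *_ = np.linalg.lstsq(jac, r, rcond=None)
        # Step halving (an added safeguard, not part of plain Gauss-Newton):
        # shrink the step until the squared error decreases and the
        # weights and standard deviations stay positive.
        step, improved = 1.0, False
        while step > 1e-6:
            w_n = w + step * delta[:k]
            mu_n = mu + step * delta[k:2 * k]
            s_n = sigma + step * delta[2 * k:]
            if np.all(w_n > 0) and np.all(s_n > 0):
                r_n = heights - components(centers, w_n, mu_n, s_n).sum(axis=1)
                if r_n @ r_n < r @ r:
                    improved = True
                    break
            step *= 0.5
        if not improved:
            break                                  # no further improvement found
        w, mu, sigma = w_n, mu_n, s_n
    return w, mu, sigma

# Synthetic two-component data (illustrative values only).
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2.0, 0.5, 4000), rng.normal(1.5, 1.0, 6000)])
heights, edges = np.histogram(data, bins=60, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

w_hat, mu_hat, sigma_hat = fit_gmm_to_histogram(
    centers, heights, w0=[0.5, 0.5], mu0=[-1.5, 1.0], sigma0=[1.0, 1.0]
)
print(w_hat, mu_hat, sigma_hat)
```

Because the fit targets the histogram rather than the raw samples, the cost of each iteration depends on the number of bins and not on the number of data points. Like any Gauss-Newton scheme, the result depends on the initial guess.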
The proposed methods can be used to model the distribution of a large set of
arbitrary random variables. Application of GMMs in human skin color density
modeling and speaker recognition is considered. For speaker recognition, a
new set of speech feature parameters is developed. The suggested set is more
appropriate for speaker recognition applications than the widely used Mel-scale
based one.
Ben Fatma, Khaled
M.S.
Discriminative Mixture Weight Estimation For Large Gaussian Mixture Models
This paper describes a new approach to acoustic modeling for large vocabulary continuous speech recognition (LVCSR) systems. Each phone is modeled with a large Gaussian mixture model (GMM) whose context-dependent mixture weights are estimated with a sentence-level discriminative training criterion. The estimation problem is cast in a neural network framework, which enables the incorporation of the appropriate constraints on the mixture weight vectors and allows a straightforward training procedure based on steepest descent. Experiments conducted on the Callhome-English and Switchboard databases show a significant improvement in acoustic model performance, and a somewhat smaller improvement with the combined acoustic and language models.
1. INTRODUCTION
Many factors contribute to the relatively high error rates observed in LVCSR systems (e.g. diversity of speaking styles, pronunciation variants, variable degrees of articulation, noise, channel effects). By enlarging the set ..