2 research outputs found

    Gaussian mixture models design and applications

    Ankara: Department of Electrical and Electronics Engineering and the Institute of Engineering and Sciences of Bilkent Univ., 2000. Thesis (Master's) -- Bilkent University, 2000. Includes bibliographical references, leaves 49-52.

    Two new design algorithms for estimating the parameters of Gaussian Mixture Models (GMM) are developed. These algorithms are based on fitting a GMM on the histogram of the data. The first method uses Least Squares Error (LSE) estimation with the Gauss-Newton optimization technique to provide more accurate GMM parameter estimates than the commonly used Expectation-Maximization (EM) algorithm based estimates. The second method employs the matching pursuit algorithm, which is based on finding the Gaussian functions that best match the individual components of a GMM from an overcomplete set. This algorithm provides a fast method for obtaining GMM parameter estimates. The proposed methods can be used to model the distribution of a large set of arbitrary random variables. Application of GMMs in human skin color density modeling and speaker recognition is considered. For speaker recognition, a new set of speech feature parameters is developed. The suggested set is more appropriate for speaker recognition applications than the widely used Mel-scale based one.

    Ben Fatma, Khaled (M.S.)
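
    The matching pursuit idea in the abstract above can be illustrated with a minimal sketch. This is not the thesis's algorithm. It is a generic greedy matching pursuit over a hypothetical dictionary of sampled Gaussians, run on a synthetic, noise-free density that stands in for a normalized histogram. The function names, the grid, and the dictionary of means and standard deviations are all assumptions made for illustration.

```python
import math

def gaussian_pdf(x, mean, sigma):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def matching_pursuit_gmm(hist, grid, means, sigmas, n_components):
    """Greedily pick Gaussians from an overcomplete dictionary to approximate hist.

    Returns a list of (mean, sigma, weight) tuples, one per selected atom.
    """
    # Overcomplete dictionary: one sampled Gaussian per (mean, sigma) pair,
    # stored with its Euclidean norm so atoms can be compared at unit norm.
    dictionary = []
    for m in means:
        for s in sigmas:
            vals = [gaussian_pdf(x, m, s) for x in grid]
            dictionary.append((m, s, vals, math.sqrt(dot(vals, vals))))

    residual = list(hist)
    components = []
    for _ in range(n_components):
        # Greedy step: the atom with the largest normalized correlation
        # with the current residual.
        m, s, vals, norm = max(dictionary, key=lambda a: dot(residual, a[2]) / a[3])
        # Projection coefficient onto the unnormalized density, which plays
        # the role of the mixture weight for this component.
        weight = dot(residual, vals) / (norm * norm)
        residual = [r - weight * v for r, v in zip(residual, vals)]
        components.append((m, s, weight))
    return components

# Synthetic stand-in for a normalized histogram (assumed data, not from the thesis).
grid = [-6.0 + 0.1 * k for k in range(141)]
target = [0.6 * gaussian_pdf(x, -2.0, 0.5) + 0.4 * gaussian_pdf(x, 3.0, 1.0) for x in grid]

means = [-5.0 + 0.5 * k for k in range(25)]   # candidate means from -5 to 7
sigmas = [0.5, 1.0, 2.0]                      # candidate standard deviations
components = matching_pursuit_gmm(target, grid, means, sigmas, n_components=2)
```

    Because each iteration only scans a fixed dictionary, the cost grows with the dictionary size rather than with the number of raw samples, which is consistent with the abstract's description of the method as fast. A real histogram is noisy and its true parameters need not lie on the dictionary grid, so the selected atoms would only approximate the underlying components.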

    Discriminative Mixture Weight Estimation For Large Gaussian Mixture Models

    This paper describes a new approach to acoustic modeling for large vocabulary continuous speech recognition (LVCSR) systems. Each phone is modeled with a large Gaussian mixture model (GMM) whose context-dependent mixture weights are estimated with a sentence-level discriminative training criterion. The estimation problem is cast in a neural network framework, which enables the incorporation of the appropriate constraints on the mixture weight vectors and allows a straightforward training procedure based on steepest descent. Experiments conducted on the Callhome-English and Switchboard databases show a significant improvement of the acoustic model performance, and a somewhat lesser improvement with the combined acoustic and language models.

    1. INTRODUCTION Many factors contribute to the relatively high error rates observed in LVCSR systems (e.g. diversity of speaking styles, pronunciation variants, variable degrees of articulation, noises, channel effects). By enlarging the set ..
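
    One common way to keep mixture weights positive and summing to one while training by gradient steps is to parameterize them through a softmax. The sketch below illustrates that idea under loud assumptions: the softmax parameterization is assumed rather than taken from the paper, and a plain maximum-likelihood objective is swapped in for the paper's sentence-level discriminative criterion. The component Gaussians are held fixed, and all names and data are hypothetical.

```python
import math
import random

def gaussian_pdf(x, mean, sigma):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def softmax(z):
    """Map unconstrained scores to positive weights that sum to one."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def fit_mixture_weights(data, components, steps=100, lr=1.0):
    """Estimate mixture weights for fixed Gaussian components.

    Steepest descent on the mean negative log-likelihood, taken with respect
    to unconstrained scores z with weights = softmax(z). (Assumed objective;
    the paper uses a discriminative criterion instead.)
    """
    # Component densities never change, so evaluate them once per data point.
    dens = [[gaussian_pdf(x, m, s) for (m, s) in components] for x in data]
    z = [0.0] * len(components)
    for _ in range(steps):
        w = softmax(z)
        grad = [0.0] * len(z)
        for row in dens:
            p = sum(wk * dk for wk, dk in zip(w, row))
            for k, dk in enumerate(row):
                # d log p(x) / d z_k = responsibility_k(x) - w_k
                grad[k] += w[k] * dk / p - w[k]
        # Ascent on the log-likelihood, i.e. descent on its negative.
        z = [zk + lr * gk / len(data) for zk, gk in zip(z, grad)]
    return softmax(z)

# Synthetic data (assumed): 70% from one component, 30% from the other.
random.seed(0)
data = ([random.gauss(-2.0, 1.0) for _ in range(700)]
        + [random.gauss(2.0, 1.0) for _ in range(300)])
weights = fit_mixture_weights(data, [(-2.0, 1.0), (2.0, 1.0)])
```

    With this parameterization the constraints hold by construction at every step, so no projection or renormalization is needed after an update.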