Using KL-based Acoustic Models in a Large Vocabulary Recognition Task

By Guillermo Aradilla, Hervé Bourlard and Mathew Magimai Doss

Abstract

Posterior probabilities of sub-word units have been shown to be an effective front-end for ASR. However, existing attempts to model this type of feature either do not benefit from modeling context-dependent phonemes or use an inefficient distribution to estimate the state likelihood. This paper presents a novel acoustic model for posterior features that overcomes these limitations. The proposed model can be seen as an HMM in which the score associated with each state is the KL divergence between a distribution characterizing the state and the posterior features from the test utterance. This KL-based acoustic model establishes a framework in which other models for posterior features, such as hybrid HMM/MLP and discrete HMM, can be seen as particular cases. Experiments on the WSJ database show that the KL-based acoustic model can significantly outperform these latter approaches. Moreover, the proposed model can obtain results comparable to those of complex systems, such as HMM/GMM, using significantly fewer parameters.

Report: IDIAP–RR 08-14
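The state score described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy numbers, and the direction of the divergence (state distribution first, posterior feature second) are assumptions made for illustration, since the abstract does not fix them.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:  # terms with p_i = 0 contribute nothing to the sum
            # eps guards against division by zero (an assumption of this sketch)
            total += pi * math.log(pi / max(qi, eps))
    return total

def state_scores(state_dists, posterior_frame):
    """Local score of every state for one frame of posterior features.

    Assumed direction: KL(state distribution || posterior feature).
    Lower scores mean a better match.
    """
    return [kl_divergence(y, posterior_frame) for y in state_dists]

# Hypothetical toy data: 3 sub-word classes, 2 states.
state_dists = [
    [0.8, 0.1, 0.1],  # a "soft" state distribution
    [0.0, 1.0, 0.0],  # a one-hot state distribution
]
frame = [0.7, 0.2, 0.1]  # posterior feature vector for one frame
scores = state_scores(state_dists, frame)
```

With a one-hot state distribution on class k, the score under this assumed direction reduces to the negative log of the k-th posterior. That is one way to read the abstract's remark that other posterior-based models appear as particular cases, although the exact correspondence is defined in the paper itself.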

Year: 2008
OAI identifier: oai:CiteSeerX.psu:10.1.1.415.1372
Provided by: CiteSeerX
Download PDF:
Sorry, we are unable to provide the full text but you may find it at the following location(s):
  • http://citeseerx.ist.psu.edu/v... (external link)
  • http://www.idiap.ch/ftp/report... (external link)