I. ACOUSTIC AND LANGUAGE MODELS IN SPEECH RECOGNITION

By Damianos Karakos, Haolang Zhou, Puyang Xu, Sanjeev Khudanpur and Andreas G. Andreou

Abstract

State-of-the-art speech recognition systems use the well-known maximum a posteriori rule $\hat{W} = \arg\max_{W} P(A \mid W)\, P(W)$ for predicting the uttered word sequence $W$, given the acoustic information $A$. The acoustic model is represented by the conditional distribution $P(A \mid W)$, while the language model is represented by the prior $P(W)$; the bulk of the research in speech recognition is on training procedures for these two components [1]. Acoustic modeling is usually done in a computationally efficient way, using the maximum likelihood criterion within the parametric family of Gaussian mixtures. The feature space of state-of-the-art systems is typically in the range of hundreds of dimensions; this is the result of concatenating togethe
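As a rough illustration of the maximum a posteriori rule stated in the abstract, the sketch below scores a small set of candidate hypotheses by log P(A|W) + log P(W) and returns the best one. It is a toy under loud assumptions, not the authors' system: each candidate W is a single hypothesis with its own one-dimensional Gaussian mixture, frames are treated as independent given W (no HMM), and every name and number (`map_decode`, `acoustic_models`, `language_model`, the mixture parameters) is hypothetical.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a one-dimensional Gaussian at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def log_gmm(x, components):
    """Log density of a 1-D Gaussian mixture.

    components is a list of (weight, mean, variance) tuples.
    Uses the log-sum-exp trick for numerical stability.
    """
    terms = [math.log(w) + log_gaussian(x, m, v) for w, m, v in components]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def map_decode(frames, acoustic_models, language_model):
    """Return arg max over W of log P(A|W) + log P(W).

    Assumption: frames are independent given W, so log P(A|W)
    is a sum of per-frame mixture log densities.
    """
    def score(w):
        log_acoustic = sum(log_gmm(x, acoustic_models[w]) for x in frames)
        return log_acoustic + math.log(language_model[w])

    return max(language_model, key=score)


# Hypothetical toy models: two candidate hypotheses.
acoustic_models = {
    "yes": [(0.6, 1.0, 0.5), (0.4, 2.0, 1.0)],
    "no": [(0.5, -1.0, 0.5), (0.5, -2.0, 1.0)],
}
language_model = {"yes": 0.3, "no": 0.7}  # prior P(W)

frames = [0.9, 1.4, 1.1]  # made-up acoustic observations A
best = map_decode(frames, acoustic_models, language_model)
```

Working in the log domain turns the product P(A|W)P(W) into a sum, which avoids underflow when many frames are multiplied together. With no acoustic evidence the decision falls back to the prior alone.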

Year: 2013
OAI identifier: oai:CiteSeerX.psu:10.1.1.352.3696
Provided by: CiteSeerX