A Hybrid Language Model based on Stochastic Context-free Grammars

By Diego Linares, José-Miguel Benedí, Joan-Andreu Sánchez and Javeriana Cali

Abstract

This paper explores the use of initial Stochastic Context-Free Grammars (SCFGs) obtained from a treebank corpus as the starting point for learning SCFGs by means of estimation algorithms. A hybrid language model is defined as a combination of a word-based n-gram, which captures the local relations between words, and a category-based SCFG with a distribution of words into categories, which represents the long-term relations between these categories. Experiments on the UPenn Treebank corpus are reported. They are evaluated in terms of test-set perplexity and of word error rate in a speech recognition experiment.
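
The abstract does not say how the two components are combined. As a rough illustration only, the sketch below assumes a linear interpolation of the two conditional word probabilities and computes the perplexity of the result. The function names, the weight alpha, and all probability values are hypothetical and are not taken from the paper.

```python
import math


def hybrid_prob(p_ngram, p_scfg, alpha):
    """Linearly interpolate two conditional word probabilities.

    alpha is an assumed mixing weight in [0, 1]; it is not a value
    reported in the source.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * p_ngram + (1.0 - alpha) * p_scfg


def perplexity(word_probs):
    """Perplexity of a word sequence from its per-word probabilities."""
    if not word_probs:
        raise ValueError("word_probs must not be empty")
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


# Made-up per-word probabilities from each component, for illustration.
ngram_probs = [0.20, 0.05, 0.10]
scfg_probs = [0.10, 0.15, 0.10]

mixed = [hybrid_prob(pn, ps, alpha=0.7)
         for pn, ps in zip(ngram_probs, scfg_probs)]
print(perplexity(mixed))
```

In a setup like this, the weight would normally be tuned on held-out data, and lower perplexity on the test set indicates a better fit.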

Year: 2011
OAI identifier: oai:CiteSeerX.psu:10.1.1.183.6527
Provided by: CiteSeerX