Highly-inflected Language Generation using Factored Language Models

By Eder Miranda de Novais, Ivandré Paraboni and Diogo Takaki Ferreira

Abstract

Statistical language models based on n-gram counts have been shown to successfully replace grammar rules in standard 2-stage (or ‘generate-and-select’) Natural Language Generation (NLG). In highly-inflected languages, however, the amount of training data required to cope with n-gram sparseness may simply be unobtainable, and the benefits of a statistical approach become less obvious. In this work we address the issue of text generation in a highly-inflected language by making use of factored language models (FLMs) that take morphological information into account. We present a number of experiments involving the use of simple FLMs applied to various surface realisation tasks, showing that FLMs may implement 2-stage generation with results that are far superior to those of standard n-gram models alone.
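As a rough illustration of the idea in the abstract, the sketch below ranks candidate realisations in a generate-and-select setting, backing off from word bigrams to bigrams over morphological tags when a word pair is unseen. Everything in it is an assumption made for illustration: the toy corpus, the tag names, the fixed backoff order, the floor value and the function names are hypothetical and are not taken from the paper. The score is also left unnormalised, so it is not a true probability, and the backoff is much simpler than what FLM toolkits typically implement.

```python
from collections import Counter
import math

# Hypothetical toy corpus: each token is (surface word, morphological tag).
CORPUS = [
    [("as", "F.PL"), ("casas", "F.PL"), ("brancas", "F.PL")],
    [("os", "M.PL"), ("carros", "M.PL"), ("brancos", "M.PL")],
    [("as", "F.PL"), ("portas", "F.PL"), ("novas", "F.PL")],
]

def bigram_counts(sequences):
    """Count bigrams and how often each item occurs as a left context."""
    bigrams, contexts = Counter(), Counter()
    for seq in sequences:
        for left, right in zip(seq, seq[1:]):
            bigrams[(left, right)] += 1
            contexts[left] += 1
    return bigrams, contexts

# One set of counts per factor: surface words and morphological tags.
WORD_BI, WORD_CTX = bigram_counts([[w for w, _ in s] for s in CORPUS])
TAG_BI, TAG_CTX = bigram_counts([[t for _, t in s] for s in CORPUS])

def log_score(prev, cur, floor=1e-6):
    """Score one transition: word bigram first, then tag bigram, then a floor."""
    (prev_word, prev_tag), (cur_word, cur_tag) = prev, cur
    if WORD_BI[(prev_word, cur_word)] > 0:
        return math.log(WORD_BI[(prev_word, cur_word)] / WORD_CTX[prev_word])
    if TAG_BI[(prev_tag, cur_tag)] > 0:
        # Unseen word pair: fall back to the morphological factor.
        return math.log(TAG_BI[(prev_tag, cur_tag)] / TAG_CTX[prev_tag])
    return math.log(floor)

def score(sentence):
    """Sum of transition scores over a tagged candidate sentence."""
    return sum(log_score(a, b) for a, b in zip(sentence, sentence[1:]))

def select(candidates):
    """The 'select' stage: keep the highest-scoring candidate."""
    return max(candidates, key=score)

# Two over-generated candidates that differ only in adjective agreement.
good = [("as", "F.PL"), ("portas", "F.PL"), ("brancas", "F.PL")]
bad = [("as", "F.PL"), ("portas", "F.PL"), ("brancos", "M.PL")]
best = select([bad, good])
print(" ".join(word for word, _ in best))
```

Neither "portas brancas" nor "portas brancos" occurs in the toy corpus, so word bigrams alone cannot separate the two candidates; the tag-level backoff is what favours the agreeing form.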

Topics: Text Generation, Surface Realisation, Language Modelling
Year: 2014
OAI identifier: oai:CiteSeerX.psu:10.1.1.412.5049
Provided by: CiteSeerX