Highly-inflected Language Generation using Factored Language Models

Author: A De Novais
Diogo Takaki Ferreira
Eder Mir
Ré Paraboni

Abstract. Statistical language models based on n-gram counts have been shown to successfully replace grammar rules in standard 2-stage (or ‘generate-and-select’) Natural Language Generation (NLG). In highly-inflected languages, however, the amount of training data required to cope with n-gram sparseness may be simply unobtainable, and the benefits of a statistical approach become less obvious. In this work we address the issue of text generation in a highly-inflected language by making use of factored language models (FLM) that take morphological information into account. We present a number of experiments involving the use of simple FLMs applied to various surface realisation tasks, showing that FLMs may implement 2-stage generation with results that are far superior to standard n-gram models alone.
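
To make the generate-and-select idea concrete, the following is a minimal Python sketch and not the authors' implementation. It scores candidate realisations with a word bigram model and, when a word bigram is unseen, backs off to a bigram over morphological tags, which is one simple way a factored model can reduce sparseness. The toy Portuguese corpus, the tag names (`F.SG`, `M.SG`), the `floor` and `backoff` constants, and every function name are assumptions chosen for illustration. The backoff here is crude and unnormalised; real FLMs typically use more principled schemes such as generalized parallel backoff.

```python
import math
from collections import Counter

# Hypothetical toy corpus: each token is (word, morphological tag).
CORPUS = [
    [("a", "F.SG"), ("casa", "F.SG"), ("branca", "F.SG")],
    [("o", "M.SG"), ("carro", "M.SG"), ("branco", "M.SG")],
    [("a", "F.SG"), ("mesa", "F.SG"), ("nova", "F.SG")],
]


def train(corpus):
    """Collect bigram and context counts over words and over tags."""
    wb, wu, tb, tu = Counter(), Counter(), Counter(), Counter()
    for sent in corpus:
        for (w1, t1), (w2, t2) in zip(sent, sent[1:]):
            wb[(w1, w2)] += 1
            wu[w1] += 1
            tb[(t1, t2)] += 1
            tu[t1] += 1
    return wb, wu, tb, tu


def word_score(sentence, model, floor=1e-6):
    """Log score from word bigrams only; unseen bigrams get a floor value."""
    wb, wu, _, _ = model
    total = 0.0
    for (w1, _), (w2, _) in zip(sentence, sentence[1:]):
        p = wb[(w1, w2)] / wu[w1] if wb[(w1, w2)] else floor
        total += math.log(p)
    return total


def factored_score(sentence, model, floor=1e-6, backoff=0.4):
    """Log score that backs off from word bigrams to tag bigrams."""
    wb, wu, tb, tu = model
    total = 0.0
    for (w1, t1), (w2, t2) in zip(sentence, sentence[1:]):
        if wb[(w1, w2)]:
            p = wb[(w1, w2)] / wu[w1]
        elif tb[(t1, t2)]:
            # Assumed fixed backoff weight; not a normalised probability.
            p = backoff * tb[(t1, t2)] / tu[t1]
        else:
            p = floor
        total += math.log(p)
    return total


def select(candidates, scorer, model):
    """Stage 2 of generate-and-select: keep the best-scoring candidate."""
    return max(candidates, key=lambda s: scorer(s, model))


model = train(CORPUS)

# Stage 1 (overgeneration) is assumed to have produced both inflections.
agreeing = [("a", "F.SG"), ("mesa", "F.SG"), ("branca", "F.SG")]
clashing = [("a", "F.SG"), ("mesa", "F.SG"), ("branco", "M.SG")]

best = select([clashing, agreeing], factored_score, model)
print(" ".join(word for word, _ in best))
```

In this toy setting neither "mesa branca" nor "mesa branco" occurs in the training data, so the word-only scorer cannot tell the two candidates apart, whereas the tag bigram lets the factored scorer prefer the form that agrees in gender.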

CiteSeerX