A Reference Architecture for Natural Language Generation Systems
We present the RAGS (Reference Architecture for Generation Systems) framework: a specification of an abstract Natural Language Generation (NLG) system architecture to support sharing, re-use, comparison and evaluation of NLG technologies. We argue that the evidence from a survey of actual NLG systems calls for a different emphasis in a reference proposal from that seen in similar initiatives in information extraction and multimedia interfaces.
We introduce the framework itself, in particular the two-level data model that allows us to support the complex data requirements of NLG systems in a flexible and coherent fashion, and describe our efforts to validate the framework through a range of implementations.
Comprehension Driven Document Planning in Natural Language Generation Systems
This work is funded by the Engineering and Physical Sciences Research Council (EPSRC), under a National Productivity Investment Fund Doctoral Studentship (EP/R512412/1).
The E2E Dataset: New Challenges For End-to-End Generation
This paper describes the E2E data, a new dataset for training end-to-end,
data-driven natural language generation systems in the restaurant domain, which
is ten times bigger than existing, frequently used datasets in this area. The
E2E dataset poses new challenges: (1) its human reference texts show more
lexical richness and syntactic variation, including discourse phenomena; (2)
generating from this set requires content selection. As such, learning from
this dataset promises more natural, varied and less template-like system
utterances. We also establish a baseline on this dataset, which illustrates
some of the difficulties associated with this data.
Comment: Accepted as a short paper for SIGDIAL 2017 (final submission
including supplementary material)
Visualising Discourse Coherence in Non-Linear Documents
To produce coherent linear documents, Natural Language Generation systems have traditionally exploited the structuring role of textual discourse markers such as relational and referential phrases. These coherence markers of the traditional notion of text, however, do not work in non-linear documents: a new set of graphical devices is needed together with formation rules to govern their usage, supported by sound theoretical frameworks. If in linear documents graphical devices such as layout and formatting complement textual devices in the expression of discourse coherence, in non-linear documents they play a more important role. In this paper, we present our theoretical and empirical work in progress, which explores new possibilities for expressing coherence in the generation of hypertext documents.
Mixing representation levels: The hybrid approach to automatic text generation
Natural language generation (NLG) systems map non-linguistic representations
into strings of words through a number of steps using intermediate
representations of various levels of abstraction. Template-based systems, by
contrast, tend to use only one representation level, i.e. fixed strings, which
are combined, possibly in a sophisticated way, to generate the final text.
In some circumstances, it may be profitable to combine NLG and template-based
techniques. The issue of combining generation techniques can be seen in more
abstract terms as the issue of mixing levels of representation of different
degrees of linguistic abstraction. This paper aims at defining a reference
architecture for systems using mixed representations. We argue that mixed
representations can be used without abandoning a linguistically grounded
approach to language generation.
Comment: 6 pages
Dynamic Human Evaluation for Relative Model Comparisons
Collecting human judgements is currently the most reliable evaluation method
for natural language generation systems. Automatic metrics have reported flaws
when applied to measure quality aspects of generated text and have been shown
to correlate poorly with human judgements. However, human evaluation is
time- and cost-intensive, and we lack consensus on designing and conducting human
evaluation experiments. Thus there is a need for streamlined approaches for
efficient collection of human judgements when evaluating natural language
generation systems. Therefore, we present a dynamic approach to measure the
required number of human annotations when evaluating generated outputs in
relative comparison settings. We propose an agent-based framework of human
evaluation to assess multiple labelling strategies and methods to decide the
better model in a simulation and a crowdsourcing case study. The main results
indicate that a decision about the superior model can be made with high
probability across different labelling strategies, where assigning a single
random worker per task requires the least overall labelling effort and thus the
least cost.
Comment: accepted at LREC 202
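As a rough illustration of what "measuring the required number of human annotations" in a relative comparison can look like, the sketch below consumes pairwise preference labels one at a time and stops once one model is ahead by more than a Hoeffding-style margin. This is not the agent-based framework from the paper; the function name `decide_better_model`, the label encoding, and the stopping rule are all assumptions made for illustration. Checking the bound after every label is also a simplification, since it does not correct for repeated looks at the data.

```python
import math
from typing import Iterable, Optional, Tuple


def decide_better_model(
    labels: Iterable[int], delta: float = 0.05
) -> Tuple[Optional[str], int]:
    """Hypothetical stopping rule for a two-model comparison.

    labels: pairwise judgements, 1 = annotator prefers model A, 0 = prefers B.
    delta:  tolerated error probability for the (simplified) Hoeffding margin.
    Returns the chosen model ("A", "B", or None if undecided) and the number
    of labels consumed.
    """
    wins_a = 0
    n = 0
    for label in labels:
        n += 1
        wins_a += label
        p_hat = wins_a / n  # observed share of judgements favouring A
        # Hoeffding-style margin around 0.5; it shrinks as n grows.
        margin = math.sqrt(math.log(2 / delta) / (2 * n))
        if abs(p_hat - 0.5) > margin:
            return ("A" if p_hat > 0.5 else "B"), n
    return None, n  # labels exhausted without a clear decision


# Unanimous judgements are decided after a handful of labels;
# perfectly split judgements never reach a decision.
decision, used = decide_better_model([1] * 20)
undecided, seen = decide_better_model([1, 0] * 10)
```

With unanimous labels the rule stops early, while evenly split labels exhaust the budget, which is the intuition behind adapting annotation effort to how close the two models are.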
Underreporting of errors in NLG output, and what to do about it
We observe a severe under-reporting of the different kinds of errors that Natural Language Generation systems make. This is a problem, because mistakes are an important indicator of where systems should still be improved. If authors only report overall performance metrics, the research community is left in the dark about the specific weaknesses that are exhibited by 'state-of-the-art' research. Next to quantifying the extent of error under-reporting, this position paper provides recommendations for error identification, analysis and reporting.
Peer reviewed