Three Approaches to Generating Texts in Different Styles
Natural Language Generation (NLG) systems generate texts in English and other human languages from non-linguistic input data. Usually there are a large number of possible texts that can communicate the input data, and NLG systems must choose one of these. We argue that style can be used by NLG systems to choose between possible texts, and explore how this can be done by (1) explicit stylistic parameters, (2) imitating a genre style, and (3) imitating an individual’s style.
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
This paper surveys the current state of the art in Natural Language
Generation (NLG), defined as the task of generating text or speech from
non-linguistic input. A survey of NLG is timely in view of the changes that the
field has undergone over the past decade or so, especially in relation to new
(usually data-driven) methods, as well as new applications of NLG technology.
This survey therefore aims to (a) give an up-to-date synthesis of research on
the core tasks in NLG and the architectures adopted in which such tasks are
organised; (b) highlight a number of relatively recent research topics that
have arisen partly as a result of growing synergies between NLG and other areas
of artificial intelligence; (c) draw attention to the challenges in NLG
evaluation, relating them to similar challenges faced in other areas of Natural
Language Processing, with an emphasis on different evaluation methods and the
relationships between them.
Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118
pages, 8 figures, 1 tabl
The interaction between voice and appearance in the embodiment of a robot tutor
Robot embodiment is, by its very nature, holistic, and understanding how various aspects contribute to the user perception of the robot is non-trivial. A study is presented here that investigates whether there is an interaction effect between voice and other aspects of embodiment, such as movement and appearance, in a pedagogical setting. An online study that uses a modified Godspeed questionnaire was distributed to children aged 11–17. We show an interaction effect between the robot embodiment and voice in terms of perceived lifelikeness of the robot. Politeness is a key strategy used in learning and teaching, and here an effect is also observed for perceived politeness. Interestingly, participants’ overall preference was for embodiment combinations that are deemed polite and more like a teacher, but are not necessarily the most lifelike. From these findings, we are able to inform the design of robotic tutors going forward.
Referenceless Quality Estimation for Natural Language Generation
Traditional automatic evaluation measures for natural language generation
(NLG) use costly human-authored references to estimate the quality of a system
output. In this paper, we propose a referenceless quality estimation (QE)
approach based on recurrent neural networks, which predicts a quality score for
an NLG system output by comparing it to the source meaning representation only.
Our method outperforms traditional metrics and a constant baseline in most
respects; we also show that synthetic data helps to increase correlation
results by 21% compared to the base system. Our results are comparable to
results obtained in similar QE tasks despite the more challenging setting.
Comment: Accepted as a regular paper to 1st Workshop on Learning to Generate
Natural Language (LGNL), Sydney, 10 August 201
Controlling Personality-Based Stylistic Variation with Neural Natural Language Generators
Natural language generators for task-oriented dialogue must effectively
realize system dialogue actions and their associated semantics. In many
applications, it is also desirable for generators to control the style of an
utterance. To date, work on task-oriented neural generation has primarily
focused on semantic fidelity rather than achieving stylistic goals, while work
on style has been done in contexts where it is difficult to measure content
preservation. Here we present three different sequence-to-sequence models and
carefully test how well they disentangle content and style. We use a
statistical generator, Personage, to synthesize a new corpus of over 88,000
restaurant domain utterances whose style varies according to models of
personality, giving us total control over both the semantic content and the
stylistic variation in the training data. We then vary the amount of explicit
stylistic supervision given to the three models. We show that our most explicit
model can simultaneously achieve high fidelity to both semantic and stylistic
goals: this model adds a context vector of 36 stylistic parameters as input to
the hidden state of the encoder at each time step, showing the benefits of
explicit stylistic supervision, even when the amount of training data is large.
Comment: To appear at SIGDIAL 201
A Study of Automatic Metrics for the Evaluation of Natural Language Explanations
As transparency becomes key for robotics and AI, it will be necessary to
evaluate the methods through which transparency is provided, including
automatically generated natural language (NL) explanations. Here, we explore
parallels between the generation of such explanations and the much-studied
field of evaluation of Natural Language Generation (NLG). Specifically, we
investigate which of the NLG evaluation measures map well to explanations. We
present the ExBAN corpus: a crowd-sourced corpus of NL explanations for
Bayesian Networks. We run correlations comparing human subjective ratings with
NLG automatic measures. We find that embedding-based automatic NLG evaluation
methods, such as BERTScore and BLEURT, have a higher correlation with human
ratings, compared to word-overlap metrics, such as BLEU and ROUGE. This work
has implications for Explainable AI and transparent robotic and autonomous
systems.
Comment: Accepted at EACL 202