Sentence-Level Content Planning and Style Specification for Neural Text Generation
Building effective text generation systems requires three critical
components: content selection, text planning, and surface realization, and
traditionally they are tackled as separate problems. Recent all-in-one style
neural generation models have made impressive progress, yet they often produce
outputs that are incoherent and unfaithful to the input. To address these
issues, we present an end-to-end trained two-step generation model, where a
sentence-level content planner first decides on the keyphrases to cover as well
as a desired language style, followed by a surface realization decoder that
generates relevant and coherent text. For experiments, we consider three tasks
from domains with diverse topics and varying language styles: persuasive
argument construction from Reddit, paragraph generation for normal and simple
versions of Wikipedia, and abstract generation for scientific articles.
Automatic evaluation shows that our system can significantly outperform
competitive comparisons. Human judges further rate our system generated text as
more fluent and correct, compared to the generations by its variants that do
not consider language style. Comment: Accepted as a long paper to EMNLP 201
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
This paper surveys the current state of the art in Natural Language
Generation (NLG), defined as the task of generating text or speech from
non-linguistic input. A survey of NLG is timely in view of the changes that the
field has undergone over the past decade or so, especially in relation to new
(usually data-driven) methods, as well as new applications of NLG technology.
This survey therefore aims to (a) give an up-to-date synthesis of research on
the core tasks in NLG and the architectures adopted in which such tasks are
organised; (b) highlight a number of relatively recent research topics that
have arisen partly as a result of growing synergies between NLG and other areas
of artificial intelligence; (c) draw attention to the challenges in NLG
evaluation, relating them to similar challenges faced in other areas of Natural
Language Processing, with an emphasis on different evaluation methods and the
relationships between them. Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118
pages, 8 figures, 1 table
Aspect-Controlled Neural Argument Generation
We rely on arguments in our daily lives to deliver our opinions and base them
on evidence, making them more convincing in turn. However, finding and
formulating arguments can be challenging. In this work, we train a language
model for argument generation that can be controlled on a fine-grained level to
generate sentence-level arguments for a given topic, stance, and aspect. We
define argument aspect detection as a necessary method to allow this
fine-granular control and crowdsource a dataset with 5,032 arguments annotated
with aspects. Our evaluation shows that our generation model is able to
generate high-quality, aspect-specific arguments. Moreover, these arguments can
be used to improve the performance of stance detection models via data
augmentation and to generate counter-arguments. We publish all datasets and
code to fine-tune the language model.
Long and Diverse Text Generation with Planning-based Hierarchical Variational Model
Existing neural methods for data-to-text generation are still struggling to
produce long and diverse texts: they are insufficient to model input data
dynamically during generation, to capture inter-sentence coherence, or to
generate diversified expressions. To address these issues, we propose a
Planning-based Hierarchical Variational Model (PHVM). Our model first plans a
sequence of groups (each group is a subset of input items to be covered by a
sentence) and then realizes each sentence conditioned on the planning result
and the previously generated context, thereby decomposing long text generation
into dependent sentence generation sub-tasks. To capture expression diversity,
we devise a hierarchical latent structure where a global planning latent
variable models the diversity of reasonable planning and a sequence of local
latent variables controls sentence realization. Experiments show that our model
outperforms state-of-the-art baselines in long and diverse text generation. Comment: To appear in EMNLP 201
Dialogue Management and Language Generation for a Robust Conversational Virtual Coach: Validation and User Study
Designing human–machine interactive systems requires cooperation between different disciplines. In this work, we present a Dialogue Manager and a Language Generator that are the core modules of a Voice-based Spoken Dialogue System (SDS) capable of carrying out challenging, long and complex coaching conversations. We also develop an efficient integration procedure for the whole system, which acts as an intelligent and robust Virtual Coach. The coaching task differs significantly from the classical applications of SDSs, resulting in a much higher degree of complexity and difficulty. The Virtual Coach has been successfully tested and validated in a user study with independent elderly users in three different countries with three different languages and cultures: Spain, France and Norway. The research presented in this paper has been conducted as part of the project EMPATHIC that has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant No. 769872. Additionally, this work has been partially funded by projects BEWORD and AMIC-PC of the Minister of Science of Technology, under Grant Nos. PID2021-126061OB-C42 and PDC2021-120846-C43, respectively. Vázquez and López Zorrilla received a PhD scholarship from the Basque Government, with Grant Nos. PRE 2020 1 0274 and PRE 2017 1 0357, respectively.