Wizard of Wikipedia: Knowledge-Powered Conversational Agents
In open-domain dialogue, intelligent agents should exhibit the use of knowledge; however, there are few convincing demonstrations of this to date. The most popular sequence-to-sequence models typically "generate and hope": they produce generic utterances that can be memorized in the weights of the model when mapping from input utterance(s) to output, rather than employing recalled knowledge as context. Use of knowledge has so far proved difficult, in part because of the
lack of a supervised learning benchmark task which exhibits knowledgeable open
dialogue with clear grounding. To that end, we collect and release a large
dataset with conversations directly grounded with knowledge retrieved from
Wikipedia. We then design architectures capable of retrieving knowledge,
reading and conditioning on it, and finally generating natural responses. Our
best performing dialogue models are able to conduct knowledgeable discussions
on open-domain topics as evaluated by automatic metrics and human evaluations,
while our new benchmark allows for measuring further improvements in this
important research direction.
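As a rough Python sketch of the retrieve-read-generate loop the abstract describes: retrieve knowledge relevant to the dialogue context, then condition a generator on it. The word-overlap retriever, function names, and input formatting below are illustrative assumptions, not the paper's actual models.

```python
def retrieve_knowledge(context: str, candidates: list, k: int = 1) -> list:
    """Rank candidate knowledge sentences by word overlap with the
    dialogue context (a crude stand-in for a learned retriever)."""
    ctx_words = set(context.lower().split())
    return sorted(
        candidates,
        key=lambda s: len(ctx_words & set(s.lower().split())),
        reverse=True,
    )[:k]

def respond(context: str, candidates: list, generator) -> str:
    """Condition a generator (any callable str -> str) on the retrieved
    knowledge plus the dialogue context (the read-and-condition step)."""
    knowledge = " ".join(retrieve_knowledge(context, candidates))
    return generator(f"knowledge: {knowledge} context: {context}")
```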
TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents
We introduce a new approach to generative data-driven dialogue systems (e.g.
chatbots) called TransferTransfo, which combines a transfer-learning-based training scheme with a high-capacity Transformer model. Fine-tuning is
performed by using a multi-task objective which combines several unsupervised
prediction tasks. The resulting fine-tuned model shows strong improvements over
the current state-of-the-art end-to-end conversational models like memory
augmented seq2seq and information-retrieval models. On the privately held
PERSONA-CHAT dataset of the Conversational Intelligence Challenge 2, this
approach obtains a new state-of-the-art, with respective perplexity, Hits@1 and
F1 metrics of 16.28 (45% absolute improvement), 80.7 (46% absolute improvement), and 19.5 (20% absolute improvement).
Comment: 6 pages, 2 figures, 2 tables; NeurIPS 2018 CAI Workshop.
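A hedged sketch of such a multi-task fine-tuning objective: sum a language-modeling loss with an utterance-classification loss. The task pair and the equal default weighting are my assumptions for illustration, not a statement of the paper's exact losses.

```python
import torch.nn.functional as F

def multitask_loss(lm_logits, lm_targets, cls_logits, cls_targets,
                   lm_weight=1.0, cls_weight=1.0):
    """Combine a language-modeling loss and a classification loss into a
    single fine-tuning objective; the weights are free hyperparameters."""
    lm_loss = F.cross_entropy(lm_logits.reshape(-1, lm_logits.size(-1)),
                              lm_targets.reshape(-1))
    cls_loss = F.cross_entropy(cls_logits, cls_targets)
    return lm_weight * lm_loss + cls_weight * cls_loss
```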
Logic2Text: High-Fidelity Natural Language Generation from Logical Forms
Previous works on Natural Language Generation (NLG) from structured data have
primarily focused on surface-level descriptions of record sequences. However,
for complex structured data, e.g., multi-row tables, it is often desirable for
an NLG system to describe interesting facts from logical inferences across
records. If only provided with the table, it is hard for existing models to
produce controllable and high-fidelity logical generations. In this work, we
formulate logical level NLG as generation from logical forms in order to obtain
controllable, high-fidelity, and faithful generations. We present a new
large-scale dataset, Logic2Text, with 10,753 descriptions involving
common logic types paired with the underlying logical forms. The logical forms exhibit diverse graph structures with a free schema, which poses a great challenge to a model's ability to understand their semantics. We experiment in two settings: (1) fully supervised training with the full dataset, and (2) a few-shot setting provided with only hundreds of paired examples. We compare several popular generation models and analyze their performance. We hope our dataset can
encourage research towards building an advanced NLG system capable of natural,
faithful, and human-like generation. The dataset and code are available at
https://github.com/czyssrs/Logic2Text.
Comment: Findings of EMNLP 2020; 9 pages, 6 figures.
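To make "generation from logical forms" concrete, here is a toy linearization of a nested logical form into a string a sequence-to-sequence generator could consume. The tuple encoding and bracketed output format are my own illustration and need not match the dataset's actual schema.

```python
def linearize(form) -> str:
    """Flatten a nested (function, arg, ...) tuple into a bracketed
    string suitable as input to a text generator."""
    if isinstance(form, tuple):
        head, *args = form
        return head + " { " + " ; ".join(linearize(a) for a in args) + " }"
    return str(form)

# A form one might verbalize as "the largest attendance was 65,000":
print(linearize(("eq", ("max", "all_rows", "attendance"), "65,000")))
# -> eq { max { all_rows ; attendance } ; 65,000 }
```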
Few-Shot NLG with Pre-Trained Language Model
Neural-based end-to-end approaches to natural language generation (NLG) from
structured data or knowledge are data-hungry, making their adoption for
real-world applications difficult with limited data. In this work, we propose
the new task of few-shot natural language generation. Motivated by how
humans tend to summarize tabular data, we propose a simple yet effective
approach and show that it not only demonstrates strong performance but also
provides good generalization across domains. The design of the model
architecture is based on two aspects: content selection from input data and
language modeling to compose coherent sentences, which can be acquired from
prior knowledge. Across multiple domains, with just 200 training examples, we show that our approach achieves very reasonable performance and outperforms the strongest baseline by an average of over 8.0 BLEU points. Our code and data can be found at https://github.com/czyssrs/Few-Shot-NLG.
Comment: ACL 2020.
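The two aspects the abstract names, content selection and language-model composition, can be caricatured in a few lines of Python; the selection heuristic and prompt format below are invented for illustration only.

```python
def select_content(record: dict, max_fields: int = 3) -> list:
    """Keep the first few non-empty attribute/value pairs (a placeholder
    for learned content selection)."""
    return [(k, v) for k, v in record.items() if v][:max_fields]

def to_prompt(fields: list) -> str:
    """Serialize the selected fields for a pre-trained LM to verbalize."""
    return " ; ".join(f"{k} : {v}" for k, v in fields) + " =>"

row = {"name": "Ada Lovelace", "occupation": "mathematician", "born": "1815"}
print(to_prompt(select_content(row)))
# name : Ada Lovelace ; occupation : mathematician ; born : 1815 =>
```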
Towards Coherent and Engaging Spoken Dialog Response Generation Using Automatic Conversation Evaluators
Encoder-decoder based neural architectures serve as the basis of
state-of-the-art approaches in end-to-end open-domain dialog systems. Since most such systems are trained with a maximum likelihood (MLE) objective, they
suffer from issues such as lack of generalizability and the generic response
problem, i.e., a system response that can be an answer to a large number of
user utterances, e.g., "Maybe, I don't know." Having explicit feedback on the
relevance and interestingness of a system response at each turn can be a useful
signal for mitigating such issues and improving system quality by selecting
responses from different approaches. Towards this goal, we present a system
that evaluates chatbot responses at each dialog turn for coherence and
engagement. Our system provides explicit turn-level dialog quality feedback,
which we show to be highly correlated with human evaluation. To show that
incorporating this feedback in the neural response generation models improves
dialog quality, we present two different and complementary mechanisms to
incorporate explicit feedback into a neural response generation model:
reranking and direct modification of the loss function during training. Our
studies show that a response generation model incorporating these combined feedback mechanisms produces more engaging and coherent responses in an open-domain spoken dialog setting, significantly improving response quality under both automatic and human evaluation.
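Of the two mechanisms, reranking is the simpler to sketch: score each candidate response with the evaluator and keep the best. The evaluator here is any callable returning a float; the toy lexical-diversity score merely stands in for a trained coherence/engagement model.

```python
def rerank(context: str, candidates: list, evaluator) -> str:
    """Pick the candidate the evaluator scores highest for this context."""
    return max(candidates, key=lambda response: evaluator(context, response))

# Toy evaluator: reward lexical diversity, which penalizes generic replies.
diversity = lambda ctx, resp: len(set(resp.lower().split()))
print(rerank(
    "What's your favorite book?",
    ["Maybe, I don't know.", "I loved Dune; the world-building is great."],
    diversity,
))
```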
SOLOIST: Building Task Bots at Scale with Transfer Learning and Machine Teaching
We present a new method, SOLOIST, which uses transfer learning and machine teaching to build task bots at scale. We parameterize classical modular
task-oriented dialog systems using a Transformer-based auto-regressive language
model, which subsumes different dialog modules into a single neural model. We
pre-train, on heterogeneous dialog corpora, a task-grounded response generation
model, which can generate dialog responses grounded in user goals and
real-world knowledge for task completion. The pre-trained model can be
efficiently adapted to accomplish new tasks with a handful of task-specific
dialogs via machine teaching, where training samples are generated by human
teachers interacting with the system. Experiments show that (i) SOLOIST creates
new state-of-the-art on well-studied task-oriented dialog benchmarks, including
CamRest676 and MultiWOZ; (ii) in the few-shot fine-tuning setting, SOLOIST significantly outperforms existing methods; and (iii) the use of machine teaching substantially reduces the labeling cost of fine-tuning. The pre-trained models and code are available at https://aka.ms/soloist.
Comment: 18 pages; to appear in TACL; project website: https://aka.ms/soloist.
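A minimal sketch of the single-sequence parameterization described above: dialog history, a belief state, and the response are flattened into one token stream that an auto-regressive LM is trained on end to end. The delimiter strings and belief-state syntax below are placeholders, not SOLOIST's actual format.

```python
def to_training_sequence(history: list, belief_state: str, response: str) -> str:
    """Flatten one annotated dialog turn into a single training string
    for an auto-regressive language model."""
    return (" <turn> ".join(history)
            + " => belief: " + belief_state
            + " => response: " + response + " <eos>")

print(to_training_sequence(
    ["i need a cheap restaurant in the centre"],
    "restaurant { pricerange = cheap ; area = centre }",
    "There are several cheap options in the centre. Any cuisine preference?",
))
```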
The Cascade Transformer: an Application for Efficient Answer Sentence Selection
Large transformer-based language models have been shown to be very effective
in many classification tasks. However, their computational complexity prevents
their use in applications requiring the classification of a large set of
candidates. While previous works have investigated approaches to reduce model
size, relatively little attention has been paid to techniques to improve batch
throughput during inference. In this paper, we introduce the Cascade
Transformer, a simple yet effective technique to adapt transformer-based models
into a cascade of rankers. Each ranker is used to prune a subset of candidates
in a batch, thus dramatically increasing throughput at inference time. Partial
encodings from the transformer model are shared among rerankers, providing
further speed-up. When compared to a state-of-the-art transformer model, our
approach reduces computation by 37% with almost no impact on accuracy, as
measured on two English Question Answering datasets.
Comment: Accepted to ACL 2020 (long paper).
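The pruning idea can be sketched independently of the transformer internals: a sequence of increasingly expensive rankers, each discarding part of the candidate pool before the next one runs. The shared partial encodings that give the real model its additional speed-up are omitted here for brevity.

```python
def cascade_rank(candidates: list, rankers: list, keep: float = 0.5) -> list:
    """Run cheap rankers first; prune the pool after each stage so only
    survivors reach the most expensive (final) ranker."""
    pool = list(candidates)
    for rank in rankers[:-1]:
        pool.sort(key=rank, reverse=True)
        pool = pool[: max(1, int(len(pool) * keep))]  # drop the bottom of the batch
    return sorted(pool, key=rankers[-1], reverse=True)
```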
HIBERT: Document Level Pre-training of Hierarchical Bidirectional Transformers for Document Summarization
Neural extractive summarization models usually employ a hierarchical encoder
for document encoding and they are trained using sentence-level labels, which
are created heuristically using rule-based methods. Training the hierarchical
encoder with these inaccurate labels is challenging. Inspired by the recent work on pre-training transformer sentence encoders (Devlin et al., 2018), we propose HIBERT (HIerarchical Bidirectional Encoder Representations from Transformers) for document encoding, along with a method to pre-train it using unlabeled data. We apply the pre-trained HIBERT to our summarization model, and it outperforms its randomly initialized counterpart by 1.25 ROUGE on the CNN/Daily Mail dataset and by 2.0 ROUGE on a version of the New York Times dataset. We also achieve state-of-the-art performance on these two datasets.
Comment: To appear in ACL 2019.
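A schematic two-level encoder in the spirit of the abstract, in PyTorch: one Transformer encodes the tokens within each sentence, and a second encodes the resulting sentence vectors. The dimensions, depth, and mean pooling are arbitrary illustrative choices, not HIBERT's configuration.

```python
import torch
import torch.nn as nn

class HierarchicalEncoder(nn.Module):
    def __init__(self, d_model: int = 64, nhead: int = 4):
        super().__init__()
        make = lambda: nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.sent_enc = nn.TransformerEncoder(make(), num_layers=2)  # token level
        self.doc_enc = nn.TransformerEncoder(make(), num_layers=2)   # sentence level

    def forward(self, x):
        # x: (docs, sentences, tokens, d_model) token embeddings
        d, s, t, h = x.shape
        sent_vecs = self.sent_enc(x.reshape(d * s, t, h)).mean(dim=1)  # pool tokens
        return self.doc_enc(sent_vecs.reshape(d, s, h))  # contextual sentence reps

print(HierarchicalEncoder()(torch.randn(2, 5, 12, 64)).shape)  # torch.Size([2, 5, 64])
```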
Generate, Delete and Rewrite: A Three-Stage Framework for Improving Persona Consistency of Dialogue Generation
Maintaining a consistent personality in conversations is quite natural for
human beings, but is still a non-trivial task for machines. The persona-based
dialogue generation task is thus introduced to tackle the personality-inconsistency problem by incorporating explicit persona text into
dialogue generation models. Despite the success of existing persona-based
models on generating human-like responses, their one-stage decoding framework
can hardly avoid the generation of inconsistent persona words. In this work, we
introduce a three-stage framework that employs a generate-delete-rewrite
mechanism to delete inconsistent words from a generated response prototype and
further rewrite it to a personality-consistent one. We carry out evaluations by
both human and automatic metrics. Experiments on the Persona-Chat dataset show
that our approach achieves good performance.
Comment: Accepted by ACL 2020.
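The three-stage mechanism lends itself to a pipeline of pluggable callables; the [MASK] token and the word-level consistency check below are my own simplifications of the delete stage, not the paper's models.

```python
def generate_delete_rewrite(context, persona, generator, is_inconsistent, rewriter):
    """Generate a prototype, mask persona-inconsistent words, then
    rewrite the masked prototype into a consistent response."""
    prototype = generator(context, persona)               # stage 1: generate
    masked = " ".join(                                    # stage 2: delete
        "[MASK]" if is_inconsistent(word, persona) else word
        for word in prototype.split()
    )
    return rewriter(masked, persona)                      # stage 3: rewrite
```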
The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents
We introduce dodecaDialogue: a set of 12 tasks that measures whether a conversational agent can communicate engagingly with personality and empathy,
ask questions, answer questions by utilizing knowledge resources, discuss
topics and situations, and perceive and converse about images. By multi-tasking
on such a broad large-scale set of data, we hope to both move towards and
measure progress in producing a single unified agent that can perceive, reason
and converse with humans in an open-domain setting. We show that such
multi-tasking improves over a BERT pre-trained baseline, largely due to
multi-tasking with very large dialogue datasets in a similar domain, and that
the multi-tasking in general provides gains to both text and image-based tasks
using several metrics in both the fine-tune and task transfer settings. We
obtain state-of-the-art results on many of the tasks, providing a strong
baseline for this challenge.
Comment: ACL 2020.
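A bare-bones skeleton of multi-task training over a task collection like this one: sample a task at each step and update on a batch from it. The uniform sampling and iterator interface are assumptions; the paper's actual mixing schedule may differ.

```python
import random

def multitask_train(tasks: dict, train_step, steps: int = 1000, seed: int = 0):
    """tasks maps task names to batch iterators; train_step performs one
    gradient update given (task_name, batch)."""
    rng = random.Random(seed)
    names = list(tasks)
    for _ in range(steps):
        name = rng.choice(names)            # uniform task sampling (assumption)
        train_step(name, next(tasks[name]))
```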