Better Conversations by Modeling, Filtering, and Optimizing for Coherence and Diversity
We present three enhancements to existing encoder-decoder models for
open-domain conversational agents, aimed at effectively modeling coherence and
promoting output diversity: (1) We introduce a measure of coherence as the
GloVe embedding similarity between the dialogue context and the generated
response, (2) we filter our training corpora based on the measure of coherence
to obtain topically coherent and lexically diverse context-response pairs, (3)
we then train a response generator using a conditional variational autoencoder
model that incorporates the measure of coherence as a latent variable and uses
a context gate to guarantee topical consistency with the context and promote
lexical diversity. Experiments on the OpenSubtitles corpus show a substantial
improvement over competitive neural models in terms of BLEU score as well as
metrics of coherence and diversity.
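A minimal sketch of the coherence measure described above, assuming GloVe vectors in the standard plain-text format and simple embedding averaging (the file name, tokenization, and averaging scheme are illustrative assumptions, not necessarily the authors' exact setup):

    import numpy as np

    def load_glove(path):
        # Each line: a word followed by its space-separated float components.
        vectors = {}
        with open(path, encoding="utf-8") as f:
            for line in f:
                parts = line.rstrip().split(" ")
                vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
        return vectors

    def embed(tokens, vectors):
        # Average the vectors of in-vocabulary tokens; None if none are known.
        known = [vectors[t] for t in tokens if t in vectors]
        return np.mean(known, axis=0) if known else None

    def coherence(context, response, vectors):
        # Cosine similarity between averaged context and response embeddings.
        c, r = embed(context, vectors), embed(response, vectors)
        if c is None or r is None:
            return 0.0
        return float(np.dot(c, r) / (np.linalg.norm(c) * np.linalg.norm(r)))

    # Usage with a locally downloaded GloVe file (path is a placeholder):
    # vectors = load_glove("glove.6B.300d.txt")
    # coherence("do you like jazz ?".split(), "i love coltrane".split(), vectors)

Under a scheme like this, context-response pairs whose score falls outside a chosen coherence band would be dropped from the training corpus, as in step (2) of the abstract.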
Unsupervised Learning of Style-sensitive Word Vectors
This paper presents the first study aimed at capturing stylistic similarity
between words in an unsupervised manner. We propose extending the continuous
bag of words (CBOW) model (Mikolov et al., 2013) to learn style-sensitive word
vectors using a wider context window under the assumption that the style of all
the words in an utterance is consistent. In addition, we introduce a novel task
to predict lexical stylistic similarity and to create a benchmark dataset for
this task. Our experiment with this dataset supports our assumption and
demonstrates that the proposed extensions contribute to the acquisition of
style-sensitive word embeddings.
Comment: 7 pages, accepted at the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018).
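A rough illustration of the wider-window idea using gensim's CBOW implementation (gensim >= 4.0). The toy corpus and window sizes are placeholders, and this omits the paper's actual style-sensitive training objective; it only shows the underlying assumption that a window spanning the whole utterance picks up stylistic rather than purely syntactic regularities:

    from gensim.models import Word2Vec

    # Toy corpus: each inner list is one tokenized utterance.
    utterances = [
        "hey dude what s up".split(),
        "good evening sir how may i help you".split(),
        "yo that movie was sick".split(),
        "the film was thoroughly enjoyable".split(),
    ]

    # Standard CBOW: a narrow window mostly captures syntactic/semantic context.
    semantic = Word2Vec(utterances, vector_size=50, window=2, sg=0, min_count=1)

    # Style-sensitive variant: a window wide enough to cover the whole
    # utterance, exploiting the assumption that style is consistent within it.
    stylistic = Word2Vec(utterances, vector_size=50, window=20, sg=0, min_count=1)

    # Neighbors under the wide-window model should skew toward stylistically
    # similar words rather than strict synonyms.
    print(stylistic.wv.most_similar("dude", topn=3))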
NEXUS Network: Connecting the Preceding and the Following in Dialogue Generation
Sequence-to-Sequence (seq2seq) models have become overwhelmingly popular in
building end-to-end trainable dialogue systems. Though highly efficient in
learning the backbone of human-computer communications, they suffer from the
problem of strongly favoring short generic responses. In this paper, we argue
that a good response should smoothly connect both the preceding dialogue
history and the following conversations. We strengthen this connection through
mutual information maximization. To sidestep the non-differentiability of
discrete natural language tokens, we introduce an auxiliary continuous code
space and map this code space to a learnable prior distribution for generation
purposes. Experiments on two dialogue datasets validate the effectiveness of our
model, where the generated responses are closely related to the dialogue
context and lead to more interactive conversations.
Comment: Accepted by EMNLP 2018.
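A compact sketch of the auxiliary continuous code space with a learnable, context-conditioned prior (PyTorch; the dimensions and single-layer networks are illustrative assumptions, and the paper's full mutual-information objective and decoder are omitted):

    import torch
    import torch.nn as nn

    class LatentCode(nn.Module):
        # Posterior q(z | context, response) is used during training;
        # the learnable prior p(z | context) is used at generation time.
        def __init__(self, ctx_dim, resp_dim, code_dim):
            super().__init__()
            self.posterior = nn.Linear(ctx_dim + resp_dim, 2 * code_dim)
            self.prior = nn.Linear(ctx_dim, 2 * code_dim)

        def forward(self, ctx, resp):
            mu_q, logvar_q = self.posterior(torch.cat([ctx, resp], -1)).chunk(2, -1)
            mu_p, logvar_p = self.prior(ctx).chunk(2, -1)
            # Reparameterization keeps sampling differentiable, sidestepping
            # the non-differentiability of discrete tokens.
            z = mu_q + torch.randn_like(mu_q) * (0.5 * logvar_q).exp()
            # Closed-form KL(q || p) between two diagonal Gaussians.
            kl = 0.5 * (logvar_p - logvar_q
                        + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp()
                        - 1).sum(-1)
            return z, kl

At training time the code z is drawn from the posterior and the KL term pulls it toward the context-conditioned prior; at test time sampling from the prior alone yields the code that conditions the response decoder.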