NEXUS Network: Connecting the Preceding and the Following in Dialogue Generation

Authors: Klakow Dietrich, Li Wenjie, Shen Xiaoyu, Su Hui
Publication date: 01/01/2018

Sequence-to-Sequence (seq2seq) models have become overwhelmingly popular in building end-to-end trainable dialogue systems. Though highly efficient in learning the backbone of human-computer communications, they suffer from the problem of strongly favoring short generic responses. In this paper, we argue that a good response should smoothly connect both the preceding dialogue history and the following conversations. We strengthen this connection through mutual information maximization. To sidestep the non-differentiability of discrete natural language tokens, we introduce an auxiliary continuous code space and map this code space to a learnable prior distribution for generation purposes. Experiments on two dialogue datasets validate the effectiveness of our model, where the generated responses are closely related to the dialogue context and lead to more interactive conversations.

Comment: Accepted by EMNLP201

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders

Authors: Eskenazi Maxine, Zhao Ran, Zhao Tiancheng
Publication date: 01/01/2017

While recent neural encoder-decoder models have shown great promise in modeling open-domain conversations, they often generate dull and generic responses. Unlike past work that has focused on diversifying the output of the decoder at word-level to alleviate this problem, we present a novel framework based on conditional variational autoencoders that captures the discourse-level diversity in the encoder. Our model uses latent variables to learn a distribution over potential conversational intents and generates diverse responses using only greedy decoders. We have further developed a novel variant that is integrated with linguistic prior knowledge for better performance. Finally, the training procedure is improved by introducing a bag-of-word loss. Our proposed models have been validated to generate significantly more diverse responses than baseline approaches and exhibit competence in discourse-level decision-making.

Comment: Appeared in the ACL 2017 proceedings as a long paper. Corrects a calculation mistake in Table 1 (E-bow and A-bow), which results in higher scores.
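
Illustrative note (not taken from the paper): the abstract above names two training ingredients, a latent variable with a learned distribution and a bag-of-word loss. The sketch below shows one common way such terms are computed, assuming diagonal Gaussian distributions for the latent variable and a single vocabulary-sized logit vector for the bag-of-word prediction. The function names (gaussian_kl, bag_of_words_nll) and these modeling choices are assumptions made for illustration; the authors' actual formulation may differ.

```python
import math


def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between two diagonal Gaussians.

    Each argument is a list of floats of equal length; the variances are
    passed as log-variances, as is common in VAE implementations.
    """
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (
            lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0
        )
    return total


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def bag_of_words_nll(vocab_logits, response_token_ids):
    """Negative log-likelihood of the response words, ignoring their order.

    vocab_logits: one score per vocabulary entry (assumed here to be
    predicted once per response, not once per time step).
    response_token_ids: vocabulary indices of the words in the response.
    """
    log_probs = log_softmax(vocab_logits)
    return -sum(log_probs[t] for t in response_token_ids)


if __name__ == "__main__":
    # Toy numbers only; they do not come from any experiment.
    kl = gaussian_kl([1.0], [0.0], [0.0], [0.0])
    bow = bag_of_words_nll([0.0, 0.0, 0.0, 0.0], [1, 2])
    print(kl, bow)
```

In this sketch the order-free bag-of-word term would be added to the usual reconstruction and KL terms during training; how the terms are weighted is left open here because the abstract does not specify it.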

arXiv.org e-Print Archive

Crossref