43 research outputs found
Improving Variational Encoder-Decoders in Dialogue Generation
Variational encoder-decoders (VEDs) have shown promising results in dialogue
generation. However, the latent variable distributions are usually approximated
by a much simpler model than the powerful RNN structure used for encoding and
decoding, yielding the KL-vanishing problem and inconsistent training
objective. In this paper, we separate the training step into two phases: The
first phase learns to autoencode discrete texts into continuous embeddings,
from which the second phase learns to generalize latent representations by
reconstructing the encoded embedding. In this case, latent variables are
sampled by transforming Gaussian noise through multi-layer perceptrons and are
trained with a separate VED model, which has the potential of realizing a much
more flexible distribution. We compare our model with current popular models
and the experiment demonstrates substantial improvement in both metric-based
and human evaluations.Comment: Accepted by AAAI201
NEXUS Network: Connecting the Preceding and the Following in Dialogue Generation
Sequence-to-Sequence (seq2seq) models have become overwhelmingly popular in
building end-to-end trainable dialogue systems. Though highly efficient in
learning the backbone of human-computer communications, they suffer from the
problem of strongly favoring short generic responses. In this paper, we argue
that a good response should smoothly connect both the preceding dialogue
history and the following conversations. We strengthen this connection through
mutual information maximization. To sidestep the non-differentiability of
discrete natural language tokens, we introduce an auxiliary continuous code
space and map such code space to a learnable prior distribution for generation
purpose. Experiments on two dialogue datasets validate the effectiveness of our
model, where the generated responses are closely related to the dialogue
context and lead to more interactive conversations.Comment: Accepted by EMNLP201