Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders
While recent neural encoder-decoder models have shown great promise in
modeling open-domain conversations, they often generate dull and generic
responses. Unlike past work that has focused on diversifying the output of the
decoder at word-level to alleviate this problem, we present a novel framework
based on conditional variational autoencoders that captures the discourse-level
diversity in the encoder. Our model uses latent variables to learn a
distribution over potential conversational intents and generates diverse
responses using only greedy decoders. We have further developed a novel variant
that is integrated with linguistic prior knowledge for better performance.
Finally, the training procedure is improved by introducing a bag-of-word loss.
Our proposed models have been validated to generate significantly more diverse
responses than baseline approaches and exhibit competence in discourse-level
decision-making.
Comment: Appeared in ACL2017 proceedings as a long paper. This version corrects a
calculation mistake in Table 1 (E-bow and A-bow), which results in higher scores.
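
The bag-of-word loss mentioned in the abstract can be illustrated with a minimal sketch. It assumes the loss asks the model to predict, from a single vector of vocabulary logits, every token of the response regardless of order; the function names and the toy vocabulary size below are hypothetical, and in the actual model the logits would come from a network fed with the latent variable and the dialog context, which is not shown here.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of vocabulary logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]

def bow_loss(logits, response_token_ids):
    """Negative log-likelihood of the response treated as an unordered bag.

    `logits` is one score per vocabulary entry (assumed to be produced
    from the latent variable and context); each response token contributes
    its log-probability under the same distribution, ignoring position.
    """
    log_probs = log_softmax(logits)
    return -sum(log_probs[t] for t in response_token_ids)

# Toy usage: a 4-word vocabulary with uniform logits and a 3-token response.
uniform_loss = bow_loss([0.0, 0.0, 0.0, 0.0], [0, 1, 1])
```

Because the same distribution is reused for every token, this auxiliary term can only be reduced by making the logits informative about the response content, which is one way to read the abstract's claim that it improves training.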
An Affect-Rich Neural Conversational Model with Biased Attention and Weighted Cross-Entropy Loss
Affect conveys important implicit information in human communication. Having
the capability to correctly express affect during human-machine conversations
is one of the major milestones in artificial intelligence. In recent years,
extensive research on open-domain neural conversational models has been
conducted. However, embedding affect into such models is still underexplored.
In this paper, we propose an end-to-end affect-rich open-domain neural
conversational model that produces responses not only appropriate in syntax and
semantics, but also with rich affect. Our model extends the Seq2Seq model and
adopts VAD (Valence, Arousal and Dominance) affective notations to embed each
word with affects. In addition, our model considers the effect of negators and
intensifiers via a novel affective attention mechanism, which biases attention
towards affect-rich words in input sentences. Lastly, we train our model with
an affect-incorporated objective function to encourage the generation of
affect-rich words in the output responses. Evaluations based on both perplexity
and human evaluations show that our model outperforms the state-of-the-art
baseline model of comparable size in producing natural and affect-rich
responses.
Comment: AAAI-1
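
As a rough illustration of a weighted cross-entropy objective of the kind named in the title, the sketch below scales each target token's loss by how far its VAD vector lies from a neutral point. The weighting form, the neutral point `NEUTRAL`, the coefficient `lam`, and the example VAD values are all assumptions made for illustration; the abstract does not specify them, so this should not be read as the paper's exact formulation.

```python
import math

# Hypothetical neutral VAD point (valence, arousal, dominance); an assumption.
NEUTRAL = (5.0, 5.0, 5.0)

def affect_weight(vad, lam=0.5):
    """Weight that grows with the distance of a word's VAD vector from NEUTRAL."""
    return 1.0 + lam * math.dist(vad, NEUTRAL)

def weighted_cross_entropy(token_probs, token_vads, lam=0.5):
    """Cross-entropy in which affect-rich target tokens count for more.

    `token_probs` holds the model probability assigned to each target token,
    `token_vads` the (assumed) VAD vector of that token.
    """
    return -sum(
        affect_weight(vad, lam) * math.log(p)
        for p, vad in zip(token_probs, token_vads)
    )

# Toy usage: the same probability is penalised more for an affect-rich word.
plain = weighted_cross_entropy([0.5], [NEUTRAL])
affective = weighted_cross_entropy([0.5], [(8.0, 6.0, 7.0)])
```

With `lam` set to 0 the expression reduces to ordinary cross-entropy, so the coefficient controls how strongly training is pushed towards affect-rich words.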