    Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders

    While recent neural encoder-decoder models have shown great promise in modeling open-domain conversations, they often generate dull and generic responses. Unlike past work that has focused on diversifying the output of the decoder at the word level to alleviate this problem, we present a novel framework based on conditional variational autoencoders that captures the discourse-level diversity in the encoder. Our model uses latent variables to learn a distribution over potential conversational intents and generates diverse responses using only greedy decoders. We have further developed a novel variant that is integrated with linguistic prior knowledge for better performance. Finally, the training procedure is improved by introducing a bag-of-word loss. Our proposed models have been validated to generate significantly more diverse responses than baseline approaches and exhibit competence in discourse-level decision-making. Comment: Appeared in the ACL 2017 proceedings as a long paper. Corrects a calculation mistake in Table 1 (E-bow and A-bow), which results in higher scores.
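The abstract names two training ingredients, a latent-variable (conditional VAE) objective and a bag-of-word loss, without giving formulas. As a rough illustration only, the sketch below assumes diagonal Gaussian recognition and prior distributions and a single softmax over the vocabulary for the bag-of-word term. The function names, the `kl_weight` parameter, and the way the terms are summed are hypothetical choices made for this example, not taken from the paper.

```python
import math


def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between two diagonal Gaussians, summed over dimensions.

    Assumption: q plays the role of a recognition distribution and p of a
    context-conditioned prior; both are parameterized by mean and log-variance.
    """
    kl = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        kl += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return kl


def bow_loss(logits, target_ids):
    """Negative log-likelihood of the response words under ONE softmax.

    Assumption: `logits` is a single vocabulary-sized score vector predicted
    from the latent variable, so word order in the response is ignored.
    """
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(logits[i] - log_z for i in target_ids)


def cvae_loss(recon_nll, kl, bow, kl_weight=1.0):
    """Hypothetical combined objective: reconstruction + weighted KL + BOW."""
    return recon_nll + kl_weight * kl + bow
```

The intuition behind the bag-of-word term, under these assumptions, is that it forces the latent variable to carry information about which words appear in the response, independently of the decoder.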

    On the lower tail variational problem for random graphs

    We study the lower tail large deviation problem for subgraph counts in a random graph. Let $X_H$ denote the number of copies of $H$ in an Erdős-Rényi random graph $\mathcal{G}(n,p)$. We are interested in estimating the lower tail probability $\mathbb{P}(X_H \le (1-\delta) \mathbb{E} X_H)$ for fixed $0 < \delta < 1$. Thanks to the results of Chatterjee, Dembo, and Varadhan, this large deviation problem has been reduced to a natural variational problem over graphons, at least for $p \ge n^{-\alpha_H}$ (and conjecturally for a larger range of $p$). We study this variational problem and provide a partial characterization of the so-called "replica symmetric" phase. Informally, our main result says that for every $H$, and $0 < \delta < \delta_H$ for some $\delta_H > 0$, as $p \to 0$ slowly, the main contribution to the lower tail probability comes from Erdős-Rényi random graphs with a uniformly tilted edge density. On the other hand, this is false for non-bipartite $H$ and $\delta$ close to 1. Comment: 15 pages, 5 figures, 1 tabl
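The abstract refers to a variational problem over graphons without writing it out. As a hedged sketch of the usual form such a problem takes, the block below assumes the standard relative-entropy rate function and the homomorphism density $t(H,W)$. The symbols $I_p$, $e(H)$, and $q$ are notation introduced here for illustration, and normalization constants are omitted.

```latex
% Relative entropy of a Bernoulli(x) edge against a Bernoulli(p) edge (assumed rate function)
\[
  I_p(x) = x \log\frac{x}{p} + (1-x)\log\frac{1-x}{1-p}, \qquad x \in [0,1].
\]
% Lower tail variational problem over graphons W : [0,1]^2 \to [0,1]
\[
  \inf\Bigl\{ \int_{[0,1]^2} I_p\bigl(W(x,y)\bigr)\,dx\,dy \;:\;
  t(H,W) \le (1-\delta)\,p^{e(H)} \Bigr\},
\]
% where e(H) is the number of edges of H.
% "Replica symmetric" then means the infimum is attained by a constant graphon
\[
  W \equiv q, \qquad q = (1-\delta)^{1/e(H)}\,p ,
\]
% i.e. an Erd\H{o}s-R\'enyi graph with uniformly lowered edge density q < p.
```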