
    X-ReCoSa: Multi-Scale Context Aggregation For Multi-Turn Dialogue Generation

    In multi-turn dialogue generation, responses depend not only on the topic and background of the context but also on the words and phrases in its individual sentences. However, widely used hierarchical dialogue models rely solely on the context representations from the utterance-level encoder and ignore the sentence representations produced by the word-level encoder, which inevitably loses information during decoding and generation. In this paper, we propose X-ReCoSa, a new dialogue model that addresses this problem by aggregating multi-scale context information for hierarchical dialogue models. Specifically, we divide the generation decoder into upper and lower parts, namely the intention part and the generation part. The intention part first takes the context representations as input to generate the intention of the response; the generation part then generates words conditioned on the sentence representations. In this way, hierarchical information is fused into response generation. We conduct experiments on the English dataset DailyDialog. Experimental results show that our method outperforms baseline models on both automatic metric-based and human evaluations.
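    The split decoder described above can be pictured as two stacks of cross-attention layers fed by different encoder outputs: one attending to utterance-level context representations, the other to word-level sentence representations. The sketch below is a rough PyTorch illustration under that reading; the class name, layer counts, and dimensions are assumptions, not the authors' implementation.

```python
# Illustrative sketch only: a two-part decoder in the spirit of the abstract
# above. Names, layer counts, and dimensions are assumptions.
import torch
import torch.nn as nn

class SplitHierarchicalDecoder(nn.Module):
    """An 'intention' stack attends to utterance-level context representations,
    then a 'generation' stack attends to word-level sentence representations."""

    def __init__(self, d_model=512, n_heads=8, n_intention=3, n_generation=3):
        super().__init__()
        make_layer = lambda: nn.TransformerDecoderLayer(
            d_model, n_heads, batch_first=True)
        self.intention_layers = nn.ModuleList(
            [make_layer() for _ in range(n_intention)])
        self.generation_layers = nn.ModuleList(
            [make_layer() for _ in range(n_generation)])

    def forward(self, tgt, context_reprs, sentence_reprs, tgt_mask=None):
        # tgt:            (batch, tgt_len, d_model)  embedded partial response
        # context_reprs:  (batch, n_utts, d_model)   utterance-level encoder output
        # sentence_reprs: (batch, src_len, d_model)  word-level encoder output
        x = tgt
        for layer in self.intention_layers:    # decide *what* to respond
            x = layer(x, memory=context_reprs, tgt_mask=tgt_mask)
        for layer in self.generation_layers:   # decide *how* to word it
            x = layer(x, memory=sentence_reprs, tgt_mask=tgt_mask)
        return x  # a linear projection to the vocabulary would follow
```

    In use, the word-level encoder would supply sentence_reprs and the utterance-level encoder context_reprs, so both scales of context reach the response generator rather than only the utterance-level summary.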

    NEXUS Network: Connecting the Preceding and the Following in Dialogue Generation

    Sequence-to-sequence (seq2seq) models have become overwhelmingly popular for building end-to-end trainable dialogue systems. Though highly efficient at learning the backbone of human-computer communication, they suffer from a strong bias toward short, generic responses. In this paper, we argue that a good response should smoothly connect both the preceding dialogue history and the following conversation, and we strengthen this connection through mutual information maximization. To sidestep the non-differentiability of discrete natural language tokens, we introduce an auxiliary continuous code space and map it to a learnable prior distribution for generation. Experiments on two dialogue datasets validate the effectiveness of our model: the generated responses are closely related to the dialogue context and lead to more interactive conversations. Comment: Accepted by EMNLP 2018.
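    The continuous code space with a learnable prior can be sketched as a CVAE-style latent variable conditioned on the dialogue history, trained with the reparameterization trick so that gradients never pass through discrete tokens. The snippet below is a generic illustration of that idea, not the paper's exact objective (which additionally ties the code to both the preceding and the following turns via mutual information); all names and dimensions are assumed.

```python
# Illustrative sketch only: a continuous latent code with a learnable prior
# conditioned on the dialogue history. Generic CVAE-style formulation, not the
# paper's exact loss; names and dimensions are assumptions.
import torch
import torch.nn as nn

class LatentCodeSpace(nn.Module):
    def __init__(self, d_hist=512, d_future=512, d_z=64):
        super().__init__()
        # Learnable prior p(z | history): predicts mean and log-variance.
        self.prior_net = nn.Linear(d_hist, 2 * d_z)
        # Approximate posterior q(z | history, future) used during training.
        self.posterior_net = nn.Linear(d_hist + d_future, 2 * d_z)

    def forward(self, h_hist, h_future=None):
        p_mu, p_logvar = self.prior_net(h_hist).chunk(2, dim=-1)
        if h_future is None:
            # Inference: sample the code directly from the learned prior.
            z = p_mu + torch.randn_like(p_mu) * torch.exp(0.5 * p_logvar)
            return z, torch.zeros(h_hist.size(0), device=h_hist.device)
        q_in = torch.cat([h_hist, h_future], dim=-1)
        q_mu, q_logvar = self.posterior_net(q_in).chunk(2, dim=-1)
        # Reparameterization: gradients flow through the continuous code z,
        # never through discrete token samples.
        z = q_mu + torch.randn_like(q_mu) * torch.exp(0.5 * q_logvar)
        # KL(q || p) pulls the learnable prior toward codes that also explain
        # the following turn, strengthening the history-future connection.
        kl = 0.5 * (p_logvar - q_logvar
                    + (q_logvar.exp() + (q_mu - p_mu) ** 2) / p_logvar.exp()
                    - 1.0).sum(dim=-1)
        return z, kl
```

    At training time the code z (sampled from the posterior) would condition the response decoder, with the KL term added to the reconstruction loss; at test time only the history-conditioned prior is needed.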