Dynamic Schema Graph Fusion Network for Multi-Domain Dialogue State Tracking
Dialogue State Tracking (DST) aims to keep track of users’ intentions during the course of a conversation. In DST, modelling the relations among domains and slots is still an under-studied problem. Existing approaches that have considered such relations generally fall short in: (1) fusing prior slot-domain membership relations and dialogue-aware dynamic slot relations explicitly, and (2) generalizing to unseen domains. To address these issues, we propose a novel Dynamic Schema Graph Fusion Network (DSGFNet), which generates a dynamic schema graph to explicitly fuse the prior slot-domain membership relations and dialogue-aware dynamic slot relations. It also uses the schemata to facilitate knowledge transfer to new domains. DSGFNet consists of a dialogue utterance encoder, a schema graph encoder, a dialogue-aware schema graph evolving network, and a schema graph enhanced dialogue state decoder. Empirical results on benchmark datasets (i.e., SGD, MultiWOZ2.1, and MultiWOZ2.2) show that DSGFNet outperforms existing methods.
A Unified Framework for Slot based Response Generation in a Multimodal Dialogue System
Natural Language Understanding (NLU) and Natural Language Generation (NLG)
are the two critical components of every conversational system that handles the
task of understanding the user by capturing the necessary information in the
form of slots and generating an appropriate response in accordance with the
extracted information. Recently, dialogue systems integrated with complementary
information such as images, audio, or video have gained immense popularity. In
this work, we propose an end-to-end framework with the capability to extract
necessary slot values from the utterance and generate a coherent response,
thereby assisting the user to achieve their desired goals in a multimodal
dialogue system having both textual and visual information. The task of
extracting the necessary information is dependent not only on the text but also
on the visual cues present in the dialogue. Similarly, for the generation, the
previous dialog context comprising multimodal information is significant for
providing coherent and informative responses. We employ a multimodal
hierarchical encoder using pre-trained DialoGPT and also exploit the knowledge
base (KB) to provide a stronger context for both tasks. We also design
a slot attention mechanism to focus on the necessary information in a given
utterance. Lastly, a decoder generates the corresponding response for the given
dialogue context and the extracted slot values. Experimental results on the
Multimodal Dialogue Dataset (MMD) show that the proposed framework outperforms
the baseline approaches on both tasks. The code is available at
https://github.com/avinashsai/slot-gpt.
Comment: Published in the journal Multimedia Tools and Application
Autoregressive Entity Generation for End-to-End Task-Oriented Dialog
Task-oriented dialog (TOD) systems often require interaction with an external
knowledge base to retrieve necessary entity (e.g., restaurant) information to
support the response generation. Most current end-to-end TOD systems either
retrieve the KB information explicitly or embed it into model parameters for
implicit access. While the former approach demands scanning the KB at each turn
of response generation, which is inefficient when the KB scales up, the latter
approach shows higher flexibility and efficiency. In either approach, the
systems may generate a response with conflicting entity information. To address
this issue, we propose to generate the entity autoregressively first and
leverage it to guide the response generation in an end-to-end system. To ensure
entity consistency, we impose a trie constraint on entity generation. We also
introduce a logit concatenation strategy to facilitate gradient backpropagation
for end-to-end training. Experiments on MultiWOZ 2.1 single and CAMREST show
that our system can generate higher-quality and more entity-consistent responses.
Comment: Accepted to COLING 202
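The trie constraint mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the entity list, the `<eos>` marker, the `toy_score` function, and all helper names are hypothetical stand-ins, and greedy decoding is assumed in place of whatever decoding strategy the paper uses. The idea shown is only that, at each step, the decoder may choose among tokens that continue some known KB entity, so the generated entity always exists in the KB.

```python
from typing import Callable, Dict, List, Sequence

END = "<eos>"  # hypothetical end-of-entity marker


def build_trie(entities: Sequence[Sequence[str]]) -> dict:
    """Build a nested-dict trie over tokenized entity names."""
    root: dict = {}
    for tokens in entities:
        node = root
        for tok in list(tokens) + [END]:
            node = node.setdefault(tok, {})
    return root


def constrained_greedy_decode(
    trie: dict,
    score_fn: Callable[[List[str], List[str]], Dict[str, float]],
) -> List[str]:
    """Greedily pick the best-scoring token among those the trie allows."""
    if not trie:
        raise ValueError("trie is empty: no entities to generate")
    node, prefix = trie, []
    while True:
        allowed = list(node.keys())          # tokens that keep us on a valid entity
        scores = score_fn(prefix, allowed)   # stand-in for the model's logits
        best = max(allowed, key=lambda t: scores[t])
        if best == END:
            return prefix
        prefix.append(best)
        node = node[best]


# Toy stand-in for a language model: fixed, context-independent scores.
TOY_SCORES = {"pizza": 0.9, "golden": 0.2, "wok": 0.8,
              "hut": 0.1, "express": 0.7, END: 0.5}


def toy_score(prefix: List[str], allowed: List[str]) -> Dict[str, float]:
    return {tok: TOY_SCORES[tok] for tok in allowed}


entities = [["pizza", "hut"], ["pizza", "express"], ["golden", "wok"]]
result = constrained_greedy_decode(build_trie(entities), toy_score)
print(result)  # ['pizza', 'express']
```

Without the constraint, the toy scores would favor "wok" after "pizza", producing "pizza wok", which is not an entity in the list. With the trie, only "hut" and "express" are allowed after "pizza", so the output is always a complete, valid entity.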
- …