Learning Reasoning Paths over Semantic Graphs for Video-grounded Dialogues
Compared to traditional visual question answering, video-grounded dialogues
require additional reasoning over dialogue context to answer questions in a
multi-turn setting. Previous approaches to video-grounded dialogues mostly use
dialogue context as a simple text input without modelling the inherent
information flows at the turn level. In this paper, we propose a novel
framework of Reasoning Paths in Dialogue Context (PDC). The PDC model discovers
information flows among dialogue turns through a semantic graph constructed
based on lexical components in each question and answer. The PDC model then learns
to predict reasoning paths over this semantic graph. Our path prediction model
predicts a path from the current turn through past dialogue turns that contain
additional visual cues to answer the current question. Our reasoning model
sequentially processes both visual and textual information through this
reasoning path and the propagated features are used to generate the answer. Our
experimental results demonstrate the effectiveness of our method and provide
additional insights on how models use semantic dependencies in a dialogue
context to retrieve visual cues.
Comment: Accepted at ICLR (International Conference on Learning Representations) 202
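
The graph-and-path idea in the abstract can be illustrated with a small sketch. This is not the PDC model: the paper learns to predict paths, whereas the code below swaps in a hand-written greedy heuristic purely for illustration. The stopword list, the function names (`lexical_components`, `build_semantic_graph`, `greedy_path`), the word-overlap edge rule, and the toy dialogue are all assumptions that do not come from the paper.

```python
import re

# Hypothetical stopword list, chosen only for this toy example.
STOPWORDS = {"a", "the", "is", "are", "there", "in", "what", "he", "she",
             "it", "does", "do", "from", "yes", "no", "like"}

def lexical_components(text):
    """Return the set of lowercase content words in a piece of text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}

def build_semantic_graph(turns):
    """Link each turn to every earlier turn that shares a content word.

    turns: list of (question, answer) pairs, oldest first.
    Returns (graph, comps), where graph maps a turn index to the list of
    earlier turn indices it is connected to.
    """
    comps = [lexical_components(q + " " + a) for q, a in turns]
    graph = {i: [j for j in range(i) if comps[i] & comps[j]]
             for i in range(len(turns))}
    return graph, comps

def greedy_path(graph, comps, start):
    """Walk backwards from `start`, always moving to the connected earlier
    turn with the largest word overlap (ties go to the more recent turn).
    This heuristic stands in for the learned path predictor."""
    path = [start]
    current = start
    while graph[current]:
        current = max(graph[current],
                      key=lambda j: (len(comps[current] & comps[j]), j))
        path.append(current)
    return path

# Toy dialogue; the last turn is the current question, with no answer yet.
turns = [
    ("is there a man in the video", "yes there is a man"),
    ("what is the weather like", "it looks sunny"),
    ("what is he holding", "the man is holding a cup"),
    ("does he drink from the cup", ""),
]
graph, comps = build_semantic_graph(turns)
path = greedy_path(graph, comps, start=3)
print(path)  # [3, 2, 0]
```

In this toy case the current question reaches the "cup" turn and then the "man" turn, skipping the unrelated weather turn. That mirrors, very loosely, the abstract's idea of following semantic dependencies back to the turns that carry relevant visual cues.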