DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning
Pragmatic reasoning plays a pivotal role in deciphering implicit meanings
that frequently arise in real-life conversations and is essential for the
development of communicative social agents. In this paper, we introduce a novel
challenge, DiPlomat, aiming at benchmarking machines' capabilities on pragmatic
reasoning and situated conversational understanding. Compared with previous
works that treat different figurative expressions (e.g. metaphor, sarcasm) as
individual tasks, DiPlomat provides a cohesive framework towards general
pragmatic understanding. Our dataset is created through Amazon Mechanical Turk
(AMT), resulting in a total of 4,177 multi-turn dialogues. In conjunction with
the dataset, we propose two tasks: Pragmatic Identification and Reasoning (PIR)
and Conversational Question Answering (CQA).
Experimental results with state-of-the-art (SOTA) neural architectures reveal
several significant findings: 1) large language models (LLMs) exhibit poor
performance in this subjective domain; 2) comprehensive comprehension of
context emerges as a critical factor for establishing benign human-machine
interactions; 3) current models are deficient in applying pragmatic reasoning.
As a result, we call for more attention to improving context understanding,
reasoning, and implied-meaning modeling.
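To make the two tasks concrete, here is a hypothetical illustration of what a PIR and a CQA instance might look like; the dialogue, field names, and labels are invented for exposition and are not drawn from the actual DiPlomat dataset.

```python
# Hypothetical PIR example: identify the turn carrying implied meaning,
# then reason about what that implied meaning is.
pir_example = {
    "dialogue": [
        "A: How was the concert?",
        "B: Well, the seats were comfortable.",
    ],
    # Pragmatic Identification: index of the turn with implied meaning.
    "pragmatic_turn": 1,
    # Pragmatic Reasoning: the implicature behind that turn.
    "implied_meaning": "B did not enjoy the concert itself.",
}

# Hypothetical CQA example: a question probing the same implicature
# over the same dialogue context.
cqa_example = {
    "dialogue": pir_example["dialogue"],
    "question": "Did B like the concert?",
    "answer": "Probably not; praising the seats deflects the question.",
}
```

The two tasks share the dialogue context: PIR tests whether a model can locate and explain an implicature, while CQA tests whether it can apply that understanding to answer a question.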
MindDial: Belief Dynamics Tracking with Theory-of-Mind Modeling for Situated Neural Dialogue Generation
Humans talk in free form while negotiating the expressed meanings or common
ground. Despite the impressive conversational abilities of large generative
language models, they do not account for individual differences in contextual
understanding within a shared situated environment. In this work, we propose
MindDial, a novel conversational framework that can generate situated free-form
responses to negotiate common ground. We design an explicit mind module that
can track three-level beliefs -- the speaker's belief, the speaker's prediction
of the listener's belief, and the common belief based on the gap between the
first two. A speaking-act classification head then decides whether to continue
talking, end the turn, or take a task-related action. We augment MutualFriend,
a common-ground alignment dataset whose goal is to find a single mutual friend
through free chat between two agents, with belief dynamics annotations.
Experiments show that our model with mental-state modeling resembles human
responses when aligning common ground while mimicking the natural flow of human
conversation. An ablation study further validates that the third-level common
belief aggregates information from the first- and second-order beliefs and
aligns common ground more efficiently.
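The three-level belief design can be sketched as follows. This is a minimal toy illustration, assuming beliefs are probability-like scores over candidate items; the update rule, threshold values, and function names are illustrative assumptions, not the paper's actual mind module.

```python
def common_belief(speaker_belief: dict, predicted_listener_belief: dict,
                  gap_threshold: float = 0.2) -> dict:
    """Derive the third-level common belief from the gap between the
    speaker's own belief (level 1) and the speaker's prediction of the
    listener's belief (level 2). Items with a small gap are treated as
    established common ground (illustrative rule)."""
    common = {}
    for item, p_speaker in speaker_belief.items():
        p_listener = predicted_listener_belief.get(item, 0.0)
        # A large gap means the item still needs to be negotiated in dialogue.
        if abs(p_speaker - p_listener) < gap_threshold:
            common[item] = min(p_speaker, p_listener)
    return common

def speaking_act(common: dict, goal_item: str) -> str:
    """Toy stand-in for the speaking-act head: keep talking until the
    goal (e.g. the mutual friend) enters the common belief with high score."""
    if common.get(goal_item, 0.0) > 0.5:
        return "take_action"      # e.g. name the mutual friend
    return "continue_talking"     # the gap is unresolved; keep negotiating
```

For example, if the speaker is confident about "alice" and predicts the listener is too, "alice" enters the common belief and the agent can act; a large belief gap on "bob" instead triggers further dialogue.

```python
common = common_belief({"alice": 0.9, "bob": 0.3},
                       {"alice": 0.85, "bob": 0.9})
print(speaking_act(common, "alice"))  # take_action
print(speaking_act(common, "bob"))    # continue_talking
```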
- …