A Natural Language Corpus of Common Grounding under Continuous and Partially-Observable Context
Common grounding is the process of creating, repairing and updating mutual
understandings, which is a critical aspect of sophisticated human
communication. However, traditional dialogue systems have limited capability of
establishing common ground, and we also lack task formulations which introduce
natural difficulty in terms of common grounding while enabling easy evaluation
and analysis of complex models. In this paper, we propose a minimal dialogue
task which requires advanced skills of common grounding under continuous and
partially-observable context. Based on this task formulation, we collected a
large-scale dataset of 6,760 dialogues which fulfills essential requirements of
natural language corpora. Our analysis of the dataset revealed important
phenomena related to common grounding that need to be considered. Finally, we
evaluate and analyze baseline neural models on a simple subtask that requires
recognition of the created common ground. We show that simple baseline models
perform decently but leave room for further improvement. Overall, we show that
our proposed task will be a fundamental testbed where we can train, evaluate,
and analyze dialogue systems' ability for sophisticated common grounding.
Comment: AAAI 201
Towards an Indexical Model of Situated Language Comprehension for Cognitive Agents in Physical Worlds
We propose a computational model of situated language comprehension based on
the Indexical Hypothesis that generates meaning representations by translating
amodal linguistic symbols to modal representations of beliefs, knowledge, and
experience external to the linguistic system. This Indexical Model incorporates
multiple information sources, including perceptions, domain knowledge, and
short-term and long-term experiences during comprehension. We show that
exploiting diverse information sources can alleviate ambiguities that arise
from contextual use of underspecific referring expressions and unexpressed
argument alternations of verbs. The model is being used to support linguistic
interactions in Rosie, an agent implemented in Soar that learns from
instruction.
Comment: Advances in Cognitive Systems 3 (2014)
Do (and say) as I say: Linguistic adaptation in human-computer dialogs
© Theodora Koulouri, Stanislao Lauria, and Robert D. Macredie. This article has been made available through the Brunel Open Access Publishing Fund.
There is strong research evidence showing that people naturally align to each other's vocabulary, sentence structure, and acoustic features in dialog, yet little is known about how the alignment mechanism operates in the interaction between users and computer systems let alone how it may be exploited to improve the efficiency of the interaction. This article provides an account of lexical alignment in human-computer dialogs, based on empirical data collected in a simulated human-computer interaction scenario. The results indicate that alignment is present, resulting in the gradual reduction and stabilization of the vocabulary-in-use, and that it is also reciprocal. Further, the results suggest that when system and user errors occur, the development of alignment is temporarily disrupted and users tend to introduce novel words to the dialog. The results also indicate that alignment in human-computer interaction may have a strong strategic component and is used as a resource to compensate for less optimal (visually impoverished) interaction conditions. Moreover, lower alignment is associated with less successful interaction, as measured by user perceptions. The article distills the results of the study into design recommendations for human-computer dialog systems and uses them to outline a model of dialog management that supports and exploits alignment through mechanisms for in-use adaptation of the system's grammar and lexicon.
Resolving References in Visually-Grounded Dialogue via Text Generation
Vision-language models (VLMs) have been shown to be effective at image retrieval
based on simple text queries, but text-image retrieval based on conversational
input remains a challenge. Consequently, if we want to use VLMs for reference
resolution in visually-grounded dialogue, the discourse processing capabilities
of these models need to be augmented. To address this issue, we propose
fine-tuning a causal large language model (LLM) to generate definite
descriptions that summarize coreferential information found in the linguistic
context of references. We then use a pretrained VLM to identify referents based
on the generated descriptions, zero-shot. We evaluate our approach on a
manually annotated dataset of visually-grounded dialogues and achieve results
that, on average, exceed the performance of the baselines we compare against.
Furthermore, we find that using referent descriptions based on larger context
windows has the potential to yield higher returns.
Comment: Published at SIGDIAL 202
Annotation of negotiation processes in joint-action dialogues
Situated dialogic corpora are invaluable resources for understanding the complex relationship between language, perception, and action as they are based on naturalistic dialogue situations in which the interactants are given shared goals to be accomplished in the real world. In such situations, verbal interactions are intertwined with actions, and shared goals can only be achieved via dynamic negotiation processes based on common ground constructed from discourse history as well as the interactants' knowledge about the status of actions. In this paper, we propose four major dimensions of collaborative tasks that affect the negotiation processes among interactants, and, hence, the structure of the dialogue. Based on a review of available dialogue corpora and annotation manuals, we show that existing annotation schemes so far do not adequately account for the complex dialogue processes in situated task-based scenarios. We illustrate the effects of specific features of a scenario using annotated samples of dialogue taken from the literature as well as our own corpora, and end with a brief discussion of the challenges ahead.
An Annotated Corpus of Reference Resolution for Interpreting Common Grounding
Common grounding is the process of creating, repairing and updating mutual
understandings, which is a fundamental aspect of natural language conversation.
However, interpreting the process of common grounding is a challenging task,
especially under continuous and partially-observable context where complex
ambiguity, uncertainty, partial understandings and misunderstandings are
introduced. Interpretation becomes even more challenging when we deal with
dialogue systems which still have limited capability of natural language
understanding and generation. To address this problem, we consider reference
resolution as the central subtask of common grounding and propose a new
resource to study its intermediate process. Based on a simple and general
annotation schema, we collected a total of 40,172 referring expressions in
5,191 dialogues curated from an existing corpus, along with multiple judgements
of referent interpretations. We show that our annotation is highly reliable,
captures the complexity of common grounding through a natural degree of
reasonable disagreements, and allows for more detailed and quantitative
analyses of common grounding strategies. Finally, we demonstrate the advantages
of our annotation for interpreting, analyzing and improving common grounding in
baseline dialogue systems.
Comment: 9 pages, 7 figures, 6 tables, Accepted by AAAI 202