Cognitive Principles in Robust Multimodal Interpretation
Multimodal conversational interfaces provide a natural means for users to
communicate with computer systems through multiple modalities such as speech
and gesture. To build effective multimodal interfaces, automated interpretation
of user multimodal inputs is important. Inspired by previous investigations
of cognitive status in multimodal human-machine interaction, we have developed
a greedy algorithm for interpreting user referring expressions (i.e.,
multimodal reference resolution). This algorithm incorporates the cognitive
principles of Conversational Implicature and Givenness Hierarchy and applies
constraints from various sources (e.g., temporal, semantic, and contextual) to
resolve references. Our empirical results have shown the advantage of this
algorithm in efficiently resolving a variety of user references. Because of its
simplicity and generality, this approach has the potential to improve the
robustness of multimodal input interpretation.
Givenness Hierarchy Theoretic Cognitive Status Filtering
For language-capable interactive robots to be effectively introduced into
human society, they must be able to naturally and efficiently communicate about
the objects, locations, and people found in human environments. An important
aspect of natural language communication is the use of pronouns. According to
the linguistic theory of the Givenness Hierarchy (GH), humans use pronouns due
to implicit assumptions about the cognitive statuses their referents have in
the minds of their conversational partners. In previous work, Williams et al.
presented the first computational implementation of the full GH for the purpose
of robot language understanding, leveraging a set of rules informed by the GH
literature. However, that approach was designed specifically for language
understanding, oriented around GH-inspired memory structures used to assess what
entities are candidate referents given a particular cognitive status. In
contrast, language generation requires a model in which cognitive status can be
assessed for a given entity. We present and compare two such models of
cognitive status: a rule-based Finite State Machine model directly informed by
the GH literature and a Cognitive Status Filter designed to more flexibly
handle uncertainty. The models are demonstrated and evaluated using a
silver-standard English subset of the OFAI Multimodal Task Description Corpus.
Comment: To be published in the proceedings of the 2020 Annual Meeting of the
Cognitive Science Society (COGSCI). Supplemental materials available at
https://osf.io/qse7y
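The rule-based Finite State Machine model described above can be sketched as a small transition table: dialogue events move an entity between cognitive statuses, so that status can be read off for any given entity at generation time. The states, events, and transition rules below are illustrative assumptions loosely following the GH literature, not the models from the paper.

```python
# Hypothetical sketch of an FSM tracking one entity's cognitive status.
# A "mention" event raises the entity's status; a "turn_passes" event
# lets it decay toward less accessible statuses.

TRANSITIONS = {
    # (current_status, event) -> next_status
    ("unknown",   "mention"):     "activated",
    ("activated", "mention"):     "in_focus",
    ("in_focus",  "mention"):     "in_focus",
    ("familiar",  "mention"):     "in_focus",
    ("in_focus",  "turn_passes"): "activated",
    ("activated", "turn_passes"): "familiar",
    ("familiar",  "turn_passes"): "familiar",
}

class CognitiveStatusFSM:
    """Tracks the cognitive status of a single entity."""

    def __init__(self) -> None:
        self.status = "unknown"

    def update(self, event: str) -> str:
        # Unlisted (state, event) pairs leave the status unchanged.
        self.status = TRANSITIONS.get((self.status, event), self.status)
        return self.status
```

Because the FSM assigns a status to a given entity directly, a generator can query it to decide whether a pronoun, a definite description, or a full noun phrase is licensed; handling uncertainty over statuses, as the Cognitive Status Filter does, would instead require a distribution over states rather than a single one.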