Towards an Indexical Model of Situated Language Comprehension for Cognitive Agents in Physical Worlds
We propose a computational model of situated language comprehension based on
the Indexical Hypothesis that generates meaning representations by translating
amodal linguistic symbols to modal representations of beliefs, knowledge, and
experience external to the linguistic system. This Indexical Model incorporates
multiple information sources, including perceptions, domain knowledge, and
short-term and long-term experiences during comprehension. We show that
exploiting diverse information sources can alleviate ambiguities that arise
from contextual use of underspecific referring expressions and unexpressed
argument alternations of verbs. The model is being used to support linguistic
interactions in Rosie, an agent implemented in Soar that learns from
instruction.
Comment: Advances in Cognitive Systems 3 (2014)
The Role of Perception in Situated Spatial Reference
This position paper sets out the argument that an interesting avenue for the study of universals and variation in spatial reference is to address the topic in terms of universals in human perception and attention, and to explore how these perceptual universals shape spatial reference across cultures and languages.
Mind the Gap: Situated Spatial Language a Case-Study in Connecting Perception and Language
This abstract reviews the literature on computational models of spatial semantics and the potential of deep learning models as a useful approach to this challenge.
What is not where: the challenge of integrating spatial representations into deep learning architectures
This paper examines to what degree current deep learning architectures for
image caption generation capture spatial language. On the basis of the
evaluation of examples of generated captions from the literature we argue that
systems capture what objects are in the image data but not where these objects
are located: the captions generated by these systems are the output of a
language model conditioned on the output of an object detector that cannot
capture fine-grained location information. Although language models provide
useful knowledge for image captions, we argue that deep learning image
captioning architectures should also model geometric relations between objects.
Comment: 15 pages, 10 figures. Appears in CLASP Papers in Computational Linguistics Vol 1: Proceedings of the Conference on Logic and Machine Learning in Natural Language (LaML 2017), pp. 41-5
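The architectural critique in this abstract can be illustrated with a minimal sketch (all names and values here are hypothetical, not from the paper): when the caption generator is conditioned only on detected object labels, the bounding-box geometry is discarded before the language model ever sees it, so the caption can name objects but not their spatial configuration.

```python
# Hypothetical sketch of the pipeline the abstract describes: a language
# model conditioned on object-detector output that keeps labels but
# drops location information.

def detect_objects(image):
    # A real detector returns (label, bounding_box) pairs; here we use
    # two toy detections with assumed boxes (x1, y1, x2, y2).
    return [("cat", (10, 10, 50, 50)), ("mat", (0, 60, 100, 90))]

def caption_from_labels(detections):
    # The conditioning step keeps only the labels and discards the
    # boxes, so no downstream model can recover *where* the objects
    # are relative to each other.
    labels = [label for label, _box in detections]
    return "A photo of " + " and ".join(labels) + "."

print(caption_from_labels(detect_objects(None)))
```

The sketch produces "A photo of cat and mat." but has no basis for a geometric claim such as "the cat is on the mat", which is the gap the paper argues captioning architectures should close by modelling inter-object geometry explicitly.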
Investigating the Dimensions of Spatial Language
Spatial prepositions in the English language can be used to denote a vast array of configurations
which greatly diverge from any typical meaning and there is much discussion regarding how their
semantics are shaped and understood. Though there is general agreement that non-geometric aspects
play a significant role in spatial preposition usage, there is a lack of available data providing insight
into how these extra semantic aspects should be modelled. This paper is aimed at facilitating the
acquisition of data that supports theoretical analysis and helps understand the extent to which
different kinds of features play a role in the semantics of spatial prepositions. We first consider key
features of spatial prepositions given in the literature. We then introduce a framework intended
to facilitate the collection of rich data; including geometric, functional and conventional features.
Finally, we describe a preliminary study, concluding with some insights into the difficulties of
modelling spatial prepositions and gathering meaningful data about them.
From Verbs to Tasks: An Integrated Account of Learning Tasks from Situated Interactive Instruction.
Intelligent collaborative agents are becoming common in human society. From virtual assistants such as Siri and Google Now to assistive robots, they contribute to human activities in a variety of ways. As they become more pervasive, the challenge of customizing them to a variety of environments and tasks becomes critical. It is infeasible for engineers to program them for each individual use. Our research aims at building interactive robots and agents that adapt to new environments autonomously by interacting with human users through natural modalities.
This dissertation studies the problem of learning novel tasks from human-agent dialog. We propose a novel approach for interactive task learning, situated interactive instruction (SII), and investigate approaches to three computational challenges that arise in designing SII agents: situated comprehension, mixed-initiative interaction, and interactive task learning. We propose a novel mixed-modality grounded representation for task verbs which encompasses their lexical, semantic, and
task-oriented aspects. This representation is useful in situated comprehension and can be learned through human-agent interactions. We introduce the Indexical Model of comprehension that can exploit
extra-linguistic contexts for resolving semantic ambiguities in situated comprehension of task commands. The Indexical model is integrated with a mixed-initiative interaction model that facilitates
a flexible task-oriented human-agent dialog. This dialog serves as the basis of interactive task learning. We propose an interactive variation of explanation-based learning that can acquire the proposed
representation. We demonstrate that our learning paradigm is efficient, can transfer knowledge between structurally similar tasks, integrates agent-driven exploration with instructional learning, and can acquire several tasks. The methods proposed in this thesis are integrated in Rosie, a generally instructable agent developed in the Soar cognitive architecture and embodied on a table-top robot.
PhD thesis, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies.
http://deepblue.lib.umich.edu/bitstream/2027.42/111573/1/shiwali_1.pd