7 research outputs found
A survey of qualitative spatial representations
Representation and reasoning with qualitative spatial relations is an important problem in artificial intelligence and has wide applications in geographic information systems, computer vision, autonomous robot navigation, natural language understanding, spatial databases and so on. The reasons for this interest in using qualitative spatial relations include cognitive comprehensibility, efficiency and computational facility. This paper summarizes progress in qualitative spatial representation by describing key calculi representing different types of spatial relationships. The paper concludes with a discussion of current research and a glimpse of future work.
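As an illustration of what a qualitative spatial calculus looks like in practice, the sketch below classifies the topological relation between two axis-aligned rectangles using the eight base relations of RCC-8. The choice of RCC-8 is an assumption made for the example, since the abstract does not name specific calculi; the function name `rcc8` and the `(x1, y1, x2, y2)` box format are likewise hypothetical.

```python
from typing import Tuple

# Hypothetical box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def rcc8(a: Box, b: Box) -> str:
    """Return the RCC-8 base relation between closed rectangles a and b."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b

    if a == b:
        return "EQ"  # equal

    # Do the closed regions share at least one point?
    closures_meet = ax1 <= bx2 and bx1 <= ax2 and ay1 <= by2 and by1 <= ay2
    if not closures_meet:
        return "DC"  # disconnected

    # Do the open interiors share at least one point?
    interiors_meet = ax1 < bx2 and bx1 < ax2 and ay1 < by2 and by1 < ay2
    if not interiors_meet:
        return "EC"  # externally connected (touching only at the boundary)

    a_in_b = bx1 <= ax1 and ax2 <= bx2 and by1 <= ay1 and ay2 <= by2
    b_in_a = ax1 <= bx1 and bx2 <= ax2 and ay1 <= by1 and by2 <= ay2

    if a_in_b:
        # Non-tangential only if a touches none of b's edges.
        strict = bx1 < ax1 and ax2 < bx2 and by1 < ay1 and ay2 < by2
        return "NTPP" if strict else "TPP"
    if b_in_a:
        strict = ax1 < bx1 and bx2 < ax2 and ay1 < by1 and by2 < ay2
        return "NTPPi" if strict else "TPPi"

    return "PO"  # partially overlapping
```

The point of such a calculus is that an infinite space of coordinate configurations collapses into a small set of jointly exhaustive, pairwise disjoint symbols, which is what makes symbolic reasoning over spatial scenes tractable.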
Object-agnostic Affordance Categorization via Unsupervised Learning of Graph Embeddings
Acquiring knowledge about object interactions and affordances can facilitate scene understanding and human-robot collaboration tasks. As humans tend to use objects in many different ways depending on the scene and the objects’ availability, learning object affordances in everyday-life scenarios is a challenging task, particularly in the presence of an open set of interactions and objects. We address the problem of affordance categorization for class-agnostic objects with an open set of interactions; we achieve this by learning similarities between object interactions in an unsupervised way and thus inducing clusters of object affordances. A novel depth-informed qualitative spatial representation is proposed for the construction of Activity Graphs (AGs), which abstract from the continuous representation of spatio-temporal interactions in RGB-D videos. These AGs are clustered to obtain groups of objects with similar affordances. Our experiments in a real-world scenario demonstrate that our method learns to create object affordance clusters with a high V-measure even in cluttered scenes. The proposed approach handles object occlusions by effectively capturing possible interactions, without imposing any object or scene constraints.
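For readers unfamiliar with the V-measure used to score the clusters, the following is a minimal sketch of the standard definition: the harmonic mean of homogeneity and completeness, both computed from conditional entropies. It is not the authors' evaluation code; the function name `v_measure` and the label lists are hypothetical.

```python
from collections import Counter
from math import log
from typing import Hashable, Sequence


def v_measure(labels_true: Sequence[Hashable], labels_pred: Sequence[Hashable]) -> float:
    """Harmonic mean of homogeneity and completeness (V-measure with beta = 1)."""
    n = len(labels_true)
    class_counts = Counter(labels_true)             # items per ground-truth class
    cluster_counts = Counter(labels_pred)           # items per predicted cluster
    joint = Counter(zip(labels_true, labels_pred))  # items per (class, cluster) pair

    def entropy(counts: Counter) -> float:
        return -sum(c / n * log(c / n) for c in counts.values())

    h_class = entropy(class_counts)
    h_cluster = entropy(cluster_counts)

    # Conditional entropies H(class | cluster) and H(cluster | class).
    h_class_given_cluster = -sum(
        nck / n * log(nck / cluster_counts[k]) for (_, k), nck in joint.items()
    )
    h_cluster_given_class = -sum(
        nck / n * log(nck / class_counts[c]) for (c, _), nck in joint.items()
    )

    # By convention, a zero-entropy labelling is treated as perfectly scored.
    homogeneity = 1.0 if h_class == 0 else 1.0 - h_class_given_cluster / h_class
    completeness = 1.0 if h_cluster == 0 else 1.0 - h_cluster_given_class / h_cluster

    if homogeneity + completeness == 0:
        return 0.0
    return 2.0 * homogeneity * completeness / (homogeneity + completeness)
```

The score is invariant to how the cluster labels are named, so a clustering that matches the ground truth up to relabelling reaches the maximum of 1, while placing everything in a single cluster scores 0 when more than one true class is present.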
Tripartite Line Tracks -- Bipartite Line Tracks
Theories of shapes are important for object recognition and for reasoning about the behaviour of objects, both tasks strongly constrained by shape. Whereas the extraction of shape properties has been studied extensively in vision, there is still a lack of qualitative shape descriptions which allow reasoning about shapes with AI techniques in a flexible manner.
Categorization of Affordances and Prediction of Future Object Interactions using Qualitative Spatial Relations
The application of deep neural networks on robotic platforms has successfully advanced robot perception in tasks related to human-robot collaboration scenarios. Tasks such as scene understanding, object categorization, affordance detection, and interaction anticipation are facilitated by the acquisition of knowledge about the object interactions taking place in the scene.
The contributions of this thesis are two-fold:
1) it shows how representations of object interactions learned in an unsupervised way can be used to predict categories of objects depending on the affordances;
2) it shows how future frame-independent interaction can be learned in a self-supervised way by exploiting high-level graph representations of the object interactions.
The aim of this research is to create representations and perform predictions of interactions which abstract from the image space and attain generalization across various scenes and objects. Interactions can be static, e.g. holding a bottle, as well as dynamic, e.g. playing with a ball, where the temporal order of the sequence of several static interactions is what makes the dynamic interaction distinguishable. Moreover, occlusion of objects in the 2D domain should be handled to avoid false positive interaction detections. Thus, RGB-D video data is exploited for these tasks.
As humans tend to use objects in many different ways depending on the scene and the objects' availability, learning object affordances in everyday-life scenarios is a challenging task, particularly in the presence of an open set of interactions and class-agnostic objects. In order to abstract from the continuous representation of spatio-temporal interactions in video data, a novel set of high-level qualitative depth-informed spatial relations is presented. Learning similarities via an unsupervised method exploiting graph representations of object interactions induces a hierarchy of clusters of objects with similar affordances. The proposed method handles object occlusions by effectively capturing possible interactions, without imposing any object or scene constraints.
Moreover, interaction and action anticipation remains a challenging problem, especially considering the generalizability constraints of models trained on visual data or exploiting visual video embeddings. State-of-the-art methods allow predictions up to approximately three seconds into the future. Hence, most everyday-life activities, which consist of actions of more than five seconds in duration, are not predictable. This thesis presents a novel approach for solving the task of interaction anticipation between objects in a video scene by utilizing high-level qualitative frame-number-independent spatial graphs to represent object interactions. A deep recurrent neural network learns in a self-supervised way to predict graph structures of future object interactions, whilst being decoupled from the visual information, the underlying activity, and the duration of each interaction taking place.
Finally, the proposed methods are evaluated on RGB-D video datasets capturing everyday-life activities of human agents, and are compared against closely related and state-of-the-art methods.
Physical Reasoning for Intelligent Agent in Simulated Environments
Developing Artificial Intelligence (AI) that is capable of understanding and interacting with the real world in a sophisticated way has long been a grand vision of AI. There is an increasing number of AI agents coming into our daily lives and assisting us with various daily tasks ranging from house cleaning to serving food in restaurants. While different tasks have different goals, the domains of the tasks all obey the physical rules (classic Newtonian physics) of the real world. To successfully interact with the physical world, an agent needs to be able to understand its surrounding environment, to predict the consequences of its actions and to draw up plans that can achieve a goal without causing any unintended outcomes. Much of AI research over the past decades has been dedicated to specific sub-problems such as machine learning and computer vision. Simply plugging in techniques from these subfields is far from sufficient for creating a comprehensive AI agent that can work well in a physical environment. Instead, it requires an integration of methods from different AI areas that considers specific conditions and requirements of the physical environment.
In this thesis, we identified several capabilities that are essential for AI to interact with the physical world, namely, visual perception, object detection, object tracking, action selection, and structure planning. As the real world is a highly complex environment, we started with developing these capabilities in virtual environments with realistic physics simulations. The central part of our methods is the combination of qualitative reasoning and standard techniques from different AI areas. For the visual perception capability, we developed a method that can infer spatial properties of rectangular objects from their minimum bounding rectangles. For the object detection capability, we developed a method that can detect unknown objects in a structure by reasoning about the stability of the structure. For the object tracking capability, we developed a method that can match perceptually indistinguishable objects in visual observations made before and after a physical impact. This method can identify spatial changes of objects in the physical event, and the result of matching can be used for learning the consequence of the impact. For the action selection capability, we developed a method that solves a hole-in-one problem that requires selecting an action out of an infinite number of actions with unknown consequences. For the structure planning capability, we developed a method that can arrange objects to form a stable and robust structure by reasoning about structural stability and robustness.
Konzeptbasierte Argumentation in dynamischen Umgebungen
Argumentation systems play an important role when controversial points of view are to be considered in order to make decisions on inconsistent data. In this work a scalable framework for argumentation and decision support is outlined. This framework defines basic arguments and conflicts which refer to conceptual descriptions of the given state of affairs. Based on their meaning and on preference relations that adopt specific viewpoints, it is possible to efficiently determine successful explanations depending on these viewpoints. We investigate our approach by examining soccer games, since many observed spatiotemporal behaviours in soccer can be interpreted differently.