Tracking perceptually indistinguishable objects using spatial reasoning
Intelligent agents perceive the world mainly through images captured at different time points. Being able to track objects from one image to another is fundamental for understanding changes in the world. Tracking becomes challenging when there are multiple perceptually indistinguishable objects (PIOs), i.e., objects that have the same appearance and cannot be visually distinguished. In that case, all PIOs must be reidentified whenever a new observation is made. In this paper we consider the case where the changes in the world were caused by a single physical event and where matches between PIOs of subsequent observations must be consistent with the effects of that event. We present a solution to this problem based on qualitative spatial representation and reasoning. It can improve tracking accuracy significantly by qualitatively predicting possible motions of objects and discarding matches that violate spatial and physical constraints. We evaluate our solution in a real video gaming scenario.
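The idea of discarding matches that violate spatial and physical constraints can be illustrated with a minimal sketch. It is not the paper's qualitative spatial calculus: it assumes objects are reduced to 2D centre points, that the same number of objects appears before and after the event, and it uses two hypothetical numeric constraints (a maximum displacement `max_move` and an optional push `direction`) as stand-ins for the qualitative predictions. The function name `consistent_matchings` is likewise an assumption made for illustration.

```python
from itertools import permutations
from math import hypot


def consistent_matchings(before, after, max_move, direction=None):
    """Return every one-to-one matching of identical-looking objects
    that satisfies the (assumed) constraints.

    before, after: lists of (x, y) centre points, equal length.
    max_move: largest displacement any object may undergo (assumption).
    direction: optional (dx, dy) vector; no object may move against it.
    A matching is a tuple m where before[i] is matched to after[m[i]].
    """
    results = []
    for perm in permutations(range(len(before))):
        valid = True
        for i, j in enumerate(perm):
            dx = after[j][0] - before[i][0]
            dy = after[j][1] - before[i][1]
            # Constraint 1: an object cannot travel further than max_move.
            if hypot(dx, dy) > max_move:
                valid = False
                break
            # Constraint 2: an object cannot move against the push direction.
            if direction is not None and dx * direction[0] + dy * direction[1] < 0:
                valid = False
                break
        if valid:
            results.append(perm)
    return results


# Two identical objects pushed to the right by a single event.
before = [(0, 0), (3, 0)]
after = [(1, 0), (4, 0)]
unconstrained = consistent_matchings(before, after, max_move=5)
constrained = consistent_matchings(before, after, max_move=5, direction=(1, 0))
```

Without the direction constraint both matchings survive, so the objects remain ambiguous; with it, only the matching that keeps each object moving rightwards remains. Enumerating all permutations is exponential in the number of objects and is used here only to keep the sketch short.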
Physical Reasoning for Intelligent Agent in Simulated Environments
Developing Artificial Intelligence (AI) that is capable of
understanding and interacting with the real world in a
sophisticated way has long been a grand vision of AI. There is an
increasing number of AI agents coming into our daily lives and
assisting us with various daily tasks ranging from house cleaning
to serving food in restaurants. While different tasks have
different goals, the domains of the tasks all obey the physical
rules (classic Newtonian physics) of the real world. To
successfully interact with the physical world, an agent needs to
be able to understand its surrounding environment, to predict the
consequences of its actions and to devise plans that can achieve a
goal without causing any unintended outcomes. Much of AI
research over the past decades has been dedicated to specific
sub-problems such as machine learning and computer vision.
Simply plugging in techniques from these subfields falls far short of
creating a comprehensive AI agent that can work well in a
physical environment. Instead, it requires an integration of
methods from different AI areas that considers specific
conditions and requirements of the physical environment.
In this thesis, we identified several capabilities that are
essential for AI to interact with the physical world, namely,
visual perception, object detection, object tracking, action
selection, and structure planning. As the real world is a highly
complex environment, we started with developing these
capabilities in virtual environments with realistic physics
simulations. The central part of our methods is the combination
of qualitative reasoning and standard techniques from different
AI areas. For the visual perception capability, we developed a
method that can infer spatial properties of rectangular objects
from their minimum bounding rectangles. For the object detection
capability, we developed a method that can detect unknown objects
in a structure by reasoning about the stability of the structure.
For the object tracking capability, we developed a method that
can match perceptually indistinguishable objects in visual
observations made before and after a physical impact. This method
can identify spatial changes of objects in the physical event,
and the result of matching can be used for learning the
consequence of the impact. For the action selection capability,
we developed a method that solves a hole-in-one problem that
requires selecting an action out of an infinite number of actions
with unknown consequences. For the structure planning capability,
we developed a method that can arrange objects to form a stable
and robust structure by reasoning about structural stability and
robustness.
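To make the notion of reasoning about structural stability concrete, here is a minimal sketch of a stability test for a simple vertical stack. It is a quantitative centre-of-mass check swapped in for illustration, not the qualitative reasoning method developed in the thesis. It assumes axis-aligned rectangular blocks of uniform density in 2D, each resting only on the block directly below it; the function name `stack_is_stable` and the `(x_min, x_max, height)` representation are hypothetical.

```python
def stack_is_stable(blocks):
    """Check whether a vertical stack of rectangles can stand without toppling.

    blocks: list of (x_min, x_max, height) tuples ordered bottom to top,
            with positive width and height (assumed uniform density, so
            mass is proportional to area).
    For every block, the combined centre of mass of everything above it
    must lie over the contact interval with the block resting on it.
    """
    for i in range(len(blocks) - 1):
        support = blocks[i]
        above = blocks[i + 1:]
        resting = above[0]
        # Horizontal interval where the support touches the block on top of it.
        lo = max(support[0], resting[0])
        hi = min(support[1], resting[1])
        if lo >= hi:
            return False  # no contact at all
        total_mass = sum((x1 - x0) * h for x0, x1, h in above)
        com_x = sum((x1 - x0) * h * (x0 + x1) / 2 for x0, x1, h in above) / total_mass
        if not lo <= com_x <= hi:
            return False  # everything above this block would tip over its edge
    return True


centred = [(0, 4, 1), (1, 3, 1)]
overhanging = [(0, 4, 1), (3, 7, 1)]
# Each block is balanced on the one below, but the two upper blocks
# together are not balanced on the bottom block.
staircase = [(0, 4, 1), (1.5, 5.5, 1), (3, 7, 1)]
```

The staircase case shows why a check on each adjacent pair alone is not enough: the stack fails only when the blocks above a support are considered together.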