Discovering Predictive Relational Object Symbols with Symbolic Attentive Layers
In this paper, we propose and realize a new deep learning architecture for
discovering symbolic representations for objects and their relations based on
the self-supervised continuous interaction of a manipulator robot with multiple
objects in a tabletop environment. The key feature of the model is that it can
handle a changing number of objects naturally and map object-object
relations into the symbolic domain explicitly. In the model, we employ a
self-attention layer that computes discrete attention weights from object
features, which are treated as relational symbols between objects. These
relational symbols are then used to aggregate the learned object symbols and
predict the effects of executed actions on each object. The result is a
pipeline that allows the formation of object symbols and relational symbols
from a dataset of object features, actions, and effects in an end-to-end
manner. We compare the performance of our proposed architecture with
state-of-the-art symbol discovery methods in a simulated tabletop environment
where the robot needs to discover symbols related to the relative positions of
objects to predict the observed effect successfully. Our experiments show that
the proposed architecture performs better than other baselines in effect
prediction while forming not only object symbols but also relational symbols.
Furthermore, we analyze the learned symbols and relational patterns between
objects to understand how the model interprets the environment. Our analysis
shows that the learned symbols relate to the relative positions of objects,
object types, and their horizontal alignment on the table, which reflect the
regularities in the environment.
Comment: arXiv admin note: text overlap with arXiv:2208.0102
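The mechanism described in the abstract lends itself to a short sketch. The following is a minimal, illustrative PyTorch module, assuming a Gumbel-sigmoid style discretization with a straight-through gradient; the class, method, and parameter names are our own, not the authors' code:

    import torch
    import torch.nn as nn

    class SymbolicAttentiveLayer(nn.Module):
        """Compute binary attention weights between objects and use them as
        relational symbols to aggregate per-object symbol vectors.
        (Illustrative sketch; names do not come from the paper.)"""
        def __init__(self, feat_dim, sym_dim):
            super().__init__()
            self.query = nn.Linear(feat_dim, sym_dim)
            self.key = nn.Linear(feat_dim, sym_dim)

        def forward(self, obj_feats, obj_symbols, temperature=1.0):
            # obj_feats: (batch, n_objects, feat_dim); n_objects may vary per scene
            q, k = self.query(obj_feats), self.key(obj_feats)
            logits = torch.bmm(q, k.transpose(1, 2)) / (q.shape[-1] ** 0.5)
            # Logistic noise + sigmoid ("Gumbel-sigmoid"); an assumed choice
            u = torch.rand_like(logits).clamp(1e-6, 1 - 1e-6)
            soft = torch.sigmoid((logits + u.log() - (1 - u).log()) / temperature)
            hard = (soft > 0.5).float()
            rel = hard + soft - soft.detach()  # straight-through estimator
            # Aggregate object symbols through the discrete relational structure
            return rel, torch.bmm(rel, obj_symbols)

Because the binary weights are computed pairwise from object features, the same parameters handle any number of objects per scene, which is the property the abstract emphasizes.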
Symbolic Manipulation Planning with Discovered Object and Relational Predicates
Discovering the symbols and rules that can be used in long-horizon planning
from a robot's unsupervised exploration of its environment and continuous
sensorimotor experience is a challenging task. Previous studies proposed
learning symbols from single or paired object interactions and planning with
these symbols. In this work, we propose a system that learns rules with
discovered object and relational symbols that encode an arbitrary number of
objects and the relations between them, converts those rules to the Planning
Domain Definition Language (PDDL), and generates plans that involve affordances
of an arbitrary number of objects to achieve tasks. We validated our system
with box-shaped objects of different sizes and showed that the system can
develop symbolic knowledge of pick-up, carry, and place operations, taking into
account object compounds in different configurations, such as boxes being
carried together with the larger box on which they are placed. We also compared
our method with state-of-the-art methods and showed that planning with
operators defined over relational symbols gives better planning performance
than the baselines.
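As a rough illustration of the rule-to-PDDL step, the Python sketch below serializes one learned operator into a PDDL action schema. The predicate names and the rule format are entirely hypothetical; the abstract does not specify them:

    def rule_to_pddl(name, params, preconds, effects):
        """Render one learned rule as a PDDL action schema."""
        lit = lambda p: "(" + " ".join(p) + ")"
        return (f"(:action {name}\n"
                f"  :parameters ({' '.join(params)})\n"
                f"  :precondition (and {' '.join(map(lit, preconds))})\n"
                f"  :effect (and {' '.join(map(lit, effects))}))")

    # Hypothetical pick-up operator over discovered symbols:
    print(rule_to_pddl("pickup", ["?o"],
                       [("graspable", "?o"), ("clear", "?o")],
                       [("held", "?o"), ("not", "(clear ?o)")]))

Once rules are in this form, any standard PDDL planner can consume them, which is what enables the long-horizon planning the abstract describes.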
DeepSym: Deep Symbol Generation and Rule Learning from Unsupervised Continuous Robot Interaction for Planning
Autonomous discovery of discrete symbols and rules from continuous
interaction experience is a crucial building block of robot AI, but remains a
challenging problem. Solving it will overcome the limitations in scalability,
flexibility, and robustness of manually-designed symbols and rules, and will
constitute a substantial advance towards autonomous robots that can learn and
reason at abstract levels in open-ended environments. Towards this goal, we
propose a novel and general method that finds action-grounded, discrete object
and effect categories and builds probabilistic rules over them that can be used
in complex action planning. Our robot interacts with single and multiple
objects using a given action repertoire and observes the effects created in the
environment. In order to form action-grounded object, effect, and relational
categories, we employ a binarized bottleneck layer of a predictive, deep
encoder-decoder network that takes as input the image of the scene and the
action applied, and generates the resulting object displacements in the scene
(action effects) in pixel coordinates. The binary latent vector represents a
learned, action-driven categorization of objects. To distill the knowledge
represented by the neural network into rules useful for symbolic reasoning, we
train a decision tree to reproduce its decoder function. From its branches we
extract probabilistic rules and represent them in PPDDL, allowing off-the-shelf
planners to operate on the robot's sensorimotor experience. Our system is
verified in a physics-based 3D simulation environment where a robot arm-hand
system learned symbols that can be interpreted as 'rollable', 'insertable',
'larger-than' from its push and stack actions; and generated effective plans to
achieve goals such as building towers from given cubes, balls, and cups using
off-the-shelf probabilistic planners.
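The two core mechanisms, the binarized bottleneck and the tree-based rule distillation, can be summarized in a short sketch. This assumes PyTorch and scikit-learn; all shapes, depths, and variable names are illustrative rather than the paper's actual configuration:

    import torch
    import torch.nn as nn
    from sklearn.tree import DecisionTreeClassifier

    class BinaryBottleneck(nn.Module):
        """Binarize encoder activations while letting gradients pass
        through unchanged (straight-through estimator)."""
        def forward(self, logits):
            soft = torch.sigmoid(logits)
            hard = (soft > 0.5).float()
            return hard + soft - soft.detach()

    def distill_rules(codes, actions, effect_labels, max_depth=5):
        """Fit a decision tree mapping (binary object code, action id) to a
        discrete effect category; each branch then reads as a candidate
        probabilistic rule for PPDDL export. Inputs are hypothetical tensors
        gathered after training the encoder-decoder."""
        X = torch.cat([codes, actions.float().unsqueeze(1)], dim=1).numpy()
        return DecisionTreeClassifier(max_depth=max_depth).fit(X, effect_labels.numpy())

The straight-through trick keeps the discrete symbols trainable end-to-end, while the tree converts the opaque decoder into branch conditions that map directly onto symbolic preconditions and probabilistic effects.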