A Perspective on Objects and Systematic Generalization in Model-Based RL
In order to meet the diverse challenges in solving many real-world problems,
an intelligent agent has to be able to dynamically construct a model of its
environment. Objects facilitate the modular reuse of prior knowledge and the
combinatorial construction of such models. In this work, we argue that
dynamically bound features (objects) do not simply emerge in connectionist
models of the world. We identify several requirements that need to be fulfilled
in overcoming this limitation and highlight corresponding inductive biases.
Comment: Accepted to the ICML 2019 Workshop on Generative Modeling
and Model-Based Reasoning for Robotics and A
R-SQAIR: Relational Sequential Attend, Infer, Repeat
Traditional sequential multi-object attention models rely on a recurrent
mechanism to infer object relations. We propose a relational extension
(R-SQAIR) of one such attention model (SQAIR) by endowing it with a module with
strong relational inductive bias that computes in parallel pairwise
interactions between inferred objects. Two recently proposed relational modules
are studied on tasks of unsupervised learning from videos. We demonstrate gains
over sequential relational mechanisms, also in terms of combinatorial
generalization.
Comment: 4-page workshop paper accepted at the NeurIPS 2019 Workshop on
Perception as Generative Reasoning: Structure, Causality, Probabilit
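Illustrative sketch (not from the paper): the R-SQAIR abstract above describes a module that computes pairwise interactions between inferred objects in parallel. The toy Python below shows only the general idea of aggregating pairwise effects per object. The names `toy_relation` and `pairwise_effects`, the distance-decaying interaction, and the 2-D object vectors are all assumptions made for illustration; the paper's actual relational modules are learned networks and are not reproduced here.

```python
import math


def toy_relation(a, b):
    # Hypothetical pairwise function (an assumption, not the paper's module):
    # the "pull" of object b on object a, decaying with their distance.
    diff = [bj - aj for aj, bj in zip(a, b)]
    dist = math.sqrt(sum(d * d for d in diff))
    weight = math.exp(-dist)
    return [weight * d for d in diff]


def pairwise_effects(objects, relation=toy_relation):
    # For each object i, sum relation(o_i, o_j) over all other objects j.
    # Every (i, j) pair is independent of the others, so a batched
    # implementation could evaluate all pairs in a single parallel call.
    effects = []
    for i, a in enumerate(objects):
        total = [0.0] * len(a)
        for j, b in enumerate(objects):
            if i == j:
                continue
            total = [t + r for t, r in zip(total, relation(a, b))]
        effects.append(total)
    return effects


# Three hypothetical 2-D object states.
objects = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
effects = pairwise_effects(objects)
```

Because no pair depends on the result of another pair, this differs from a recurrent mechanism that would process objects one after another.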
Attention over learned object embeddings enables complex visual reasoning
Neural networks have achieved success in a wide array of perceptual tasks but
often fail at tasks involving both perception and higher-level reasoning. On
these more challenging tasks, bespoke approaches (such as modular symbolic
components, independent dynamics models or semantic parsers) targeted towards
that specific type of task have typically performed better. The downside to
these targeted approaches, however, is that they can be more brittle than
general-purpose neural networks, requiring significant modification or even
redesign according to the particular task at hand. Here, we propose a more
general neural-network-based approach to dynamic visual reasoning problems that
obtains state-of-the-art performance on three different domains, in each case
outperforming bespoke modular approaches tailored specifically to the task. Our
method relies on learned object-centric representations, self-attention and
self-supervised dynamics learning, and all three elements together are required
for strong performance to emerge. The success of this combination suggests that
there may be no need to trade off flexibility for performance on problems
involving spatio-temporal or causal-style reasoning. With the right soft biases
and learning objectives in a neural network we may be able to attain the best
of both worlds.
Comment: 22 pages, 5 figure
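Illustrative sketch (not from the paper): the abstract above names self-attention over learned object-centric representations as one ingredient of the method. The Python below is a minimal scaled dot-product self-attention over a list of object embeddings. It assumes identity query, key and value projections and hand-written embeddings, whereas a real model would learn both; the function names `softmax` and `self_attention` are chosen here for illustration only.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def self_attention(embeddings):
    # Scaled dot-product self-attention with identity projections
    # (an assumption for brevity; learned projections are omitted).
    d = len(embeddings[0])
    outputs, weights = [], []
    for q in embeddings:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in embeddings
        ]
        w = softmax(scores)
        weights.append(w)
        # Each output is a weighted average of all object embeddings.
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, embeddings)) for c in range(d)]
        )
    return outputs, weights


# Three hypothetical 2-D object embeddings.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(embeddings)
```

Each row of `weights` sums to one, so every output vector is a convex combination of the object embeddings.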