ShapeStacks: Learning Vision-Based Physical Intuition for Generalised Object Stacking
Physical intuition is pivotal for intelligent agents to perform complex
tasks. In this paper we investigate the passive acquisition of an intuitive
understanding of physical principles as well as the active utilisation of this
intuition in the context of generalised object stacking. To this end, we
provide ShapeStacks, a simulation-based dataset featuring 20,000 stack
configurations composed of a variety of elementary geometric primitives and
richly annotated with semantics and structural stability. We train visual classifiers for
binary stability prediction on the ShapeStacks data and scrutinise their
learned physical intuition. Due to the richness of the training data, our
approach also generalises favourably to real-world scenarios, achieving
state-of-the-art stability prediction on a publicly available benchmark of
block towers. We then leverage the physical intuition learned by our model to
actively construct stable stacks and observe the emergence of an intuitive
notion of stackability - an inherent object affordance - induced by the active
stacking task. Our approach performs well even in challenging conditions where
it considerably exceeds the stack height observed during training or in cases
where initially unstable structures must be stabilised via counterbalancing.

Comment: revised version to appear at ECCV 2018
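The notion of structural stability that the classifiers must capture can be made concrete with a toy analytic check. This is a hypothetical 2-D sketch, not the paper's learned model: a stack of equal-mass blocks is stable when, for every block, the combined centre of mass of all blocks above it lies over the supporting block's top face.

```python
# Hypothetical sketch: analytic stability check for a 2-D block stack.
# Each block is (x_center, width); block i+1 rests on block i.
# Equal masses are assumed for the centre-of-mass computation.

def is_stable(blocks):
    """blocks: list of (x_center, width), bottom block first."""
    for i in range(len(blocks) - 1):
        above = blocks[i + 1:]
        com = sum(x for x, _ in above) / len(above)  # centre of mass above
        support_x, support_w = blocks[i]
        if abs(com - support_x) > support_w / 2:     # overhangs the support
            return False
    return True

print(is_stable([(0.0, 1.0), (0.2, 1.0), (0.4, 1.0)]))  # modest offsets
print(is_stable([(0.0, 1.0), (0.6, 1.0)]))              # overhang too large
```

A learned classifier must recover this kind of geometric reasoning from pixels alone, without access to block positions.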
Stochastic Prediction of Multi-Agent Interactions from Partial Observations
We present a method that learns to integrate temporal information, from a
learned dynamics model, with ambiguous visual information, from a learned
vision model, in the context of interacting agents. Our method is based on a
graph-structured variational recurrent neural network (Graph-VRNN), which is
trained end-to-end to infer the current state of the (partially observed)
world, as well as to forecast future states. We show that our method
outperforms various baselines on two sports datasets, one based on real
basketball trajectories, and one generated by a soccer game engine.

Comment: ICLR 2019 camera ready
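The structural idea of a graph-structured recurrent update, in which each agent's state is refined by aggregated messages from the other agents, can be sketched as follows. This is a toy NumPy version with placeholder random weights, not the paper's Graph-VRNN:

```python
import numpy as np

# Hypothetical sketch of one graph message-passing step over interacting
# agents: each agent's hidden state is updated from its own state plus the
# mean of messages from every other agent. Weights are random placeholders.

rng = np.random.default_rng(0)
n_agents, d = 5, 8
h = rng.normal(size=(n_agents, d))          # per-agent hidden states
W_msg = rng.normal(size=(2 * d, d)) * 0.1   # message net (placeholder)
W_upd = rng.normal(size=(2 * d, d)) * 0.1   # update net (placeholder)

def step(h):
    msgs = np.zeros_like(h)
    for i in range(n_agents):
        for j in range(n_agents):
            if i == j:
                continue
            pair = np.concatenate([h[i], h[j]])
            msgs[i] += np.tanh(pair @ W_msg)     # message from j to i
    msgs /= n_agents - 1                          # mean aggregation
    return np.tanh(np.concatenate([h, msgs], axis=1) @ W_upd)

h_next = step(h)
print(h_next.shape)  # (5, 8)
```

In the actual model, the message and update functions are learned end-to-end and combined with a variational recurrent state, so the same graph structure supports both state inference and forecasting.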
Unsupervised Intuitive Physics from Visual Observations
While learning models of intuitive physics is an increasingly active area of
research, current approaches still fall short of natural intelligences in one
important regard: they require external supervision, such as explicit access to
physical states, at training and sometimes even at test time. Some authors
have relaxed such requirements by supplementing the model with a handcrafted
physical simulator. Still, the resulting methods cannot automatically
learn new complex environments or understand the physical interactions within
them. In this work, we demonstrate, for the first time, learning such predictors
directly from raw visual observations and without relying on simulators. We do
so in two steps: first, we learn to track mechanically-salient objects in
videos using causality and equivariance, two unsupervised learning principles
that do not require auto-encoding. Second, we demonstrate that the extracted
positions are sufficient to successfully train visual motion predictors that
can take the underlying environment into account. We validate our predictors on
synthetic datasets; then, we introduce a new dataset, ROLL4REAL, consisting of
real objects rolling on complex terrains (pool table, elliptical bowl, and
random height-field). We show that in all such cases it is possible to learn
reliable extrapolators of the object trajectories from raw videos alone,
without any form of external supervision and with no more prior knowledge than
the choice of a convolutional neural network architecture.
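The claim that extracted positions suffice to train motion predictors can be illustrated with a deliberately simple stand-in. This is a hypothetical finite-difference extrapolator on a synthetic trajectory, not the paper's learned predictor:

```python
# Hypothetical sketch: given object positions extracted from video (here a
# synthetic trajectory under constant "gravity"), extrapolate future
# positions with a constant-acceleration model estimated by finite
# differences over the last three observations.

def extrapolate(ys, n_steps):
    v = ys[-1] - ys[-2]                          # latest velocity
    a = (ys[-1] - ys[-2]) - (ys[-2] - ys[-3])    # latest acceleration
    out = []
    y = ys[-1]
    for _ in range(n_steps):
        v += a
        y += v
        out.append(y)
    return out

# synthetic positions y_t = 100 - t**2, t = 0..4
observed = [100 - t**2 for t in range(5)]   # [100, 99, 96, 91, 84]
print(extrapolate(observed, 3))             # [75, 64, 51], i.e. t = 5, 6, 7
```

The learned predictors go further than this sketch by conditioning on the visual appearance of the environment, so the same model can extrapolate across different terrains.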
Learning Particle Dynamics for Manipulating Rigid Bodies, Deformable Objects, and Fluids
Real-life control tasks involve matter in many forms (rigid or soft
bodies, liquids, gases), each with distinct physical behaviors. This poses
challenges to traditional rigid-body physics engines. Particle-based simulators
have been developed to model the dynamics of these complex scenes; however,
relying on approximation techniques, their simulation often deviates from
real-world physics, especially in the long term. In this paper, we propose to
learn a particle-based simulator for complex control tasks. Combining learning
with particle-based systems brings two major benefits: first, the learned
simulator, like other particle-based systems, applies broadly to objects of
different materials; second, the particle-based representation imposes a strong
inductive bias for learning: particles of the same type share the same
dynamics. This enables the model to quickly adapt to new environments of unknown
dynamics within a few observations. We demonstrate robots achieving complex
manipulation tasks using the learned simulator, such as manipulating fluids and
deformable foam, with experiments both in simulation and in the real world. Our
study helps lay the foundation for robot learning of dynamic scenes with
particle-based representations.

Comment: Accepted to ICLR 2019. Project Page: http://dpi.csail.mit.edu Video:
https://www.youtube.com/watch?v=FrPpP7aW3L
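The inductive bias that particles of the same type share the same dynamics can be sketched in a toy form: one update rule per particle type is applied to every particle of that type, aggregating pairwise interactions with nearby particles. The rules below are hand-written stand-ins for the learned networks, not the paper's model:

```python
import numpy as np

# Hypothetical sketch: per-TYPE dynamics shared by all particles of that
# type. Each particle is pulled toward neighbours within an interaction
# radius, scaled by a stiffness that depends only on its type.

pos = np.array([[0.0, 0.0], [0.1, 0.0], [0.2, 0.0],   # type-0 cluster
                [0.8, 0.8], [0.9, 0.8], [1.0, 0.8]])  # type-1 cluster
types = np.array([0, 0, 0, 1, 1, 1])
stiffness = {0: 0.5, 1: 0.05}   # one shared parameter per type (toy)

def step(pos, types, radius=0.5):
    new = pos.copy()
    for i in range(len(pos)):
        delta = np.zeros(2)
        for j in range(len(pos)):
            d = pos[j] - pos[i]
            if i != j and np.linalg.norm(d) < radius:
                delta += d                          # pull toward neighbours
        new[i] = pos[i] + stiffness[types[i]] * delta
    return new

pos1 = step(pos, types)
print(pos1[0], pos1[3])  # the type-0 particle moves further than the type-1
```

Because the parameters are tied per type rather than per particle, observing a few particles of a new material is enough to constrain the dynamics of all particles of that material, which is what enables fast adaptation.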