Search CORE

6,511 research outputs found

Newtonian Image Understanding: Unfolding the Dynamics of Objects in Static Images

Author: Bagherinezhad Hessam
Farhadi Ali
Mottaghi Roozbeh
Rastegari Mohammad
Publication venue
Publication date: 12/11/2015
Field of study

In this paper, we study the challenging problem of predicting the dynamics of objects in static images. Given a query object in an image, our goal is to provide a physical understanding of the object in terms of the forces acting upon it and its long term motion as response to those forces. Direct and explicit estimation of the forces and the motion of objects from a single image is extremely challenging. We define intermediate physical abstractions called Newtonian scenarios and introduce Newtonian Neural Network (

N^3

) that learns to map a single image to a state in a Newtonian scenario. Our experimental evaluations show that our method can reliably predict dynamics of a query object from a single image. In addition, our approach can provide physical reasoning that supports the predicted dynamics in terms of velocity and force vectors. To spur research in this direction we compiled Visual Newtonian Dynamics (VIND) dataset that includes 6806 videos aligned with Newtonian scenarios represented using game engines, and 4516 still images with their ground truth dynamics

arXiv.org e-Print Archive

Crossref

3D-PhysNet: Learning the Intuitive Physics of Non-Rigid Object Deformations

Author: Markham Andrew
Rosa Stefano
Trigoni Niki
Wang Sen
Wang Zhihua
Yang Bo
Publication venue
Publication date: 01/01/2018
Field of study

The ability to interact and understand the environment is a fundamental prerequisite for a wide range of applications from robotics to augmented reality. In particular, predicting how deformable objects will react to applied forces in real time is a significant challenge. This is further confounded by the fact that shape information about encountered objects in the real world is often impaired by occlusions, noise and missing regions e.g. a robot manipulating an object will only be able to observe a partial view of the entire solid. In this work we present a framework, 3D-PhysNet, which is able to predict how a three-dimensional solid will deform under an applied force using intuitive physics modelling. In particular, we propose a new method to encode the physical properties of the material and the applied force, enabling generalisation over materials. The key is to combine deep variational autoencoders with adversarial training, conditioned on the applied force and the material properties. We further propose a cascaded architecture that takes a single 2.5D depth view of the object and predicts its deformation. Training data is provided by a physics simulator. The network is fast enough to be used in real-time applications from partial views. Experimental results show the viability and the generalisation properties of the proposed architecture.Comment: in IJCAI 201

arXiv.org e-Print Archive

Heriot Watt Pure

Crossref

Oxford University Research Archive

Object-Oriented Dynamics Learning through Multi-Level Abstraction

Author: Lin Zichuan
Ren Zhizhou
Wang Jianhao
Zhang Chongjie
Zhu Guangxiang
Publication venue
Publication date: 05/12/2019
Field of study

Object-based approaches for learning action-conditioned dynamics has demonstrated promise for generalization and interpretability. However, existing approaches suffer from structural limitations and optimization difficulties for common environments with multiple dynamic objects. In this paper, we present a novel self-supervised learning framework, called Multi-level Abstraction Object-oriented Predictor (MAOP), which employs a three-level learning architecture that enables efficient object-based dynamics learning from raw visual observations. We also design a spatial-temporal relational reasoning mechanism for MAOP to support instance-level dynamics learning and handle partial observability. Our results show that MAOP significantly outperforms previous methods in terms of sample efficiency and generalization over novel environments for learning environment models. We also demonstrate that learned dynamics models enable efficient planning in unseen environments, comparable to true environment models. In addition, MAOP learns semantically and visually interpretable disentangled representations.Comment: Accepted to the Thirthy-Fourth AAAI Conference On Artificial Intelligence (AAAI), 202

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Unsupervised Discovery of Parts, Structure, and Dynamics

Author: Freeman William T.
Liu Zhijian
Murphy Kevin
Sun Chen
Tenenbaum Joshua B.
Wu Jiajun
Xu Zhenjia
Publication venue
Publication date: 12/03/2019
Field of study

Humans easily recognize object parts and their hierarchical structure by watching how they move; they can then predict how each part moves in the future. In this paper, we propose a novel formulation that simultaneously learns a hierarchical, disentangled object representation and a dynamics model for object parts from unlabeled videos. Our Parts, Structure, and Dynamics (PSD) model learns to, first, recognize the object parts via a layered image representation; second, predict hierarchy via a structural descriptor that composes low-level concepts into a hierarchical structure; and third, model the system dynamics by predicting the future. Experiments on multiple real and synthetic datasets demonstrate that our PSD model works well on all three tasks: segmenting object parts, building their hierarchical structure, and capturing their motion distributions.Comment: ICLR 2019. The first two authors contributed equally to this wor

arXiv.org e-Print Archive

DSpace@MIT