1,013 research outputs found
Unsupervised Discovery of Parts, Structure, and Dynamics
Humans easily recognize object parts and their hierarchical structure by
watching how they move; they can then predict how each part moves in the
future. In this paper, we propose a novel formulation that simultaneously
learns a hierarchical, disentangled object representation and a dynamics model
for object parts from unlabeled videos. Our Parts, Structure, and Dynamics
(PSD) model learns to, first, recognize the object parts via a layered image
representation; second, predict hierarchy via a structural descriptor that
composes low-level concepts into a hierarchical structure; and third, model the
system dynamics by predicting the future. Experiments on multiple real and
synthetic datasets demonstrate that our PSD model works well on all three
tasks: segmenting object parts, building their hierarchical structure, and
capturing their motion distributions. Comment: ICLR 2019. The first two authors contributed equally to this work.
Motion Textures: Modeling, Classification, and Segmentation Using Mixed-State Markov Random Fields
A survey on 2d object tracking in digital video
This paper presents object tracking methods in video. Different algorithms based on rigid, non-rigid, and articulated object tracking are studied. The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends. Tracking objects across consecutive frames is often supported by a prediction scheme: the state (location) of the object is predicted from information extracted from previous frames and any high-level information that can be obtained. The Kalman filter is a well-suited framework for prediction because it additionally estimates the prediction error. In complex scenes, multiple hypotheses maintained by a particle filter can be used instead of a single hypothesis. Different techniques are given for different types of constraints in video.
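The prediction scheme described in this abstract can be illustrated with a minimal sketch of a Kalman filter tracking a single coordinate under a constant-velocity assumption. This is not taken from the survey: the function names `kalman_step` and `track`, the motion model, and the noise variances `q` and `r` are all hypothetical choices made for illustration.

```python
def kalman_step(x, P, z, dt=1.0, q=1e-3, r=0.25):
    """One predict/update cycle of a constant-velocity Kalman filter.

    x: [position, velocity]; P: 2x2 covariance as nested lists;
    z: measured position in the current frame.
    q, r: assumed process and measurement noise variances (illustrative).
    Returns the new state, new covariance, and (predicted position,
    innovation variance), i.e. the prediction and its estimated error.
    """
    # Predict the state: x' = F x with F = [[1, dt], [0, 1]].
    xp = [x[0] + dt * x[1], x[1]]
    # Predict the covariance: P' = F P F^T + Q, with Q = q * I (assumption).
    p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q
    p01 = P[0][1] + dt * P[1][1]
    p10 = P[1][0] + dt * P[1][1]
    p11 = P[1][1] + q
    # Update with a position-only measurement, H = [1, 0].
    s = p00 + r                  # innovation variance
    k0, k1 = p00 / s, p10 / s    # Kalman gain
    y = z - xp[0]                # innovation (measurement residual)
    xn = [xp[0] + k0 * y, xp[1] + k1 * y]
    # P = (I - K H) P'
    Pn = [[(1 - k0) * p00, (1 - k0) * p01],
          [p10 - k1 * p00, p11 - k1 * p01]]
    return xn, Pn, (xp[0], s)


def track(measurements):
    """Run the filter over a sequence of position measurements."""
    x, P = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
    for z in measurements:
        x, P, _ = kalman_step(x, P, z)
    return x, P
```

For example, `track([float(t) for t in range(1, 51)])` feeds the filter an object moving one unit per frame; the returned state should approach the true position and velocity. A real tracker would typically use a two-dimensional state and tuned noise parameters.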
Unsupervised Learning of Lagrangian Dynamics from Images for Prediction and Control
Recent approaches for modelling dynamics of physical systems with neural
networks enforce Lagrangian or Hamiltonian structure to improve prediction and
generalization. However, these approaches fail to handle the case when
coordinates are embedded in high-dimensional data such as images. We introduce
a new unsupervised neural network model that learns Lagrangian dynamics from
images, with interpretability that benefits prediction and control. The model
infers Lagrangian dynamics on generalized coordinates that are simultaneously
learned with a coordinate-aware variational autoencoder (VAE). The VAE is
designed to account for the geometry of physical systems composed of multiple
rigid bodies in the plane. By inferring interpretable Lagrangian dynamics, the
model learns physical system properties, such as kinetic and potential energy,
which enables long-term prediction of dynamics in the image space and synthesis
of energy-based controllers.
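As a rough illustration of what Lagrangian structure provides, the sketch below simulates a single pendulum whose equation of motion follows from the Lagrangian L = T - V, and tracks its kinetic and potential energy. It is not the model from the paper, which learns the coordinates and dynamics from images with a neural network. The pendulum system, the function names, the parameter values, and the semi-implicit Euler integrator are all assumptions chosen for illustration.

```python
import math


def pendulum_energy(theta, omega, m=1.0, l=1.0, g=9.81):
    """Kinetic and potential energy of a simple pendulum (assumed system)."""
    kinetic = 0.5 * m * (l * omega) ** 2
    potential = -m * g * l * math.cos(theta)
    return kinetic, potential


def simulate(theta, omega, steps, dt=1e-3, l=1.0, g=9.81):
    """Integrate theta'' = -(g / l) * sin(theta), the Euler-Lagrange
    equation for L = T - V, using semi-implicit (symplectic) Euler."""
    for _ in range(steps):
        omega += -(g / l) * math.sin(theta) * dt  # update velocity first
        theta += omega * dt                       # then position
    return theta, omega
```

Calling `simulate(1.0, 0.0, 5000)` advances the pendulum by five simulated seconds; summing the two values from `pendulum_energy` before and after should give nearly the same total, which is the kind of physical consistency that makes long-horizon prediction more stable.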
- …