DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding
While deep neural networks have led to human-level performance on computer
vision tasks, they have yet to demonstrate similar gains for holistic scene
understanding. In particular, 3D context has been shown to be an extremely
important cue for scene understanding - yet very little research has been done
on integrating context information with deep models. This paper presents an
approach to embed 3D context into the topology of a neural network trained to
perform holistic scene understanding. Given a depth image depicting a 3D scene,
our network aligns the observed scene with a predefined 3D scene template, and
then reasons about the existence and location of each object within the scene
template. In doing so, our model recognizes multiple objects in a single
forward pass of a 3D convolutional neural network, capturing both global scene
and local object information simultaneously. To create training data for this
3D network, we generate partly hallucinated depth images which are rendered by
replacing real objects with a repository of CAD models of the same object
category. Extensive experiments demonstrate the effectiveness of our algorithm
compared to state-of-the-art methods. Source code and data are available at
http://deepcontext.cs.princeton.edu.
Comment: Accepted by ICCV201
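The abstract's core mechanism is a single forward pass of a 3D convolutional network over a voxelized depth scene. The paper's architecture is not reproduced here; the snippet below is only a minimal sketch of the underlying operation, a 3D convolution over an occupancy grid, with all sizes, names, and data purely illustrative:

```python
import numpy as np

def conv3d(volume, kernel):
    """Naive valid-mode 3D convolution over a voxel grid."""
    kd, kh, kw = kernel.shape
    out_shape = tuple(v - k + 1 for v, k in zip(volume.shape, kernel.shape))
    out = np.zeros(out_shape)
    for z in range(out_shape[0]):
        for y in range(out_shape[1]):
            for x in range(out_shape[2]):
                out[z, y, x] = np.sum(volume[z:z+kd, y:y+kh, x:x+kw] * kernel)
    return out

# Toy stand-in for a voxelized depth scene: a sparse 32^3 occupancy grid.
# One convolution pass yields a feature volume from which both global scene
# structure and local per-object responses could in principle be read off.
rng = np.random.default_rng(0)
voxels = (rng.random((32, 32, 32)) > 0.9).astype(float)  # ~10% occupied
kernel = np.ones((3, 3, 3)) / 27.0                       # local occupancy average
features = conv3d(voxels, kernel)
print(features.shape)  # (30, 30, 30)
```

In the actual paper, learned kernels and further layers replace this hand-set averaging filter, and the network heads predict object existence and location per template slot.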
CGIntrinsics: Better Intrinsic Image Decomposition through Physically-Based Rendering
Intrinsic image decomposition is a challenging, long-standing computer vision
problem for which ground truth data is very difficult to acquire. We explore
the use of synthetic data for training CNN-based intrinsic image decomposition
models, which we then apply to real-world images. To that end,
we present CGIntrinsics, a new, large-scale dataset of physically-based rendered images
of scenes with full ground truth decompositions. The rendering process we use
is carefully designed to yield high-quality, realistic images, which we find to
be crucial for this problem domain. We also propose a new end-to-end training
method that learns better decompositions by leveraging CGIntrinsics, and optionally IIW
and SAW, two recent datasets of sparse annotations on real-world images.
Surprisingly, we find that a decomposition network trained solely on our
synthetic data outperforms the state-of-the-art on both IIW and SAW, and
performance improves even further when IIW and SAW data is added during
training. Our work demonstrates the surprising effectiveness of
carefully-rendered synthetic data for the intrinsic images task.
Comment: Paper for 'CGIntrinsics: Better Intrinsic Image Decomposition through
Physically-Based Rendering' published in ECCV, 201
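The standard model behind intrinsic image decomposition factors each pixel into reflectance (albedo) times shading; taking logarithms turns the product into a sum, which is how such decompositions are commonly posed. A minimal numeric illustration (the arrays are toy stand-ins, not CGIntrinsics data):

```python
import numpy as np

# Intrinsic image model: image = reflectance * shading, per pixel.
reflectance = np.array([[0.8, 0.2], [0.5, 0.5]])  # illustrative albedo
shading     = np.array([[1.0, 1.0], [0.3, 0.9]])  # illustrative shading
image       = reflectance * shading

# Log-domain identity: log I = log R + log S, so the pixelwise product
# becomes an additive constraint suitable for regression losses.
log_err = np.abs(np.log(image) - (np.log(reflectance) + np.log(shading)))
print(log_err.max() < 1e-12)  # True
```

Ground-truth decompositions from rendered scenes supply exactly these R and S factors per pixel, which is what makes synthetic data attractive for this task.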
Physical Primitive Decomposition
Objects are made of parts, each with distinct geometry, physics,
functionality, and affordances. Developing such a distributed, physical,
interpretable representation of objects will facilitate intelligent agents to
better explore and interact with the world. In this paper, we study physical
primitive decomposition---understanding an object through its components, each
with physical and geometric attributes. As annotated data for object parts and
physics are rare, we propose a novel formulation that learns physical
primitives by explaining both an object's appearance and its behaviors in
physical events. Our model performs well on block towers and tools in both
synthetic and real scenarios; we also demonstrate that visual and physical
observations often provide complementary signals. We further present ablation
and behavioral studies to better understand our model and contrast it with
human performance.
Comment: ECCV 2018. Project page: http://ppd.csail.mit.edu
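The abstract's central idea is representing an object as parts that each carry both geometric and physical attributes, so that physical behavior follows from the parts. A toy sketch of such a representation (this is an illustrative data structure, not the paper's learned model; all sizes and densities are made up):

```python
from dataclasses import dataclass

@dataclass
class Primitive:
    """One physical primitive: geometric extent plus a material property."""
    size: tuple      # (x, y, z) box extent in meters
    density: float   # material density in kg / m^3

    @property
    def mass(self):
        # Physical attribute derived from geometry + material.
        x, y, z = self.size
        return x * y * z * self.density

# A toy tool decomposed into two primitives: wooden handle + steel head.
handle = Primitive(size=(0.02, 0.02, 0.30), density=700.0)   # wood
head   = Primitive(size=(0.03, 0.03, 0.10), density=7850.0)  # steel
total_mass = handle.mass + head.mass
print(total_mass)
```

The point of the decomposition is exactly this compositionality: appearance constrains the geometry of each part, while observed behavior in physical events (falls, collisions) constrains properties like density that appearance alone cannot determine.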