NASA: Neural Articulated Shape Approximation
Efficient representation of articulated objects such as human bodies is an
important problem in computer vision and graphics. To efficiently simulate
deformation, existing approaches represent 3D objects using polygonal meshes
and deform them using skinning techniques. This paper introduces neural
articulated shape approximation (NASA), an alternative framework that enables
efficient representation of articulated deformable objects using neural
indicator functions that are conditioned on pose. Occupancy testing using NASA
is straightforward, circumventing the complexity of meshes and the issue of
water-tightness. We demonstrate the effectiveness of NASA for 3D tracking
applications, and discuss other potential extensions.
Comment: ECCV 2020
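To make the idea of a pose-conditioned neural indicator function concrete, here is a minimal PyTorch sketch. The MLP layer sizes, the flat pose vector, and the 0.5 decision threshold are illustrative assumptions; NASA's actual model is structured around body parts, which this toy version omits.

```python
import torch
import torch.nn as nn

class PoseConditionedOccupancy(nn.Module):
    """Toy pose-conditioned indicator function in the spirit of NASA.

    Maps a 3D query point and a pose vector to an occupancy probability.
    The layer sizes and flat pose encoding are illustrative assumptions,
    not the architecture from the paper.
    """

    def __init__(self, pose_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + pose_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, points: torch.Tensor, pose: torch.Tensor) -> torch.Tensor:
        # points: (N, 3) query locations; pose: (pose_dim,) articulation parameters.
        pose = pose.expand(points.shape[0], -1)
        logits = self.net(torch.cat([points, pose], dim=-1))
        return torch.sigmoid(logits).squeeze(-1)  # occupancy in [0, 1]

# Occupancy testing is a single forward pass: no mesh, no watertightness check.
model = PoseConditionedOccupancy(pose_dim=72)
inside = model(torch.rand(1024, 3), torch.zeros(72)) > 0.5
```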
FML: Face Model Learning from Videos
Monocular image-based 3D reconstruction of faces is a long-standing problem
in computer vision. Since image data is a 2D projection of a 3D face, the
resulting depth ambiguity makes the problem ill-posed. Most existing methods
rely on data-driven priors that are built from limited 3D face scans. In
contrast, we propose multi-frame video-based self-supervised training of a deep
network that (i) learns a face identity model both in shape and appearance
while (ii) jointly learning to reconstruct 3D faces. Our face model is learned
using only corpora of in-the-wild video clips collected from the Internet. This
virtually endless source of training data enables learning of a highly general
3D face model. In order to achieve this, we propose a novel multi-frame
consistency loss that ensures consistent shape and appearance across multiple
frames of a subject's face, thus minimizing depth ambiguity. At test time we
can use an arbitrary number of frames, so that we can perform both monocular as
well as multi-frame reconstruction.
Comment: CVPR 2019 (Oral). Video: https://www.youtube.com/watch?v=SG2BwxCw0lQ, Project Page: https://gvv.mpi-inf.mpg.de/projects/FML19
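The central ingredient is the multi-frame consistency loss. As a loose illustration of the principle (frames of the same subject should agree on identity), here is a toy penalty on per-frame identity codes; the variable names and the simple variance penalty are assumptions, and the paper's actual loss enforces consistency on rendered shape and appearance rather than directly on codes.

```python
import torch

def multi_frame_consistency(identity_codes: torch.Tensor) -> torch.Tensor:
    """Penalize disagreement between per-frame identity estimates.

    identity_codes: (F, D), one predicted identity vector per frame of the
    same subject. Frames of one person should map to one identity, so we
    penalize the variance around the per-subject mean. This scalar penalty
    is an illustrative stand-in for the paper's full multi-frame loss.
    """
    mean = identity_codes.mean(dim=0, keepdim=True)
    return ((identity_codes - mean) ** 2).mean()

codes = torch.randn(4, 128)  # e.g. 4 frames, 128-D identity code
loss = multi_frame_consistency(codes)
```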
Non-rigid Reconstruction with a Single Moving RGB-D Camera
We present a novel non-rigid reconstruction method using a moving RGB-D
camera. Current approaches use only the non-rigid part of the scene and completely
ignore the rigid background. Non-rigid parts often lack sufficient geometric
and photometric information for tracking large frame-to-frame motion. Our
approach uses the camera pose estimated from the rigid background for foreground
tracking, which enables robust foreground tracking in situations where large
frame-to-frame motion occurs. Moreover, we propose a multi-scale deformation
graph that improves non-rigid tracking without compromising reconstruction
quality. We also contribute a publicly available synthetic dataset for
evaluating non-rigid reconstruction methods; it provides frame-by-frame
ground-truth geometry of the scene, the camera trajectory, and
background/foreground masks. Experimental results show that our approach is
more robust in handling large frame-to-frame motions and produces better
reconstructions than state-of-the-art
approaches.
Comment: Accepted at the International Conference on Pattern Recognition (ICPR 2018)
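The core idea, stabilizing foreground tracking with the camera pose estimated from the rigid background, can be sketched in a few lines. The function below is hypothetical: it shows only the coordinate change, and assumes the pose comes from a rigid registration of the background (e.g. ICP), which is not shown.

```python
import numpy as np

def stabilize_foreground(fg_points: np.ndarray,
                         cam_R: np.ndarray,
                         cam_t: np.ndarray) -> np.ndarray:
    """Map foreground points into the world frame using the camera pose
    estimated from the rigid background (cam_R, cam_t are world-from-camera).

    Removing camera motion first means the non-rigid tracker only has to
    explain the residual deformation, which is far smaller than the raw
    frame-to-frame motion. cam_R (3, 3) and cam_t (3,) are assumed to come
    from rigid registration of the background; that step is not shown here.
    """
    return fg_points @ cam_R.T + cam_t

# Per frame: background registration gives (R, t); non-rigid tracking then
# runs on stabilized points instead of raw camera-space points.
pts = np.random.rand(500, 3)
R, t = np.eye(3), np.zeros(3)
stabilized = stabilize_foreground(pts, R, t)
```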
Common Pets in 3D: Dynamic New-View Synthesis of Real-Life Deformable Categories
Obtaining photorealistic reconstructions of objects from sparse views is
inherently ambiguous and can only be achieved by learning suitable
reconstruction priors. Earlier works on sparse rigid object reconstruction
successfully learned such priors from large datasets such as CO3D. In this
paper, we extend this approach to dynamic objects. We use cats and dogs as a
representative example and introduce Common Pets in 3D (CoP3D), a collection of
crowd-sourced videos showing around 4,200 distinct pets. CoP3D is one of the
first large-scale datasets for benchmarking non-rigid 3D reconstruction "in the
wild". We also propose Tracker-NeRF, a method for learning 4D reconstruction
from our dataset. At test time, given a small number of video frames of an
unseen object, Tracker-NeRF predicts the trajectories of its 3D points and
generates new views, interpolating viewpoint and time. Results on CoP3D reveal
significantly better non-rigid new-view synthesis performance than existing
baselines.
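At query time such a model behaves like a 4D field: points are evaluated at a continuous time and view direction, so interpolating viewpoint and time amounts to evaluating at intermediate values. The placeholder module below shows only this query interface; it is not the Tracker-NeRF architecture, which additionally predicts the 3D point trajectories mentioned above.

```python
import torch
import torch.nn as nn

class TimeConditionedField(nn.Module):
    """Generic 4D radiance field: (point, time, view dir) -> (density, rgb).

    Illustrates the query interface a dynamic new-view-synthesis model
    exposes; the tiny MLP is a placeholder, not Tracker-NeRF itself.
    """

    def __init__(self, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + 1 + 3, hidden),  # point, time, view direction
            nn.ReLU(),
            nn.Linear(hidden, 4),          # density + RGB
        )

    def forward(self, x, t, d):
        out = self.net(torch.cat([x, t, d], dim=-1))
        density = torch.relu(out[..., :1])
        rgb = torch.sigmoid(out[..., 1:])
        return density, rgb

# New-view, new-time synthesis = evaluating at unseen (t, d) along camera rays.
field = TimeConditionedField()
density, rgb = field(torch.rand(64, 3), torch.full((64, 1), 0.37), torch.rand(64, 3))
```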