Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling
We study 3D shape modeling from a single image and make contributions to it
in three aspects. First, we present Pix3D, a large-scale benchmark of diverse
image-shape pairs with pixel-level 2D-3D alignment. Pix3D has wide applications
in shape-related tasks including reconstruction, retrieval, viewpoint
estimation, etc. Building such a large-scale dataset, however, is highly
challenging; existing datasets either contain only synthetic data, or lack
precise alignment between 2D images and 3D shapes, or only have a small number
of images. Second, we calibrate the evaluation criteria for 3D shape
reconstruction through behavioral studies, and use them to objectively and
systematically benchmark cutting-edge reconstruction algorithms on Pix3D.
Third, we design a novel model that simultaneously performs 3D reconstruction
and pose estimation; our multi-task learning approach achieves state-of-the-art
performance on both tasks. Comment: CVPR 2018. The first two authors contributed equally to this work.
Project page: http://pix3d.csail.mit.ed
From 3D Point Clouds to Pose-Normalised Depth Maps
We consider the problem of generating either pairwise-aligned or pose-normalised depth maps from noisy 3D point clouds in relatively unrestricted poses. Our system is deployed in a 3D face alignment application and consists of the following four stages: (i) data filtering, (ii) nose tip identification and sub-vertex localisation, (iii) computation of the (relative) face orientation, (iv) generation of either a pose-aligned or a pose-normalised depth map. We generate an implicit radial basis function (RBF) model of the facial surface, which is employed within all four stages of the process. For example, in stage (ii), construction of novel invariant features is based on sampling this RBF over a set of concentric spheres to give a spherically-sampled RBF (SSR) shape histogram. In stage (iii), a second novel descriptor, called an isoradius contour curvature signal, is defined, which allows rotational alignment to be determined using a simple process of 1D correlation. We test our system on both the University of York (UoY) 3D face dataset and the Face Recognition Grand Challenge (FRGC) 3D data. For the more challenging UoY data, our SSR descriptors significantly outperform three variants of spin images, successfully identifying nose vertices at a rate of 99.6%. Nose localisation performance on the higher-quality FRGC data, which has only small pose variations, is 99.9%. Our best system successfully normalises the pose of 3D faces at rates of 99.1% (UoY data) and 99.6% (FRGC data).
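The abstract above says that rotational alignment in stage (iii) reduces to 1D correlation of a periodic signal. The sketch below illustrates only that general idea, not the authors' implementation: it assumes two equal-length signals sampled at uniform angular steps around a closed contour, and recovers the circular shift between them with an FFT-based circular cross-correlation. The function names `circular_shift_by_correlation` and `shift_to_degrees`, and the synthetic test signal, are hypothetical and chosen for illustration.

```python
import numpy as np


def circular_shift_by_correlation(reference, signal):
    """Return the shift k such that np.roll(reference, k) best matches signal.

    Both inputs are assumed (for this sketch) to be 1D signals of equal
    length, sampled at uniform angular steps around a closed contour.
    """
    reference = np.asarray(reference, dtype=float)
    signal = np.asarray(signal, dtype=float)
    if reference.ndim != 1 or reference.shape != signal.shape:
        raise ValueError("inputs must be 1D arrays of equal length")

    # Remove the mean so a constant offset does not dominate the correlation.
    ref = reference - reference.mean()
    sig = signal - signal.mean()

    # Circular cross-correlation via the FFT:
    # corr[k] = sum_n sig[n] * ref[(n - k) mod N]
    corr = np.fft.ifft(np.fft.fft(sig) * np.conj(np.fft.fft(ref))).real
    return int(np.argmax(corr))


def shift_to_degrees(shift, num_samples):
    """Convert a shift in samples to degrees for a full 360-degree contour."""
    return 360.0 * shift / num_samples


# Synthetic example: a periodic signal and a copy rotated by 40 samples.
angles = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
reference = np.sin(angles) + 0.5 * np.cos(3.0 * angles)
rotated = np.roll(reference, 40)

shift = circular_shift_by_correlation(reference, rotated)
rotation_deg = shift_to_degrees(shift, reference.size)
```

With one sample per degree, the recovered shift maps directly to a rotation angle. With real, noisy signals the correlation peak is only an estimate, and its resolution is limited by the angular sampling step.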