We study 3D shape modeling from a single image and make contributions to it
in three aspects. First, we present Pix3D, a large-scale benchmark of diverse
image-shape pairs with pixel-level 2D-3D alignment. Pix3D has wide applications
in shape-related tasks including reconstruction, retrieval, viewpoint
estimation, etc. Building such a large-scale dataset, however, is highly
challenging; existing datasets either contain only synthetic data, or lack
precise alignment between 2D images and 3D shapes, or only have a small number
of images. Second, we calibrate the evaluation criteria for 3D shape
reconstruction through behavioral studies, and use them to objectively and
systematically benchmark cutting-edge reconstruction algorithms on Pix3D.
Third, we design a novel model that simultaneously performs 3D reconstruction
and pose estimation; our multi-task learning approach achieves state-of-the-art
performance on both tasks.Comment: CVPR 2018. The first two authors contributed equally to this work.
Project page: http://pix3d.csail.mit.ed