Learning Shape Priors for Single-View 3D Completion and Reconstruction
The problem of single-view 3D shape completion or reconstruction is
challenging, because among the many possible shapes that explain an
observation, most are implausible and do not correspond to natural objects.
Recent research in the field has tackled this problem by exploiting the
expressiveness of deep convolutional networks. However, there is another level
of ambiguity that is often overlooked: among plausible shapes, there are still
multiple shapes that fit the 2D image equally well; i.e., the ground truth
shape is non-deterministic given a single-view input. Existing fully supervised
approaches fail to address this issue, and often produce blurry mean shapes
with smooth surfaces but no fine details.
In this paper, we propose ShapeHD, pushing the limit of single-view shape
completion and reconstruction by integrating deep generative models with
adversarially learned shape priors. The learned priors serve as a regularizer,
penalizing the model only if its output is unrealistic, not if it deviates from
the ground truth. Our design thus overcomes both levels of ambiguity
aforementioned. Experiments demonstrate that ShapeHD outperforms state of the
art by a large margin in both shape completion and shape reconstruction on
multiple real datasets.
Comment: ECCV 2018. The first two authors contributed equally to this work.
Project page: http://shapehd.csail.mit.edu
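The abstract's key idea is that the adversarially learned shape prior acts as a regularizer that penalizes the model only when its output looks unrealistic, not when it merely deviates from the ground truth. A minimal numpy sketch of that loss structure (the function names, the threshold, and the weighting are illustrative assumptions, not the paper's actual formulation):

```python
import numpy as np

def realism_penalty(disc_score, threshold=0.5):
    """Hypothetical shape-prior term: penalize a predicted shape only
    when a learned discriminator scores it as unrealistic. A shape the
    discriminator accepts incurs no penalty, even if it differs from
    the ground truth."""
    # disc_score in [0, 1]: probability the shape looks natural.
    return max(0.0, threshold - disc_score)

def total_loss(voxel_pred, voxel_gt, disc_score, w_prior=0.2):
    """Supervised reconstruction loss plus the adversarial prior as a
    regularizer (the weight w_prior is an illustrative assumption)."""
    recon = float(np.mean((voxel_pred - voxel_gt) ** 2))
    return recon + w_prior * realism_penalty(disc_score)
```

Because the prior term vanishes for any realistic shape, the network is free to commit to one sharp, plausible shape instead of averaging over all shapes consistent with the view, which is what produces the blurry mean shapes described above.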
Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling
We study 3D shape modeling from a single image and make contributions to it
in three aspects. First, we present Pix3D, a large-scale benchmark of diverse
image-shape pairs with pixel-level 2D-3D alignment. Pix3D has wide applications
in shape-related tasks including reconstruction, retrieval, viewpoint
estimation, etc. Building such a large-scale dataset, however, is highly
challenging; existing datasets either contain only synthetic data, or lack
precise alignment between 2D images and 3D shapes, or only have a small number
of images. Second, we calibrate the evaluation criteria for 3D shape
reconstruction through behavioral studies, and use them to objectively and
systematically benchmark cutting-edge reconstruction algorithms on Pix3D.
Third, we design a novel model that simultaneously performs 3D reconstruction
and pose estimation; our multi-task learning approach achieves state-of-the-art
performance on both tasks.
Comment: CVPR 2018. The first two authors contributed equally to this work.
Project page: http://pix3d.csail.mit.edu
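The abstract mentions calibrating evaluation criteria for 3D shape reconstruction. One widely used criterion in this setting is the Chamfer distance between predicted and ground-truth point sets; a minimal numpy sketch (shown here as a representative metric, not necessarily the exact criterion the paper calibrates):

```python
import numpy as np

def chamfer_distance(pts_a, pts_b):
    """Symmetric Chamfer distance between two point sets of shapes
    (N, 3) and (M, 3): for each point, the squared distance to its
    nearest neighbour in the other set, averaged over both
    directions."""
    # Pairwise squared distances via broadcasting: (N, M).
    d2 = np.sum((pts_a[:, None, :] - pts_b[None, :, :]) ** 2, axis=-1)
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())
```

Lower is better; the metric is zero exactly when each set is contained in the other, which is why benchmarks of the kind described above typically sample both shapes to the same density before comparing.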
MVTN: Learning Multi-View Transformations for 3D Understanding
Multi-view projection techniques have proven highly effective for 3D shape
recognition, learning how to combine information from multiple view-points.
However, the camera view-points from which these views are obtained are often
fixed for all shapes. To overcome the static nature of current multi-view
techniques, we propose learning these view-points. Specifically, we introduce
the Multi-View Transformation Network (MVTN), which uses differentiable
rendering to determine optimal view-points for 3D shape recognition. As a
result, MVTN can be trained end-to-end with any multi-view network for 3D shape
classification. We integrate MVTN into a novel adaptive multi-view pipeline
that is capable of rendering both 3D meshes and point clouds. Our approach
demonstrates state-of-the-art performance in 3D classification and shape
retrieval on several benchmarks (ModelNet40, ScanObjectNN, ShapeNet Core55).
Further analysis indicates that our approach exhibits improved robustness to
occlusion compared to other methods. We also investigate additional aspects of
MVTN, such as 2D pretraining and its use for segmentation. To support further
research in this area, we have released MVTorch, a PyTorch library for 3D
understanding and generation using multi-view projections.
Comment: journal extension (under review) of the ICCV 2021 paper arXiv:2011.1324
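The core of the approach above is that view-points are regressed per shape and passed to a differentiable renderer, so the classifier's gradient can flow back into the view-predicting network. A minimal numpy sketch of the geometric step, converting regressed azimuth/elevation angles into camera positions on a sphere (the function name and the fixed radius are illustrative assumptions; MVTN's actual pipeline uses a differentiable renderer such as those for meshes and point clouds):

```python
import numpy as np

def viewpoints_to_cameras(azim, elev, radius=2.0):
    """Map per-view azimuth/elevation angles (radians), as a small
    view-predicting network might regress them for each shape, to 3D
    camera positions on a sphere around the object. In a full
    MVTN-style pipeline these positions parameterize a differentiable
    renderer, making the view-points end-to-end trainable."""
    azim, elev = np.asarray(azim), np.asarray(elev)
    x = radius * np.cos(elev) * np.cos(azim)
    y = radius * np.cos(elev) * np.sin(azim)
    z = radius * np.sin(elev)
    return np.stack([x, y, z], axis=-1)  # (num_views, 3)
```

Since the mapping is smooth in the angles, any multi-view classifier consuming the rendered images can supply gradients to adapt the view-points per shape, rather than keeping them fixed for all shapes as earlier multi-view methods did.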