Weakly supervised 3D Reconstruction with Adversarial Constraint
Supervised 3D reconstruction has witnessed significant progress through the
use of deep neural networks. However, this increase in performance requires
large-scale annotations of 2D/3D data. In this paper, we explore inexpensive 2D
supervision as an alternative for expensive 3D CAD annotation. Specifically, we
use foreground masks as weak supervision through a raytrace pooling layer that
enables perspective projection and backpropagation. Additionally, since the 3D
reconstruction from masks is an ill-posed problem, we propose to constrain the
3D reconstruction to the manifold of unlabeled realistic 3D shapes that match
mask observations. We demonstrate that learning a log-barrier solution to this
constrained optimization problem resembles the GAN objective, enabling the use
of existing tools for training GANs. We evaluate and analyze the
manifold-constrained reconstruction on various datasets for single- and
multi-view reconstruction of both synthetic and real images.
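
The mask supervision described above can be illustrated with a minimal sketch. It is not the paper's raytrace pooling layer: it assumes parallel (orthographic) rays along one voxel axis instead of perspective projection, and the names `project_mask` and `mask_loss` are hypothetical. It only shows the idea of pooling occupancy along each ray and comparing the result with a foreground mask.

```python
import numpy as np

def project_mask(occupancy: np.ndarray, axis: int = 2) -> np.ndarray:
    """Max-pool voxel occupancy along parallel rays.

    Assumption: orthographic rays aligned with one grid axis, a
    simplification of the perspective ray pooling named in the abstract.
    """
    return occupancy.max(axis=axis)

def mask_loss(pred_mask: np.ndarray, gt_mask: np.ndarray, eps: float = 1e-7) -> float:
    """Binary cross-entropy between a projected mask and a foreground mask."""
    p = np.clip(pred_mask, eps, 1.0 - eps)
    return float(-(gt_mask * np.log(p) + (1.0 - gt_mask) * np.log(1.0 - p)).mean())

# Toy 4x4x4 occupancy grid with a 2x2 occupied patch at depth index 2.
occ = np.zeros((4, 4, 4))
occ[1:3, 1:3, 2] = 0.9
pred = project_mask(occ)

# Ground-truth silhouette matching the occupied patch.
gt = np.zeros((4, 4))
gt[1:3, 1:3] = 1.0
loss = mask_loss(pred, gt)
```

Because the maximum selects one voxel per ray, a gradient of this loss would flow only to that voxel, which is one reason silhouettes alone leave the 3D shape under-constrained.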
Neural apparent BRDF fields for multiview photometric stereo
We propose to tackle the multiview photometric stereo problem using an
extension of Neural Radiance Fields (NeRFs), conditioned on light source
direction. The geometric part of our neural representation predicts surface
normal direction, allowing us to reason about local surface reflectance. The
appearance part of our neural representation is decomposed into a neural
bidirectional reflectance function (BRDF), learnt as part of the fitting
process, and a shadow prediction network (conditioned on light source
direction) allowing us to model the apparent BRDF. This balance of learnt
components with inductive biases based on physical image formation models
allows us to extrapolate far from the light source and viewer directions
observed during training. We demonstrate our approach on a multiview
photometric stereo benchmark and show that competitive performance can be
obtained with the neural density representation of a NeRF. Comment: 9 pages, 6 figures, 1 table
DiVA-360: The Dynamic Visuo-Audio Dataset for Immersive Neural Fields
Advances in neural fields are enabling high-fidelity capture of the shape and
appearance of static and dynamic scenes. However, their capabilities lag behind
those offered by representations such as pixels or meshes due to algorithmic
challenges and the lack of large-scale real-world datasets. We address the
dataset limitation with DiVA-360, a real-world 360 dynamic visual-audio dataset
with synchronized multimodal visual, audio, and textual information about
table-scale scenes. It contains 46 dynamic scenes, 30 static scenes, and 95
static objects spanning 11 categories captured using a new hardware system
using 53 RGB cameras at 120 FPS and 6 microphones for a total of 8.6M image
frames and 1360 s of dynamic data. We provide detailed text descriptions for
all scenes, foreground-background segmentation masks, category-specific 3D pose
alignment for static objects, as well as metrics for comparison. Our data,
hardware and software, and code are available at https://diva360.github.io/
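
The capture figures quoted above can be sanity-checked with a short calculation. This sketch assumes every one of the 53 cameras records the full 1360 s at 120 FPS; the constant names are illustrative and do not come from the dataset's code.

```python
# Figures taken from the abstract.
CAMERAS = 53
FPS = 120
DYNAMIC_SECONDS = 1360

# Assumption: all cameras record for the whole dynamic duration.
total_frames = CAMERAS * FPS * DYNAMIC_SECONDS
print(f"{total_frames:,} frames (~{total_frames / 1e6:.1f}M)")
# prints "8,649,600 frames (~8.6M)"
```

Under that assumption the product agrees with the stated total of 8.6M image frames.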
Deep-Learning-Based 3-D Surface Reconstruction—A Survey
In the last decade, deep learning (DL) has significantly impacted industry and science. Initially largely motivated by computer vision tasks in 2-D imagery, the focus has shifted toward 3-D data analysis. In particular, 3-D surface reconstruction, i.e., reconstructing a 3-D shape from sparse input, is of great interest to a large variety of application fields. DL-based approaches show promising quantitative and qualitative surface reconstruction performance compared to traditional computer vision and geometric algorithms. This survey provides a comprehensive overview of these DL-based methods for 3-D surface reconstruction. To this end, we will first discuss input data modalities, such as volumetric data, point clouds, and RGB, single-view, multiview, and depth images, along with corresponding acquisition technologies and common benchmark datasets. For practical purposes, we also discuss evaluation metrics enabling us to judge the reconstructive performance of different methods. The main part of the document will introduce a methodological taxonomy ranging from point- and mesh-based techniques to volumetric and implicit neural approaches. Recent research trends, both methodological and for applications, are highlighted, pointing toward future developments.
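
The abstract does not name the evaluation metrics it covers. As an illustration only, the sketch below computes the Chamfer distance, a metric commonly used to compare a reconstructed point set with a reference one. Conventions vary (squared versus unsquared distances, sum versus mean); this version assumes unsquared Euclidean distances averaged per set, and `chamfer_distance` is a hypothetical helper name.

```python
import numpy as np

def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumption: unsquared Euclidean distances, averaged over each set
    and summed across both directions.
    """
    # Pairwise distance matrix of shape (N, M).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Nearest neighbor in b for each point of a, and vice versa.
    return float(d.min(axis=1).mean() + d.min(axis=0).mean())

pred = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
ref = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 1.0]])
score = chamfer_distance(pred, ref)  # 1.0 for these two sets
```

The dense pairwise matrix costs O(N·M) memory, so larger point clouds would typically use a spatial index such as a k-d tree instead.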