Weakly supervised 3D Reconstruction with Adversarial Constraint
Supervised 3D reconstruction has witnessed a significant progress through the
use of deep neural networks. However, this increase in performance requires
large scale annotations of 2D/3D data. In this paper, we explore inexpensive 2D
supervision as an alternative for expensive 3D CAD annotation. Specifically, we
use foreground masks as weak supervision through a raytrace pooling layer that
enables perspective projection and backpropagation. Additionally, since the 3D
reconstruction from masks is an ill-posed problem, we propose to constrain the
3D reconstruction to the manifold of unlabeled realistic 3D shapes that match
mask observations. We demonstrate that learning a log-barrier solution to this
constrained optimization problem resembles the GAN objective, enabling the use
of existing tools for training GANs. We evaluate and analyze the manifold
constrained reconstruction on various datasets for single and multi-view
reconstruction of both synthetic and real images.
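The log-barrier treatment of the manifold constraint described above can be sketched numerically. This is a minimal illustration, not the paper's implementation: `log_barrier_loss`, the scalar `disc_score` standing in for a discriminator output, and the weighting are all hypothetical.

```python
import math

def log_barrier_loss(mask_loss, disc_score, weight=1.0, eps=1e-8):
    """Combine a mask-reprojection loss with a log-barrier term.
    A discriminator score near 1 (realistic shape) contributes little;
    near 0, the barrier grows, pushing the reconstruction back onto
    the manifold of realistic shapes."""
    return mask_loss - weight * math.log(disc_score + eps)

# A realistic shape is penalized less than an unrealistic one
# with the same reprojection error.
print(log_barrier_loss(0.5, 0.9))  # ~0.605
print(log_barrier_loss(0.5, 0.1))  # ~2.803
```

Because the barrier term has the same form as the generator objective in a GAN, standard GAN training machinery applies, which is the connection the abstract draws.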
EvAC3D: From Event-based Apparent Contours to 3D Models via Continuous Visual Hulls
3D reconstruction from multiple views is a well-established computer vision
field with many deployed applications. The state of the art is based on
traditional RGB frames, which enable optimization of photo-consistency across
views. In this paper, we study the problem of 3D reconstruction from
event-cameras, motivated by the advantages of event-based cameras in terms of
low power and latency, as well as by the biological evidence that eyes in
nature capture the same data and still perceive 3D shape well. The foundation of our
hypothesis that 3D reconstruction is feasible using events lies in the
information contained in the occluding contours and in the continuous scene
acquisition with events. We propose Apparent Contour Events (ACE), a novel
event-based representation that defines the geometry of the apparent contour of
an object. We represent ACE by a spatially and temporally continuous implicit
function defined in the event x-y-t space. Furthermore, we design a novel
continuous Voxel Carving algorithm enabled by the high temporal resolution of
the Apparent Contour Events. To evaluate the performance of the method, we
collect MOEC-3D, a 3D event dataset of a set of common real-world objects. We
demonstrate the ability of EvAC3D to reconstruct high-fidelity mesh surfaces
from real event sequences while allowing the refinement of the 3D
reconstruction for each individual event.
Comment: 16 pages, 8 figures, European Conference on Computer Vision (ECCV) 202
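The continuous voxel carving that EvAC3D builds on generalizes classic discrete voxel carving, which can be sketched as follows. This is the conventional silhouette-based version, not the paper's event-driven continuous algorithm; `carve`, `ortho`, and the toy data are hypothetical.

```python
import numpy as np

def carve(voxel_centers, silhouettes, project):
    """Keep only voxels whose projection lands inside every silhouette
    (the visual-hull intersection). `project` is a hypothetical camera
    model mapping (N, 3) world points to integer pixel (u, v) per view."""
    keep = np.ones(len(voxel_centers), dtype=bool)
    for view_idx, sil in enumerate(silhouettes):
        u, v = project(voxel_centers, view_idx)
        inside = np.zeros(len(voxel_centers), dtype=bool)
        valid = (u >= 0) & (u < sil.shape[1]) & (v >= 0) & (v < sil.shape[0])
        inside[valid] = sil[v[valid], u[valid]]
        keep &= inside
    return voxel_centers[keep]

# Toy demo: two orthographic views of a 3-voxel diagonal.
def ortho(pts, view_idx):
    pts = pts.astype(int)
    return (pts[:, 0], pts[:, 1]) if view_idx == 0 else (pts[:, 0], pts[:, 2])

voxels = np.array([[0, 0, 0], [1, 1, 1], [2, 2, 2]])
sil = np.zeros((3, 3), dtype=bool)
sil[0, 0] = sil[1, 1] = True  # only the first two voxels are visible
print(carve(voxels, [sil, sil], ortho))
```

The event-based version replaces whole-frame silhouettes with individual Apparent Contour Events, so carving can be refined at the temporal resolution of single events.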
Lifting GIS Maps into Strong Geometric Context for Scene Understanding
Contextual information can have a substantial impact on the performance of
visual tasks such as semantic segmentation, object detection, and geometric
estimation. Data stored in Geographic Information Systems (GIS) offers a rich
source of contextual information that has been largely untapped by computer
vision. We propose to leverage such information for scene understanding by
combining GIS resources with large sets of unorganized photographs using
Structure from Motion (SfM) techniques. We present a pipeline to quickly
generate strong 3D geometric priors from 2D GIS data using SfM models aligned
with minimal user input. Given an image resectioned against this model, we
generate robust predictions of depth, surface normals, and semantic labels. We
show that the predicted geometry is substantially more accurate than that of
other single-image depth estimation methods. We then demonstrate the
utility of these contextual constraints for re-scoring pedestrian detections,
and use these GIS contextual features alongside object detection score maps to
improve a CRF-based semantic segmentation framework, boosting accuracy over
baseline models.
SCADE: NeRFs from Space Carving with Ambiguity-Aware Depth Estimates
Neural radiance fields (NeRFs) have enabled high fidelity 3D reconstruction
from multiple 2D input views. However, a well-known drawback of NeRFs is the
less-than-ideal performance under a small number of views, due to insufficient
constraints enforced by volumetric rendering. To address this issue, we
introduce SCADE, a novel technique that improves NeRF reconstruction quality on
sparse, unconstrained input views for in-the-wild indoor scenes. To constrain
NeRF reconstruction, we leverage geometric priors in the form of per-view depth
estimates produced with state-of-the-art monocular depth estimation models,
which can generalize across scenes. A key challenge is that monocular depth
estimation is an ill-posed problem, with inherent ambiguities. To handle this
issue, we propose a new method that learns to predict, for each view, a
continuous, multimodal distribution of depth estimates using conditional
Implicit Maximum Likelihood Estimation (cIMLE). To disambiguate by exploiting
multiple views, we introduce an original space carving loss that
guides the NeRF representation to fuse multiple hypothesized depth maps from
each view and distill from them a common geometry that is consistent with all
views. Experiments show that our approach enables higher fidelity novel view
synthesis from sparse views. Our project page can be found at
https://scade-spacecarving-nerfs.github.io .
Comment: CVPR 202
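The idea of supervising NeRF depth with a multimodal prior can be illustrated with a simplified per-pixel loss. This is not SCADE's actual space carving loss; `best_mode_depth_loss` and the toy tensors are hypothetical, showing only the core intuition that an ambiguous prior should penalize the nearest hypothesis rather than all of them.

```python
import numpy as np

def best_mode_depth_loss(rendered_depth, depth_hypotheses):
    """Compare the NeRF-rendered depth per pixel against K sampled
    monocular-depth hypotheses (e.g. from a cIMLE-style sampler) and
    penalize only the closest mode, so a multimodal prior does not
    pull the reconstruction toward a wrong mode.
    rendered_depth: (H, W); depth_hypotheses: (K, H, W)."""
    errors = (depth_hypotheses - rendered_depth[None]) ** 2  # (K, H, W)
    return errors.min(axis=0).mean()

rendered = np.full((2, 2), 2.0)
hypotheses = np.stack([np.full((2, 2), 2.0), np.full((2, 2), 5.0)])
print(best_mode_depth_loss(rendered, hypotheses))  # 0.0: one mode matches
```

Aggregating such a term over all views is what lets the optimization distill a single geometry consistent with every view's hypotheses.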
BSP-fields: An Exact Representation of Polygonal Objects by Differentiable Scalar Fields Based on Binary Space Partitioning
The problem considered in this work is to find a dimension-independent algorithm for the generation of signed scalar fields exactly representing polygonal objects and satisfying the following requirements: the defining real function takes zero value exactly at the polygonal object boundary; no extra zero-value isosurfaces are generated; and the function is C1 continuous in the entire domain. The proposed algorithms are based on the binary space partitioning (BSP) of the object by the planes passing through the polygonal faces and are independent of the object genus, the number of disjoint components, and holes in the initial polygonal mesh. Several extensions to the basic algorithm are proposed to satisfy the selected optimization criteria. The generated BSP-fields allow for applying techniques of function-based modeling to already existing legacy objects from the CAD and computer animation areas, which is illustrated by several examples.
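The simplest building block of such a field can be sketched for a single convex BSP cell, where the signed field is a minimum over signed plane distances. This sketch is hypothetical and deliberately reduced: a plain min is only C0 along plane intersections, whereas the paper's construction handles arbitrary polygonal objects and achieves the stated C1 continuity.

```python
import numpy as np

def convex_signed_field(point, planes):
    """Signed field for a convex cell bounded by planes with
    inward-pointing unit normals: a point x is inside when
    dot(n, x) - d >= 0 for every plane, so the min of those signed
    distances is positive inside, exactly zero on the boundary,
    and negative outside."""
    return min(np.dot(n, point) - d for n, d in planes)

# Unit square [0, 1]^2 as the intersection of four half-planes.
square = [
    (np.array([1.0, 0.0]), 0.0),    # x >= 0
    (np.array([-1.0, 0.0]), -1.0),  # x <= 1
    (np.array([0.0, 1.0]), 0.0),    # y >= 0
    (np.array([0.0, -1.0]), -1.0),  # y <= 1
]
print(convex_signed_field(np.array([0.5, 0.5]), square))  # 0.5 (inside)
print(convex_signed_field(np.array([1.0, 0.5]), square))  # 0.0 (on boundary)
```

The zero-exactly-on-the-boundary property visible here is the first of the paper's three requirements; satisfying the other two for non-convex objects is what the BSP traversal and its extensions provide.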
Building Proteins in a Day: Efficient 3D Molecular Reconstruction
Discovering the 3D atomic structure of molecules such as proteins and viruses
is a fundamental research problem in biology and medicine. Electron
Cryomicroscopy (Cryo-EM) is a promising vision-based technique for structure
estimation which attempts to reconstruct 3D structures from 2D images. This
paper addresses the challenging problem of 3D reconstruction from 2D Cryo-EM
images. A new framework for estimation is introduced which relies on modern
stochastic optimization techniques to scale to large datasets. We also
introduce a novel technique which reduces the cost of evaluating the objective
function during optimization by over five orders of magnitude. The net result
is an approach capable of estimating 3D molecular structure from large scale
datasets in about a day on a single workstation.
Comment: To be presented at IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 201
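The scaling argument behind the abstract's use of stochastic optimization is that each iteration touches only a small batch, so per-step cost is independent of dataset size. A toy sketch of that principle follows; `sgd_mean` and the synthetic data are hypothetical and unrelated to the paper's Cryo-EM objective.

```python
import numpy as np

rng = np.random.default_rng(0)

def sgd_mean(data, batch_size=32, steps=500, lr=0.1):
    """Fit the mean of `data` by SGD on the objective 0.5*(theta - x)^2.
    Each step samples one small batch, so the per-iteration cost does
    not grow with len(data)."""
    theta = 0.0
    for _ in range(steps):
        batch = rng.choice(data, size=batch_size)
        theta -= lr * (theta - batch.mean())  # batch-averaged gradient
    return theta

data = rng.normal(loc=3.0, scale=1.0, size=100_000)
print(sgd_mean(data))  # close to 3.0
```

The same pattern, applied to a far more expensive reconstruction objective, is what makes a day-scale single-workstation run plausible for large Cryo-EM datasets.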