405 research outputs found
Semantically Aware Urban 3D Reconstruction with Plane-Based Regularization
We propose a method for urban 3D reconstruction, which incorporates semantic information and plane priors within the reconstruction process in order to generate visually appealing 3D models. We introduce a plane detection algorithm using 3D lines, which detects a more
complete and less spurious plane set compared to point-based methods
in urban environments. Further, the proposed normalized visibility-based
energy formulation eases the combination of several energy terms within
a tetrahedra occupancy labeling algorithm and, hence, is well suited for
combining it with class specific smoothness terms. As a result, we produce visually appealing and detailed building models (i.e., straight edges
and planar surfaces) and a smooth reconstruction of the surroundings
Semantically Guided Depth Upsampling
We present a novel method for accurate and efficient up- sampling of sparse
depth data, guided by high-resolution imagery. Our approach goes beyond the use
of intensity cues only and additionally exploits object boundary cues through
structured edge detection and semantic scene labeling for guidance. Both cues
are combined within a geodesic distance measure that allows for
boundary-preserving depth in- terpolation while utilizing local context. We
model the observed scene structure by locally planar elements and formulate the
upsampling task as a global energy minimization problem. Our method determines
glob- ally consistent solutions and preserves fine details and sharp depth
bound- aries. In our experiments on several public datasets at different levels
of application, we demonstrate superior performance of our approach over the
state-of-the-art, even for very sparse measurements.Comment: German Conference on Pattern Recognition 2016 (Oral
PlaNeRF: SVD Unsupervised 3D Plane Regularization for NeRF Large-Scale Scene Reconstruction
Neural Radiance Fields (NeRF) enable 3D scene reconstruction from 2D images
and camera poses for Novel View Synthesis (NVS). Although NeRF can produce
photorealistic results, it often suffers from overfitting to training views,
leading to poor geometry reconstruction, especially in low-texture areas. This
limitation restricts many important applications which require accurate
geometry, such as extrapolated NVS, HD mapping and scene editing. To address
this limitation, we propose a new method to improve NeRF's 3D structure using
only RGB images and semantic maps. Our approach introduces a novel plane
regularization based on Singular Value Decomposition (SVD), that does not rely
on any geometric prior. In addition, we leverage the Structural Similarity
Index Measure (SSIM) in our loss design to properly initialize the volumetric
representation of NeRF. Quantitative and qualitative results show that our
method outperforms popular regularization approaches in accurate geometry
reconstruction for large-scale outdoor scenes and achieves SoTA rendering
quality on the KITTI-360 NVS benchmark.Comment: 14 pages, 7 figure
Semantically Derived Geometric Constraints for {MVS} Reconstruction of Textureless Areas
Conventional multi-view stereo (MVS) approaches based on photo-consistency measures are generally robust, yet often fail in calculating valid depth pixel estimates in low textured areas of the scene. In this study, a novel approach is proposed to tackle this challenge by leveraging semantic priors into a PatchMatch-based MVS in order to increase confidence and support depth and normal map estimation. Semantic class labels on image pixels are used to impose class-specific geometric constraints during multiview stereo, optimising the depth estimation on weakly supported, textureless areas, commonly present in urban scenarios of building facades, indoor scenes, or aerial datasets. Detecting dominant shapes, e.g., planes, with RANSAC, an adjusted cost function is introduced that combines and weighs both photometric and semantic scores propagating, thus, more accurate depth estimates. Being adaptive, it fills in apparent information gaps and smoothing local roughness in problematic regions while at the same time preserves important details. Experiments on benchmark and custom datasets demonstrate the effectiveness of the presented approach
Lifting GIS Maps into Strong Geometric Context for Scene Understanding
Contextual information can have a substantial impact on the performance of
visual tasks such as semantic segmentation, object detection, and geometric
estimation. Data stored in Geographic Information Systems (GIS) offers a rich
source of contextual information that has been largely untapped by computer
vision. We propose to leverage such information for scene understanding by
combining GIS resources with large sets of unorganized photographs using
Structure from Motion (SfM) techniques. We present a pipeline to quickly
generate strong 3D geometric priors from 2D GIS data using SfM models aligned
with minimal user input. Given an image resectioned against this model, we
generate robust predictions of depth, surface normals, and semantic labels. We
show that the precision of the predicted geometry is substantially more
accurate other single-image depth estimation methods. We then demonstrate the
utility of these contextual constraints for re-scoring pedestrian detections,
and use these GIS contextual features alongside object detection score maps to
improve a CRF-based semantic segmentation framework, boosting accuracy over
baseline models
- …