Semantic 3D Occupancy Mapping through Efficient High Order CRFs
Semantic 3D mapping can be used for many applications such as robot
navigation and virtual interaction. In recent years, there has been great
progress in semantic segmentation and geometric 3D mapping. However, it is
still challenging to combine these two tasks for accurate and large-scale
semantic mapping from images. In the paper, we propose an incremental and
(near) real-time semantic mapping system. A 3D scrolling occupancy grid map is
built to represent the world, which is memory and computationally efficient and
bounded for large scale environments. We utilize the CNN segmentation as prior
prediction and further optimize 3D grid labels through a novel CRF model.
Superpixels are utilized to enforce smoothness and form a robust P^N high-order
potential. An efficient mean field inference is developed for the graph
optimization. We evaluate our system on the KITTI dataset and improve the
segmentation accuracy by 10% over existing systems. Comment: IROS 201
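The mean field inference mentioned in this abstract can be illustrated with a toy sketch. The function below iterates approximate marginals over CNN-derived unary scores and adds a soft superpixel-consistency message, a crude stand-in for the robust P^N high-order potential; the potentials, weights, and function names here are illustrative assumptions, not the paper's actual model.

```python
# Minimal mean-field sketch for a CRF with a superpixel consistency term.
# The toy potentials and weights are illustrative, not the paper's model.
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def mean_field(unary, superpixels, weight=1.0, iters=10):
    """unary: per-node label score lists (e.g. CNN log-probabilities).
    superpixels: lists of node indices; each node belongs to exactly one.
    Nodes in a superpixel are softly pushed toward a shared label."""
    n_labels = len(unary[0])
    node_sp = {i: si for si, members in enumerate(superpixels) for i in members}
    q = [softmax(u) for u in unary]
    for _ in range(iters):
        # average label distribution inside each superpixel
        sp_mean = [
            [sum(q[i][l] for i in members) / len(members) for l in range(n_labels)]
            for members in superpixels
        ]
        # update each node's marginal with unary score plus superpixel message
        q = [
            softmax([u[l] + weight * sp_mean[node_sp[i]][l] for l in range(n_labels)])
            for i, u in enumerate(unary)
        ]
    return q
```

Running it on two nodes in one superpixel, where one node has a confident unary and the other is ambiguous, pulls the ambiguous node toward its neighbor's label.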
FrameNet: Learning Local Canonical Frames of 3D Surfaces from a Single RGB Image
In this work, we introduce the novel problem of identifying dense canonical
3D coordinate frames from a single RGB image. We observe that each pixel in an
image corresponds to a surface in the underlying 3D geometry, where a canonical
frame can be represented by three orthogonal axes, one along its
normal direction and two in its tangent plane. We propose an algorithm to
predict these axes from RGB. Our first insight is that canonical frames
computed automatically with recently introduced direction field synthesis
methods can provide training data for the task. Our second insight is that
networks designed for surface normal prediction provide better results when
trained jointly to predict canonical frames, and even better when trained to
also predict 2D projections of canonical frames. We conjecture this is because
projections of canonical tangent directions often align with local gradients in
images, and because those directions are tightly linked to 3D canonical frames
through projective geometry and orthogonality constraints. In our experiments,
we find that our method predicts 3D canonical frames that can be used in
applications ranging from surface normal estimation and feature matching to
augmented reality.
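The per-pixel frame the abstract describes (one axis along the normal, two in the tangent plane) can be constructed explicitly. The sketch below completes an orthonormal frame from a given normal; the tangent-axis choice is an arbitrary cross-product construction, a hypothetical stand-in for the learned/synthesized canonical directions the paper predicts.

```python
# Hedged sketch: build one orthonormal frame (normal + two tangent axes)
# for a surface normal. The tangent directions are chosen arbitrarily,
# not by the paper's learned direction-field synthesis.
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def canonical_frame(normal):
    n = normalize(normal)
    # any helper vector not parallel to n works for the construction
    helper = [1.0, 0.0, 0.0] if abs(n[0]) < 0.9 else [0.0, 1.0, 0.0]
    t1 = normalize(cross(helper, n))  # first tangent axis
    t2 = cross(n, t1)                 # second tangent axis completes the frame
    return n, t1, t2                  # three mutually orthogonal unit axes
```

The returned axes satisfy the orthogonality constraint the abstract ties to the 2D projections: each pair has zero dot product and unit length.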
Occlusion-Aware Depth Estimation with Adaptive Normal Constraints
We present a new learning-based method for multi-frame depth estimation from
a color video, which is a fundamental problem in scene understanding, robot
navigation or handheld 3D reconstruction. While recent learning-based methods
estimate depth at high accuracy, 3D point clouds exported from their depth maps
often fail to preserve important geometric features (e.g., corners, edges,
planes) of man-made scenes. Widely-used pixel-wise depth errors do not
specifically penalize inconsistency on these features. These inaccuracies are
particularly severe when subsequent depth reconstructions are accumulated in an
attempt to scan a full environment containing man-made objects with such
features. Our depth estimation algorithm therefore introduces a Combined Normal
Map (CNM) constraint, which is designed to better preserve high-curvature
features and global planar regions. In order to further improve the depth
estimation accuracy, we introduce a new occlusion-aware strategy that
aggregates initial depth predictions from multiple adjacent views into one
final depth map and one occlusion probability map for the current reference
view. Our method outperforms the state-of-the-art in terms of depth estimation
accuracy, and preserves essential geometric features of man-made indoor scenes
much better than other algorithms. Comment: ECCV 202
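The occlusion-aware aggregation step the abstract describes, fusing per-view depth hypotheses into one reference depth map plus an occlusion probability map, can be sketched with a simple agreement heuristic. The median-consensus rule and the `tau` threshold below are illustrative assumptions, not the paper's learned aggregation strategy.

```python
# Hedged sketch of occlusion-aware depth aggregation: fuse per-view
# depth hypotheses into a fused depth and an occlusion probability.
# The median-agreement heuristic is illustrative, not the paper's method.

def aggregate_depths(depth_hypotheses, tau=0.1):
    """depth_hypotheses: list of per-view depth lists (same pixel count).
    Returns (fused_depth, occlusion_prob), one value per pixel."""
    n_views = len(depth_hypotheses)
    n_px = len(depth_hypotheses[0])
    fused, occ = [], []
    for p in range(n_px):
        vals = sorted(d[p] for d in depth_hypotheses)
        med = vals[n_views // 2]
        # views agreeing with the median are treated as unoccluded
        inliers = [v for v in vals if abs(v - med) <= tau * med]
        fused.append(sum(inliers) / len(inliers))
        occ.append(1.0 - len(inliers) / n_views)
    return fused, occ
```

On a pixel where one of three views disagrees strongly (e.g., an occluded view), the outlier is excluded from the fused depth and the occlusion probability rises to 1/3.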