Direct Image to Point Cloud Descriptors Matching for 6-DOF Camera Localization in Dense 3D Point Cloud
We propose a novel concept to directly match feature descriptors extracted
from RGB images, with feature descriptors extracted from 3D point clouds. We
use this concept to localize the position and orientation (pose) of the camera
of a query image in dense point clouds. We generate a dataset of matching 2D
and 3D descriptors, and use it to train a proposed Descriptor-Matcher
algorithm. To localize a query image in a point cloud, we extract 2D keypoints
and descriptors from the query image. Then the Descriptor-Matcher is used to
find the corresponding pairs of 2D and 3D keypoints by matching the 2D descriptors
with the pre-extracted 3D descriptors of the point cloud. This information is
used in a robust pose estimation algorithm to localize the query image in the
3D point cloud. Experiments demonstrate that directly matching 2D and 3D
descriptors is not only a viable idea but also achieves competitive accuracy
compared to other state-of-the-art approaches for camera pose localization.
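Once 2D-3D correspondences are in hand, the final step described above is a robust pose solver. The abstract does not specify which solver is used; as a minimal sketch of the underlying geometry, the Direct Linear Transform (DLT) below recovers a 3x4 projection matrix from matched 3D points and pixels (all data here is synthetic, for illustration only):

```python
import numpy as np

def estimate_projection_dlt(pts3d, pts2d):
    """Direct Linear Transform: recover a 3x4 projection matrix
    from >= 6 matched 3D points and their 2D pixel observations."""
    A = []
    for (X, Y, Z), (u, v) in zip(pts3d, pts2d):
        # Each correspondence contributes two linear constraints on P.
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    # The flattened P is the null-space vector of A (last row of V^T).
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    return Vt[-1].reshape(3, 4)

def project(P, pts3d):
    """Project 3D points with P and dehomogenize to pixels."""
    Xh = np.hstack([pts3d, np.ones((len(pts3d), 1))])
    x = (P @ Xh.T).T
    return x[:, :2] / x[:, 2:3]

# Synthetic ground-truth camera: identity rotation, translated 5 units back.
K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
Rt = np.hstack([np.eye(3), np.array([[0.], [0.], [5.]])])
P_true = K @ Rt

rng = np.random.default_rng(0)
pts3d = rng.uniform(-1, 1, (20, 3))
pts2d = project(P_true, pts3d)

P_est = estimate_projection_dlt(pts3d, pts2d)
err = np.abs(project(P_est, pts3d) - pts2d).max()  # near zero on clean data
```

In practice such a solver is wrapped in RANSAC so that the outlier matches inevitable in cross-modal descriptor matching do not corrupt the estimate.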
Robust Photogeometric Localization over Time for Map-Centric Loop Closure
Map-centric SLAM is emerging as an alternative to conventional graph-based
SLAM for its accuracy and efficiency in long-term mapping problems. However, in
map-centric SLAM, the process of loop closure differs from that of conventional
SLAM and the result of incorrect loop closure is more destructive and is not
reversible. In this paper, we present a tightly coupled photogeometric metric
localization for the loop closure problem in map-centric SLAM. In particular,
our method combines complementary constraints from LiDAR and camera sensors,
and validates loop closure candidates with sequential observations. The
proposed method provides a visual evidence-based outlier rejection where
failures caused by either place recognition or localization outliers can be
effectively removed. We demonstrate the proposed method is not only more
accurate than the conventional global ICP methods but is also robust to
incorrect initial pose guesses.
Comment: To appear in IEEE Robotics and Automation Letters, accepted January 201
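The abstract contrasts the proposed method with conventional global ICP and gates loop-closure candidates on evidence of a good alignment. As a hedged, numpy-only sketch (not the paper's method, which also fuses photometric constraints), the snippet below runs a few point-to-point ICP iterations and uses the final residual to accept or reject a candidate:

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rigid alignment (Kabsch) of already-matched points."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:        # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, mu_d - R @ mu_s

def icp_residual(src, dst, iters=10):
    """A few point-to-point ICP iterations; the final mean nearest-neighbour
    distance serves as a geometric gate on a loop-closure candidate."""
    cur = src.copy()
    for _ in range(iters):
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        R, t = best_rigid_transform(cur, dst[d2.argmin(1)])
        cur = cur @ R.T + t
    d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    return np.sqrt(d2.min(1)).mean()

rng = np.random.default_rng(1)
scan = rng.uniform(-2, 2, (60, 3))
# A true loop closure: the same place re-observed under a small motion.
theta = 0.1
Rz = np.array([[np.cos(theta), -np.sin(theta), 0],
               [np.sin(theta),  np.cos(theta), 0],
               [0, 0, 1]])
revisit = scan @ Rz.T + np.array([0.05, -0.02, 0.0])

res_good = icp_residual(revisit, scan)
res_bad = icp_residual(rng.uniform(-2, 2, (60, 3)), scan)  # unrelated place
accept = res_good < 0.05  # residual gate on the candidate
```

The paper's contribution is precisely that such a purely geometric gate is insufficient on its own; validating the candidate against sequential photometric observations catches failures this check misses.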
EP2P-Loc: End-to-End 3D Point to 2D Pixel Localization for Large-Scale Visual Localization
Visual localization is the task of estimating a 6-DoF camera pose of a query
image within a provided 3D reference map. Thanks to recent advances in various
3D sensors, 3D point clouds are becoming a more accurate and affordable option
for building the reference map, but research to match the points of 3D point
clouds with pixels in 2D images for visual localization remains challenging.
Existing approaches that jointly learn 2D-3D feature matching suffer from low
inlier counts due to representational differences between the two modalities, and
methods that sidestep this problem by casting matching as classification suffer
from poor refinement. In this work, we propose EP2P-Loc, a novel large-scale visual
localization method that mitigates such appearance discrepancy and enables
end-to-end training for pose estimation. To increase the number of inliers, we
propose a simple algorithm to remove invisible 3D points in the image, and find
all 2D-3D correspondences without keypoint detection. To reduce memory usage
and search complexity, we take a coarse-to-fine approach where we extract
patch-level features from 2D images, then perform 2D patch classification on
each 3D point, and obtain the exact corresponding 2D pixel coordinates through
positional encoding. Finally, for the first time in this task, we employ a
differentiable PnP for end-to-end training. In the experiments on newly curated
large-scale indoor and outdoor benchmarks based on 2D-3D-S and KITTI, we show
that our method achieves state-of-the-art performance compared to existing
visual localization and image-to-point-cloud registration methods.
Comment: Accepted to ICCV 202
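One concrete step above is removing 3D points that are invisible in the image before searching for correspondences. EP2P-Loc's exact algorithm is not given in the abstract; a common way to do this, sketched below under that assumption, is a coarse z-buffer: project the points and keep only the nearest point per pixel cell, discarding occluded and out-of-view points:

```python
import numpy as np

def visible_points(pts_cam, K, w, h, cell=8):
    """Keep 3D points (camera frame) that project inside a w x h image
    and are the nearest point in their pixel cell (a coarse z-buffer)."""
    z = pts_cam[:, 2]
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]
    in_img = (z > 0) & (uv[:, 0] >= 0) & (uv[:, 0] < w) \
                     & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    buf = {}  # pixel cell -> index of nearest point seen so far
    for i in np.flatnonzero(in_img):
        key = (int(uv[i, 0] // cell), int(uv[i, 1] // cell))
        if key not in buf or z[i] < z[buf[key]]:
            buf[key] = i
    keep = np.zeros(len(pts_cam), dtype=bool)
    keep[list(buf.values())] = True
    return keep

K = np.array([[400., 0., 160.], [0., 400., 120.], [0., 0., 1.]])
near = np.array([[0.1, 0.0, 2.0]])    # visible
far = np.array([[0.2, 0.0, 4.0]])     # same ray, occluded by `near`
behind = np.array([[0.0, 0.0, -1.0]]) # behind the camera
mask = visible_points(np.vstack([near, far, behind]), K, 320, 240)
# mask -> [True, False, False]
```

Pruning invisible points this way directly raises the inlier ratio that the downstream (here, differentiable) PnP solver sees.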
High-Precision Localization Using Ground Texture
Location-aware applications play an increasingly critical role in everyday
life. However, satellite-based localization (e.g., GPS) has limited accuracy
and can be unusable in dense urban areas and indoors. We introduce an
image-based global localization system that is accurate to a few millimeters
and performs reliable localization both indoors and outside. The key idea is to
capture and index distinctive local keypoints in ground textures. This is based
on the observation that ground textures including wood, carpet, tile, concrete,
and asphalt may look random and homogeneous, but all contain cracks, scratches,
or unique arrangements of fibers. These imperfections are persistent, and can
serve as local features. Our system incorporates a downward-facing camera to
capture the fine texture of the ground, together with an image processing
pipeline that locates the captured texture patch in a compact database
constructed offline. We demonstrate the capability of our system to robustly,
accurately, and quickly locate test images on various types of outdoor and
indoor ground surfaces.
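The pipeline above boils down to matching local keypoint descriptors from a query patch against a precomputed database. As a minimal sketch (the system's actual descriptors and index structure are not specified here; synthetic descriptors stand in for them), nearest-neighbour matching with Lowe's ratio test keeps only unambiguous matches:

```python
import numpy as np

def match_ratio_test(query, db, ratio=0.8):
    """Match query descriptors to a database by nearest neighbour,
    keeping a match only when its best distance clearly beats the
    second best (Lowe's ratio test)."""
    d = np.linalg.norm(query[:, None, :] - db[None, :, :], axis=-1)
    order = np.argsort(d, axis=1)
    best, second = order[:, 0], order[:, 1]
    rows = np.arange(len(query))
    ok = d[rows, best] < ratio * d[rows, second]
    return [(i, int(best[i])) for i in np.flatnonzero(ok)]

rng = np.random.default_rng(2)
db = rng.normal(size=(100, 32))            # indexed texture descriptors
query = db[[3, 17, 42, 8, 91]] \
        + rng.normal(scale=0.05, size=(5, 32))  # re-observed patches
matches = match_ratio_test(query, db)
# matches -> [(0, 3), (1, 17), (2, 42), (3, 8), (4, 91)]
```

Because ground-texture features are so distinctive, even this simple test yields the dense, reliable matches that millimeter-level localization depends on; a real system would back the database with an approximate nearest-neighbour index for speed.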
Scene Coordinate Regression with Angle-Based Reprojection Loss for Camera Relocalization
Image-based camera relocalization is an important problem in computer vision
and robotics. Recent works utilize convolutional neural networks (CNNs) to
regress for pixels in a query image their corresponding 3D world coordinates in
the scene. The final pose is then solved via a RANSAC-based optimization scheme
using the predicted coordinates. Usually, the CNN is trained with ground truth
scene coordinates, but it has also been shown that the network can discover 3D
scene geometry automatically by minimizing single-view reprojection loss.
However, due to the deficiencies of the reprojection loss, the network needs to
be carefully initialized. In this paper, we present a new angle-based
reprojection loss, which resolves the issues of the original reprojection loss.
With this new loss function, the network can be trained without careful
initialization, and the system achieves more accurate results. The new loss
also enables us to utilize available multi-view constraints, which further
improve performance.
Comment: ECCV 2018 Workshop (Geometry Meets Deep Learning - …
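The abstract does not spell out the loss, but the deficiency it fixes is easy to demonstrate: the standard pixel reprojection loss can be zero for a scene-coordinate prediction that lies behind the camera, which is why the network needs careful initialization. A plausible sketch of an angle-based alternative measures the angle between the observed viewing ray and the direction to the predicted point, which stays well-defined on the whole space:

```python
import numpy as np

def reprojection_loss(pred_pt, obs_px, K):
    """Standard loss: pixel distance between the projection of the
    predicted 3D point (camera frame) and the observed pixel.
    Misleading when the predicted point lies behind the camera."""
    p = K @ pred_pt
    return np.linalg.norm(p[:2] / p[2] - obs_px)

def angle_loss(pred_pt, obs_px, K):
    """Angle between the observed viewing ray (back-projected pixel)
    and the direction to the predicted point; well-defined for any
    non-zero prediction, in front of or behind the camera."""
    ray = np.linalg.inv(K) @ np.array([obs_px[0], obs_px[1], 1.0])
    ray /= np.linalg.norm(ray)
    d = pred_pt / np.linalg.norm(pred_pt)
    return np.arccos(np.clip(ray @ d, -1.0, 1.0))

K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
obs = np.array([320.0, 240.0])        # pixel on the optical axis
good = np.array([0.0, 0.0, 3.0])      # correct point, on the ray
bad = np.array([0.0, 0.0, -3.0])      # mirror point behind the camera

# The pixel loss is ~0 for BOTH points (the bad one projects to the
# same pixel), while the angle loss is 0 vs. pi.
a_good, a_bad = angle_loss(good, obs, K), angle_loss(bad, obs, K)
```

This is one illustrative formulation, not necessarily the paper's exact loss; the key property shown is that the angular penalty rules out the behind-the-camera ambiguity that makes the plain reprojection loss hard to train from scratch.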