908 research outputs found

Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions

Author: Hammarstrand Lars
Kahl Fredrik
Maddern Will
Okutomi Masatoshi
Pajdla Tomas
Pollefeys Marc
Safari Daniel
Sattler Torsten
Sivic Josef
Stenborg Erik
Toft Carl
Torii Akihiko
Publication date: 01/01/2018

Visual localization enables autonomous vehicles to navigate in their surroundings and augmented reality applications to link virtual to real worlds. Practical visual localization approaches need to be robust to a wide variety of viewing conditions, including day-night changes, as well as weather and seasonal variations, while providing highly accurate 6 degree-of-freedom (6DOF) camera pose estimates. In this paper, we introduce the first benchmark datasets specifically designed for analyzing the impact of such factors on visual localization. Using carefully created ground truth poses for query images taken under a wide variety of conditions, we evaluate the impact of various factors on 6DOF camera pose estimation accuracy through extensive experiments with state-of-the-art localization approaches. Based on our results, we draw conclusions about the difficulty of different conditions, showing that long-term localization is far from solved, and propose promising avenues for future work, including sequence-based localization approaches and the need for better local features. Our benchmark is available at visuallocalization.net. Comment: Accepted to CVPR 2018 as a spotligh

arXiv.org e-Print Archive

Lund University Publications

Crossref

INRIA a CCSD electronic archive server

Chalmers Research
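
The benchmark abstract above speaks of 6DOF camera pose estimation accuracy. As a minimal sketch, assuming poses are given as a camera centre plus a 3x3 rotation matrix, one common way to score an estimate is the Euclidean distance between centres together with the angle of the relative rotation. The function names and the `rot_z` helper below are hypothetical and are not taken from the benchmark's own tooling.

```python
import math


def position_error(c_est, c_gt):
    """Euclidean distance between estimated and ground-truth camera centres."""
    return math.dist(c_est, c_gt)


def rotation_error_deg(R_est, R_gt):
    """Angle (degrees) of the relative rotation R_gt^T * R_est.

    trace(R_gt^T R_est) equals the element-wise product sum of the two
    matrices, and a rotation by angle a has trace 1 + 2*cos(a).
    """
    trace = sum(R_gt[i][j] * R_est[i][j] for i in range(3) for j in range(3))
    # Clamp to guard against values slightly outside [-1, 1] from rounding.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def rot_z(deg):
    """Hypothetical helper: rotation about the z-axis, used only for the demo."""
    a = math.radians(deg)
    return [
        [math.cos(a), -math.sin(a), 0.0],
        [math.sin(a), math.cos(a), 0.0],
        [0.0, 0.0, 1.0],
    ]


if __name__ == "__main__":
    print(position_error((1.0, 2.0, 3.0), (1.0, 2.0, 5.0)))
    print(rotation_error_deg(rot_z(30.0), rot_z(20.0)))
```

A query would then typically be counted as localized when both errors fall under chosen thresholds; the thresholds themselves are a design choice that this sketch leaves open.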

The simultaneous localization and mapping (SLAM): An overview

Author: Alsadik B.
Karam S.
Publication date: 18/05/2021

Positioning is required by many mapping and navigation applications in both civilian and military domains. Significant developments in satellite-based techniques, sensors, telecommunications, computer hardware and software, image processing, etc. have made it possible to solve the positioning problem efficiently and instantaneously. These developments have in turn empowered the applications and advancement of autonomous navigation. One of the most interesting positioning techniques is what is known in robotics as Simultaneous Localization and Mapping (SLAM). Solutions to the SLAM problem have improved quickly in recent decades, using either active sensors such as RAdio Detection And Ranging (Radar) and Light Detection and Ranging (LiDAR) or passive sensors such as cameras. Positioning and mapping is one of the main tasks of Geomatics engineers, so it is important for them to understand SLAM. This is not easy because of the large volume of documentation and algorithms available and the variety of SLAM solutions in terms of mathematical models, complexity, sensors used, and types of application. In this paper, we give a clear and simplified explanation of SLAM from a Geomatics viewpoint, avoiding the complicated algorithmic details behind the presented techniques. We present a general overview of SLAM that shows the relationship between its components and stages, such as the front-end and back-end, and their relation to the SLAM paradigm. Furthermore, we explain the major mathematical techniques of filtering and pose graph optimization for both visual and LiDAR SLAM and summarize the contribution of deep learning to the SLAM problem. Finally, we give examples of some existing practical applications of SLAM in the real world

University of Twente Research Information

Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image

Author: Karaman Sertac
Ma Fangchang
Publication date: 25/02/2018

We consider the problem of dense depth prediction from a sparse set of depth measurements and a single RGB image. Since depth estimation from monocular images alone is inherently ambiguous and unreliable, to attain a higher level of robustness and accuracy, we introduce additional sparse depth samples, which are either acquired with a low-resolution depth sensor or computed via visual Simultaneous Localization and Mapping (SLAM) algorithms. We propose the use of a single deep regression network to learn directly from the RGB-D raw data, and explore the impact of the number of depth samples on prediction accuracy. Our experiments show that, compared to using only RGB images, the addition of 100 spatially random depth samples reduces the prediction root-mean-square error by 50% on the NYU-Depth-v2 indoor dataset. It also boosts the percentage of reliable predictions from 59% to 92% on the KITTI dataset. We demonstrate two applications of the proposed algorithm: a plug-in module in SLAM to convert sparse maps to dense maps, and super-resolution for LiDARs. Software and a video demonstration are publicly available. Comment: Accepted to ICRA 2018. 8 pages, 8 figures, 3 tables. Video at https://www.youtube.com/watch?v=vNIIT_M7x7Y. Code at https://github.com/fangchangma/sparse-to-dens

arXiv.org e-Print Archive

Crossref

DSpace@MIT
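
The Sparse-to-Dense abstract above reports its results as a root-mean-square error and a percentage of reliable predictions. A minimal sketch of how such numbers can be computed over flat lists of depth values follows. The ratio-threshold definition of a "reliable" prediction (within a factor of 1.25 of the ground truth) is an assumption borrowed from common depth-estimation practice, since the abstract does not define the term, and all function names are hypothetical.

```python
import math


def rmse(pred, gt):
    """Root-mean-square error between predicted and ground-truth depths."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(gt))


def relative_reduction(baseline_error, new_error):
    """Fraction by which new_error is lower than baseline_error."""
    return (baseline_error - new_error) / baseline_error


def threshold_accuracy(pred, gt, ratio=1.25):
    """Share of predictions whose ratio to ground truth stays below `ratio`.

    Assumption: this stands in for 'reliable prediction'; the abstract does
    not state which definition the authors use.
    """
    hits = sum(1 for p, g in zip(pred, gt) if max(p / g, g / p) < ratio)
    return hits / len(gt)


if __name__ == "__main__":
    predicted = [1.0, 2.0, 4.0]      # made-up depths in metres
    ground_truth = [1.1, 2.0, 8.0]   # made-up depths in metres
    print(rmse(predicted, ground_truth))
    print(threshold_accuracy(predicted, ground_truth))
    print(relative_reduction(2.0, 1.0))
```

The depth values in the demo are invented purely to exercise the functions and do not reproduce any figure from the paper.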

Camera Pose Estimation from Street-view Snapshots and Point Clouds

Author: Fu Junsheng
Publication venue: Tampere University
Publication date: 03/06/2022

This PhD thesis targets two research problems: (1) how to efficiently and robustly estimate the camera pose of a query image with a map that contains street-view snapshots and point clouds; (2) given the estimated camera pose of a query image, how to create meaningful and intuitive applications with the map data. To address the first research problem, we systematically investigated indirect, direct and hybrid camera pose estimation strategies. We implemented state-of-the-art methods and performed comprehensive experiments on two public benchmark datasets, considering outdoor environmental changes from ideal to extremely challenging cases. Our key findings are: (1) the indirect method is usually more accurate than the direct method when there are enough consistent feature correspondences; (2) the direct method is sensitive to initialization, but under extreme outdoor environmental changes, the mutual-information-based direct method is more robust than the feature-based methods; (3) the hybrid method combines the strengths of both the direct and indirect methods and outperforms them on challenging datasets. To explore the second research problem, we considered inspiring and useful applications that exploit the camera pose together with the map data. Firstly, we invented a 3D-map-augmented photo gallery application, where the images’ geo-metadata are extracted with an indirect camera pose estimation method and the photo-sharing experience is improved by augmentation with a 3D map. Secondly, we designed an interactive video playback application, where an indirect method estimates the camera pose of the video frames and the video playback is augmented with a 3D map. Thirdly, we proposed a method for indoor object and outdoor scene recognition based on 3D visual primitives, where the 3D primitives are accumulated from the multiview images

Trepo - Institutional Repository of Tampere University

Semantic Visual Localization

Author: Geiger Andreas
Pollefeys Marc
Sattler Torsten
Schönberger Johannes L.
Publication date: 01/01/2018

Robust visual localization under a wide range of viewing conditions is a fundamental problem in computer vision. Handling the difficult cases of this problem is not only very challenging but also of high practical relevance, e.g., in the context of life-long localization for augmented reality or autonomous robots. In this paper, we propose a novel approach based on a joint 3D geometric and semantic understanding of the world, enabling it to succeed under conditions where previous approaches failed. Our method leverages a novel generative model for descriptor learning, trained on semantic scene completion as an auxiliary task. The resulting 3D descriptors are robust to missing observations by encoding high-level 3D geometric and semantic information. Experiments on several challenging large-scale localization datasets demonstrate reliable localization under extreme viewpoint, illumination, and geometry changes

arXiv.org e-Print Archive

MPG.PuRe

Understanding the Limitations of CNN-based Absolute Camera Pose Regression

Author: Leal-Taixe Laura
Pollefeys Marc
Sattler Torsten
Zhou Qunjie
Publication date: 01/01/2019

Visual localization is the task of accurate camera pose estimation in a known scene. It is a key problem in computer vision and robotics, with applications including self-driving cars, Structure-from-Motion, SLAM, and Mixed Reality. Traditionally, the localization problem has been tackled using 3D geometry. Recently, end-to-end approaches based on convolutional neural networks have become popular. These methods learn to directly regress the camera pose from an input image. However, they do not achieve the same level of pose accuracy as 3D structure-based methods. To understand this behavior, we develop a theoretical model for camera pose regression. We use our model to predict failure cases for pose regression techniques and verify our predictions through experiments. We furthermore use our model to show that pose regression is more closely related to pose approximation via image retrieval than to accurate pose estimation via 3D structure. A key result is that current approaches do not consistently outperform a handcrafted image retrieval baseline. This clearly shows that additional research is needed before pose regression algorithms are ready to compete with structure-based methods. Comment: Initial version of a paper accepted to CVPR 201

arXiv.org e-Print Archive

Crossref

Chalmers Research
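
The abstract above contrasts pose regression with pose approximation via image retrieval. As a rough illustration of the retrieval idea only, and not of the paper's actual baseline, the sketch below returns the known pose of the database image whose descriptor lies closest to the query descriptor. The descriptors, pose labels, and the `nearest_pose` function are all hypothetical.

```python
import math


def nearest_pose(query_descriptor, database):
    """Approximate the query pose by the pose of the most similar database image.

    `database` is a list of (descriptor, pose) pairs; similarity is plain
    Euclidean distance between descriptors (an assumption for this sketch).
    """
    _, best_pose = min(
        database, key=lambda entry: math.dist(query_descriptor, entry[0])
    )
    return best_pose


if __name__ == "__main__":
    # Toy 2-D descriptors paired with made-up pose labels.
    database = [
        ([0.0, 0.0], "pose_A"),
        ([1.0, 1.0], "pose_B"),
        ([5.0, 0.0], "pose_C"),
    ]
    print(nearest_pose([0.9, 1.2], database))
```

Because the returned pose is always one of the stored database poses, its accuracy is bounded by how densely the database covers the scene, which is why retrieval yields an approximation of the pose instead of an accurate estimate.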