Leveraging 3D City Models for Rotation Invariant Place-of-Interest Recognition
Given a cell phone image of a building, we address the problem of place-of-interest recognition in urban scenarios. We go beyond earlier approaches by exploiting the nowadays often available 3D building information (e.g. from extruded floor plans) together with massive street-level image data for database creation. Exploiting vanishing points in query images to fully remove 3D rotation from the recognition problem allows us to reduce the required feature invariance to a purely homothetic problem, which we show enables more discriminative power in feature descriptors than classical SIFT. We rerank visual-word-based document queries using a fast stratified homothetic verification that in most cases boosts the correct document to a top position if it was in the short list. Since we exploit 3D building information, the approach finally outputs the camera pose in real-world coordinates, ready for augmenting the cell phone image with virtual 3D information. The whole system is demonstrated to outperform traditional approaches in city-scale experiments on different sources of street-level image data and a challenging set of cell phone images.
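To make the rotation-removal step concrete, below is a minimal sketch (not the authors' implementation) of how two orthogonal facade vanishing points, together with a known camera calibration K, yield a rectifying homography; after warping, matching against a fronto-parallel database view is indeed homothetic (scale and translation only). The function names and the orthogonalization details are illustrative assumptions.

```python
import numpy as np

def facade_rotation(K, vp_x, vp_y):
    """Recover the camera-to-facade rotation from the pixel positions of
    two orthogonal vanishing points (hypothetical helper)."""
    Kinv = np.linalg.inv(K)
    dx = Kinv @ np.array([vp_x[0], vp_x[1], 1.0])  # horizontal facade axis
    dy = Kinv @ np.array([vp_y[0], vp_y[1], 1.0])  # vertical facade axis
    dx /= np.linalg.norm(dx)
    dy -= dx * (dx @ dy)                           # enforce orthogonality
    dy /= np.linalg.norm(dy)
    dz = np.cross(dx, dy)                          # facade normal
    return np.stack([dx, dy, dz], axis=1)

def rectifying_homography(K, R):
    # Warping the query with H = K R^T K^{-1} simulates a camera whose
    # axes align with the facade, removing the 3D rotation entirely.
    return K @ R.T @ np.linalg.inv(K)
```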
Find your Way by Observing the Sun and Other Semantic Cues
In this paper we present a robust, efficient, and affordable approach to self-localization which requires neither GPS nor knowledge about the appearance of the world. Towards this goal, we utilize freely available cartographic maps and derive a probabilistic model that exploits semantic cues in the form of sun direction, the presence of an intersection, road type, and speed limit, as well as the ego-car trajectory, in order to produce very reliable localization results. Our experimental evaluation shows that our approach localizes much faster (in terms of driving time), with less computation, and more robustly than competing approaches that ignore semantic information.
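As a rough illustration of how such independent semantic cues can be fused, here is a minimal discrete Bayes-update sketch under an (assumed) conditional independence of cues given the pose; the cue names and the likelihood interface are placeholders, not the paper's exact model.

```python
import numpy as np

def semantic_update(belief, poses, observations, likelihoods):
    """belief: (N,) prior over N candidate map poses.
    observations: dict cue_name -> observed value (e.g. sun azimuth).
    likelihoods: dict cue_name -> f(pose, value) -> probability."""
    posterior = belief.copy()
    for cue, value in observations.items():
        for i, pose in enumerate(poses):
            posterior[i] *= likelihoods[cue](pose, value)  # independent cues multiply
    total = posterior.sum()
    return posterior / total if total > 0 else np.full(len(poses), 1.0 / len(poses))
```

Each driving step would interleave such an update with a motion step driven by the ego-car trajectory, so that individually ambiguous cues (road type, speed limit) still prune the pose hypotheses over time.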
Automatic Registration of RGBD Scans via Salient Directions
We address the problem of wide-baseline registration of RGB-D data, such as photo-textured laser scans, without any artificial targets or prior assumptions about the relative motion. Our approach makes it possible to fully automatically register scans taken in GPS-denied environments such as urban canyons, industrial facilities, or even indoors. We build upon image features, which are plentiful, well localized, and much more discriminative than geometry features; however, they suffer from viewpoint distortions and require normalization. We utilize the principle of salient directions present in the geometry and propose to extract (several) directions from the distribution of surface normals or from other cues such as observable symmetries. Compared to previous work, we impose no requirements on the scanned scene (such as containing large textured planes) and can handle arbitrary surface shapes. Rendering the whole scene from these repeatable directions using an orthographic camera generates textures that are identical up to 2D similarity transformations. This ambiguity is naturally handled by 2D features and allows us to find stable correspondences among scans. For geometric pose estimation from tentative matches, we propose a fast and robust 2-point sample consensus scheme that integrates an early rejection phase. We evaluate our approach on several challenging real-world scenes.
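The 2-point scheme is natural because a 2D similarity has four degrees of freedom, which two point correspondences fix exactly. Below is a generic RANSAC sketch in that spirit; the complex-number parameterization and the scale bounds used for early rejection are illustrative assumptions, not the paper's exact test.

```python
import numpy as np

def ransac_similarity(p, q, iters=500, thresh=3.0, seed=0):
    """Estimate a 2D similarity q ~ a*p + b (a, b complex; a = s*e^{i*theta})
    from tentative matches p, q: (N, 2) arrays of 2D points."""
    rng = np.random.default_rng(seed)
    pz = p[:, 0] + 1j * p[:, 1]
    qz = q[:, 0] + 1j * q[:, 1]
    best, best_inliers = None, 0
    for _ in range(iters):
        i, j = rng.choice(len(p), size=2, replace=False)
        if pz[i] == pz[j]:
            continue
        a = (qz[j] - qz[i]) / (pz[j] - pz[i])   # rotation + scale
        if not 0.5 < abs(a) < 2.0:              # early rejection: implausible scale
            continue
        b = qz[i] - a * pz[i]                   # translation
        inliers = int(np.sum(np.abs(a * pz + b - qz) < thresh))
        if inliers > best_inliers:
            best, best_inliers = (a, b), inliers
    return best, best_inliers
```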
Real-World Normal Map Capture for Nearly Flat Reflective Surfaces
Although specular objects have gained interest in recent years, virtually no approaches exist for markerless reconstruction of reflective scenes in the wild. In this work, we present a practical approach to capturing normal maps in real-world scenes using video only. We focus on nearly planar surfaces such as windows, facades made of glass or metal, and frames, screens, and other indoor objects, and show how normal maps of these can be obtained without an artificial calibration object. Rather, we track the reflections of real-world straight lines while moving with a hand-held or vehicle-mounted camera in front of the object. In contrast to error-prone local edge tracking, we obtain the reflections by a robust, global segmentation technique applied to an ortho-rectified 3D video cube, which also naturally allows efficient user interaction. Then, at each point of the reflective surface, the resulting 2D-curve to 3D-line correspondence provides a novel quadratic constraint on the local surface normal. This allows the shape to be solved for globally using integrability and smoothness constraints, and it easily supports the use of multiple lines. We demonstrate the technique on several objects and facades.
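For the final step, a standard way to recover a surface from per-point normal constraints is to integrate the implied gradient field while enforcing integrability. The sketch below uses the classic Frankot-Chellappa Fourier-domain projection as a generic stand-in for the paper's integrability-and-smoothness solve; it is not the authors' formulation.

```python
import numpy as np

def integrate_normals(nx, ny, nz):
    """Integrate a normal map (H x W arrays) into a height field by
    projecting the gradient field onto integrable solutions in Fourier space."""
    nz = np.clip(nz, 1e-6, None)
    p, q = -nx / nz, -ny / nz                  # dz/dx and dz/dy
    h, w = p.shape
    u, v = np.meshgrid(np.fft.fftfreq(w) * 2 * np.pi,
                       np.fft.fftfreq(h) * 2 * np.pi)
    P, Q = np.fft.fft2(p), np.fft.fft2(q)
    denom = u ** 2 + v ** 2
    denom[0, 0] = 1.0                          # avoid division by zero at DC
    Z = (-1j * u * P - 1j * v * Q) / denom
    Z[0, 0] = 0.0                              # height is recovered up to an offset
    return np.real(np.fft.ifft2(Z))
```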
Exploiting Sparse Semantic HD Maps for Self-Driving Vehicle Localization
In this paper we propose a novel semantic localization algorithm that exploits multiple sensors and has precision on the order of a few centimeters. Our approach does not require detailed knowledge about the appearance of the world, and our maps require orders of magnitude less storage than the maps utilized by traditional geometry- and LiDAR-intensity-based localizers. This is important, as self-driving cars need to operate in large environments. Towards this goal, we formulate the problem in a Bayesian filtering framework and exploit lanes, traffic signs, and vehicle dynamics to localize robustly with respect to a sparse semantic map. We validate the effectiveness of our method on a new highway dataset consisting of 312 km of roads. Our experiments show that the proposed approach achieves 0.05 m lateral and 1.12 m longitudinal accuracy on average while taking up only 0.3% of the storage required by previous LiDAR-intensity-based approaches.
Comment: 8 pages, 4 figures, 4 tables; 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2019).
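As a rough illustration of the Bayesian filtering formulation, here is a minimal histogram-filter sketch over a grid of (longitudinal, lateral) offsets along a mapped lane, with a dynamics-driven predict step and a lane-observation update; the grid layout and noise values are illustrative assumptions, not the paper's model.

```python
import numpy as np

def predict(belief, shift_cells, blur_steps=1):
    """Advance the (L, W) belief grid along the lane by the odometry-derived
    cell shift, then diffuse to account for motion noise."""
    b = np.roll(belief, shift_cells, axis=0)
    for _ in range(blur_steps):
        b = 0.5 * b + 0.25 * (np.roll(b, 1, axis=0) + np.roll(b, -1, axis=0))
    return b / b.sum()

def update_lateral(belief, lat_grid, observed_offset, sigma=0.05):
    """Weight each lateral cell by a Gaussian likelihood of the observed
    lane offset (meters); lat_grid is the (W,) lateral coordinate axis."""
    like = np.exp(-0.5 * ((lat_grid - observed_offset) / sigma) ** 2)
    b = belief * like[None, :]
    return b / b.sum()
```

A traffic-sign detection would enter analogously as a likelihood over the longitudinal axis, which is what pins down the along-road position.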
RGB2LIDAR: Towards Solving Large-Scale Cross-Modal Visual Localization
We study an important, yet largely unexplored, problem of large-scale cross-modal visual localization: matching ground RGB images to a geo-referenced aerial LIDAR 3D point cloud (rendered as depth images). Prior works were demonstrated on small datasets and did not lend themselves to scaling up for large-scale applications. To enable large-scale evaluation, we introduce a new dataset containing over 550K pairs (covering a 143 km^2 area) of RGB and aerial LIDAR depth images. We propose a novel joint-embedding-based method that effectively combines appearance and semantic cues from both modalities to handle drastic cross-modal variations. Experiments on the proposed dataset show that our model achieves a strong result of a median rank of 5 in matching across a large test set of 50K location pairs collected from a 14 km^2 area. This represents a significant advancement over prior works in performance and scale. We conclude with qualitative results that highlight the challenging nature of this task and the benefits of the proposed model. Our work provides a foundation for further research in cross-modal visual localization.
Comment: ACM Multimedia 2020.
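To illustrate the joint-embedding idea, here is a minimal two-branch sketch in PyTorch that maps RGB images and LIDAR depth renderings into a shared metric space with a triplet objective; the tiny backbones, embedding size, and margin are placeholders, and the paper's semantic-cue fusion is omitted.

```python
import torch.nn as nn
import torch.nn.functional as F

class Branch(nn.Module):
    """One modality-specific encoder producing an L2-normalized embedding."""
    def __init__(self, in_ch, dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, dim))

    def forward(self, x):
        return F.normalize(self.net(x), dim=1)

rgb_enc, depth_enc = Branch(in_ch=3), Branch(in_ch=1)

def triplet_loss(rgb, depth_pos, depth_neg, margin=0.3):
    """Pull the matching depth render toward the RGB embedding, push a
    non-matching one away; retrieval is then nearest-neighbor search."""
    a, p, n = rgb_enc(rgb), depth_enc(depth_pos), depth_enc(depth_neg)
    return F.relu((a - p).norm(dim=1) - (a - n).norm(dim=1) + margin).mean()
```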
MAV Urban Localization from Google Street View Data
We tackle the problem of globally localizing a camera-equipped micro aerial vehicle flying in urban environments for which a Google Street View image database exists. To avoid the pitfalls of current image-search algorithms under severe viewpoint changes between the query and database images, we propose to generate virtual views of the scene that exploit the air-ground geometry of the system. To limit the computational complexity of the algorithm, we rely on a histogram-voting scheme to select the best putative image correspondences. The proposed approach is tested on a 2 km image dataset captured with a small quadrocopter flying in the streets of Zurich. The success of our approach shows that our new air-ground matching algorithm can robustly handle extreme changes in viewpoint, illumination, perceptual aliasing, and over-season variations, thus outperforming conventional visual place-recognition approaches.
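The histogram-voting step can be pictured as follows: each putative match votes for the global transformation it implies (here, the feature-orientation difference), and only matches consistent with the dominant vote survive. The binning and keypoint layout below are illustrative assumptions.

```python
import numpy as np

def histogram_vote(kp_query, kp_db, matches, n_bins=36):
    """kp_*: (N, 4) arrays with columns (x, y, scale, angle_deg);
    matches: (M, 2) integer index pairs. Keep only matches that fall in
    the dominant orientation-difference bin."""
    dtheta = (kp_query[matches[:, 0], 3] - kp_db[matches[:, 1], 3]) % 360.0
    hist, edges = np.histogram(dtheta, bins=n_bins, range=(0.0, 360.0))
    peak = int(np.argmax(hist))
    keep = (dtheta >= edges[peak]) & (dtheta < edges[peak + 1])
    return matches[keep]
```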