RGB2LIDAR: Towards Solving Large-Scale Cross-Modal Visual Localization
We study an important, yet largely unexplored problem of large-scale
cross-modal visual localization by matching ground RGB images to a
geo-referenced aerial LIDAR 3D point cloud (rendered as depth images). Prior
works were demonstrated on small datasets and did not lend themselves to
scaling up for large-scale applications. To enable large-scale evaluation, we
introduce a new dataset containing over 550K pairs (covering a 143 km^2 area) of
RGB and aerial LIDAR depth images. We propose a novel joint embedding based
method that effectively combines the appearance and semantic cues from both
modalities to handle drastic cross-modal variations. Experiments on the
proposed dataset show that our model achieves a strong result of a median rank
of 5 in matching across a large test set of 50K location pairs collected from a
14 km^2 area. This represents a significant advancement over prior works in
performance and scale. We conclude with qualitative results to highlight the
challenging nature of this task and the benefits of the proposed model. Our
work provides a foundation for further research in cross-modal visual
localization.

Comment: ACM Multimedia 202
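The retrieval setup described above, matching a query embedding against a large geo-referenced gallery and reporting the median rank of the true match, can be sketched as follows. This is a minimal, generic illustration of joint-embedding retrieval evaluation, not the paper's actual model; the function name and the use of cosine similarity are assumptions.

```python
import numpy as np

def median_rank(query_emb, ref_emb):
    """Rank each query's true match among all references by cosine
    similarity and return the median rank (1 = perfect retrieval).
    Assumes a paired dataset: the true match of query i is reference i."""
    # L2-normalize so the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    r = ref_emb / np.linalg.norm(ref_emb, axis=1, keepdims=True)
    sims = q @ r.T                       # (num_queries, num_refs)
    true_sims = np.diag(sims)            # similarity to the true match
    # Rank = 1 + number of references scored strictly higher.
    ranks = 1 + (sims > true_sims[:, None]).sum(axis=1)
    return float(np.median(ranks))

# Toy check: identical query/reference embeddings give a median rank of 1.
emb = np.eye(4)
print(median_rank(emb, emb))  # 1.0
```

In the paper's setting the two embedding sets would come from the RGB branch and the rendered-LIDAR-depth branch of a joint embedding network, evaluated over the 50K-pair test set.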
Sketch Me That Shoe
This project received support from the European Union’s Horizon 2020 research and innovation programme under grant agreement #640891, the Royal Society and Natural Science Foundation of China (NSFC) joint grants #IE141387 and #61511130081, and the China Scholarship Council (CSC). We gratefully acknowledge the support of NVIDIA Corporation for the donation of the GPUs used for this research.
Cross-View Visual Geo-Localization for Outdoor Augmented Reality
Precise estimation of global orientation and location is critical to ensure a
compelling outdoor Augmented Reality (AR) experience. We address the problem of
geo-pose estimation by cross-view matching of query ground images to a
geo-referenced aerial satellite image database. Recently, neural network-based
methods have shown state-of-the-art performance in cross-view matching.
However, most of the prior works focus only on location estimation, ignoring
orientation, which cannot meet the requirements of outdoor AR applications. We
propose a new transformer neural network-based model and a modified triplet
ranking loss for joint location and orientation estimation. Experiments on
several benchmark cross-view geo-localization datasets show that our model
achieves state-of-the-art performance. Furthermore, we present an approach to
extend the single image query-based geo-localization approach by utilizing
temporal information from a navigation pipeline for robust continuous
geo-localization. Experimentation on several large-scale real-world video
sequences demonstrates that our approach enables high-precision and stable AR
insertion.

Comment: IEEE VR 202
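The triplet ranking loss mentioned above can be sketched in its standard margin-based form, pulling the matching aerial view closer to the ground query than any non-matching one by at least a margin. This is the generic formulation only; the paper's specific modification for joint location and orientation is not reproduced here, and the function name and margin value are assumptions.

```python
import numpy as np

def triplet_margin_loss(anchor, positive, negative, margin=0.3):
    """Margin-based triplet ranking loss on L2-normalized embeddings.
    anchor: ground-query embeddings, positive: matching aerial embeddings,
    negative: non-matching aerial embeddings; all of shape (batch, dim)."""
    def norm(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)
    a, p, n = norm(anchor), norm(positive), norm(negative)
    d_pos = np.linalg.norm(a - p, axis=-1)  # distance to the true match
    d_neg = np.linalg.norm(a - n, axis=-1)  # distance to the non-match
    # Penalize triplets where the true match is not closer by `margin`.
    return np.maximum(0.0, d_pos - d_neg + margin).mean()
```

When the positive already sits well inside the margin the loss is zero, so training focuses on hard triplets where the non-matching aerial image still scores too close to the query.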
Beyond Geo-localization: Fine-grained Orientation of Street-view Images by Cross-view Matching with Satellite Imagery
Street-view imagery provides us with novel experiences to explore different
places remotely. Carefully calibrated street-view images (e.g. Google Street
View) can be used for different downstream tasks, e.g. navigation and map
feature extraction. As personal high-quality cameras have become much more affordable
and portable, enormous numbers of crowdsourced street-view images are
uploaded to the internet, but commonly with missing or noisy sensor
information. To prepare this hidden treasure for "ready-to-use" status,
determining missing location information and camera orientation angles are two
equally important tasks. Recent methods have achieved high performance on
geo-localization of street-view images by cross-view matching with a pool of
geo-referenced satellite imagery. However, most of the existing works focus
more on geo-localization than estimating the image orientation. In this work,
we restate the importance of finding fine-grained orientation for street-view
images, formally define the problem and provide a set of evaluation metrics to
assess the quality of the orientation estimation. We propose two methods to
improve the granularity of the orientation estimation, achieving 82.4% and
72.3% accuracy for images with estimated angle errors below 2 degrees on the
CVUSA and CVACT datasets, corresponding to 34.9% and 28.2% absolute improvements
compared to previous works. Integrating fine-grained orientation estimation in
training also improves performance on geo-localization, giving top-1 recall of
95.5%/85.5% and 86.8%/80.4% for orientation-known/unknown tests on the two
datasets.

Comment: This paper has been accepted by ACM Multimedia 2022. The version
contains additional supplementary material.
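One common way to recover a street-view image's azimuth against a polar-transformed satellite descriptor is circular cross-correlation along the horizontal (azimuth) axis, picking the rotation that best aligns the two feature maps. The sketch below illustrates that general idea only; it is not the paper's exact method, and the function name and feature shapes are assumptions.

```python
import numpy as np

def estimate_orientation(ground_feat, sat_feat):
    """Estimate the azimuth offset between a ground-panorama descriptor
    and a polar-transformed satellite descriptor, both of shape
    (height, width), where width spans 360 degrees of azimuth.
    Returns (shift, degrees): rolling sat_feat by `shift` columns
    best aligns it with ground_feat."""
    # FFT along the azimuth axis turns circular correlation into an
    # elementwise product, so all rotations are scored at once.
    G = np.fft.fft(ground_feat, axis=1)
    S = np.fft.fft(sat_feat, axis=1)
    corr = np.fft.ifft(G * np.conj(S), axis=1).real.sum(axis=0)
    shift = int(np.argmax(corr))
    degrees = 360.0 * shift / ground_feat.shape[1]
    return shift, degrees
```

Scoring every rotation at coarse feature resolution and then refining around the best candidate is one natural route to the fine-grained (sub-2-degree) accuracy the abstract reports.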