Cross-View Image Matching for Geo-localization in Urban Environments
In this paper, we address the problem of cross-view image geo-localization.
Specifically, we aim to estimate the GPS location of a query street view image
by finding the matching images in a reference database of geo-tagged bird's eye
view images, or vice versa. To this end, we present a new framework for
cross-view image geo-localization by taking advantage of the tremendous success
of deep convolutional neural networks (CNNs) in image classification and object
detection. First, we employ the Faster R-CNN to detect buildings in the query
and reference images. Next, for each building in the query image, we retrieve
the nearest neighbors from the reference buildings using a Siamese network
trained on both positive matching image pairs and negative pairs. To find the
correct NN for each query building, we develop an efficient multiple nearest
neighbors matching method based on dominant sets. We evaluate the proposed
framework on a new dataset that consists of pairs of street view and bird's eye
view images. Experimental results show that the proposed method achieves better
geo-localization accuracy than other approaches and is able to generalize to
images at unseen locations.
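The retrieval step described in this abstract (finding, for each query building, its nearest neighbors among the reference buildings) can be illustrated with a minimal sketch. It assumes the Siamese network has already produced one embedding vector per building; the cosine-similarity ranking, the function names, and the toy embeddings are illustrative assumptions, not the paper's implementation, and the dominant-sets matching stage is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_neighbors(query, references, k=2):
    """Return the ids of the k reference embeddings most similar to the query.

    `references` maps a (hypothetical) building id to its embedding.
    """
    scored = [(cosine_similarity(query, emb), ref_id)
              for ref_id, emb in references.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [ref_id for _, ref_id in scored[:k]]


# Toy embeddings standing in for Siamese-network outputs (assumed values).
references = {
    "bldg_a": [1.0, 0.0, 0.0],
    "bldg_b": [0.9, 0.1, 0.0],
    "bldg_c": [0.0, 1.0, 0.0],
}
query = [1.0, 0.05, 0.0]
neighbors = top_k_neighbors(query, references, k=2)
print(neighbors)
```

In a full pipeline, the candidate lists returned for every building in the query image would then be passed to a joint matching step that selects one consistent neighbor per building.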
Coupling ground-level panoramas and aerial imagery for change detection
Geographic landscapes all over the world may be subject to rapid changes induced, for instance, by urban, forest, and agricultural evolution. Such changes are usually monitored through remote sensing. However, obtaining regular and up-to-date aerial or satellite images is costly, which prevents regular updating of land cover maps. As an alternative, in this paper we propose a low-cost solution based on ground-level, geo-located panoramic landscape photos, which provide high-spatial-resolution information about the scene. Such photos can be acquired from various sources: digital cameras, smartphones, or even web repositories. Furthermore, since the acquisition is performed at ground level, the users' immediate surroundings, as sensed by a camera device, can provide information at a very high level of precision, making it possible to update the land cover type of the geographic area. In the method described herein, we propose to use inverse perspective mapping (inverse warping) to transform the geo-tagged ground-level 360° photo into a top-down view, as if it had been acquired from a nadir aerial viewpoint. Once re-projected, the warped photo is compared to a previously acquired remotely sensed image using standard techniques such as correlation. Large differences in orientation, resolution, and geographical extent between the top-down view and the aerial image are addressed through specific processing steps (e.g., registration). Experiments on publicly available datasets made of both ground-level photos and aerial images show promising results for updating land cover maps with mobile technologies. Finally, the proposed approach contributes to crowdsourcing efforts in geo-information processing and mapping, providing hints on the evolution of a landscape.
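The inverse perspective mapping mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the panorama is assumed to be equirectangular with its horizontal centre facing north and its horizon at mid-height, the ground is assumed flat, and the function names, camera height, and nearest-neighbor sampling are all hypothetical choices.

```python
import math


def ground_to_pano_pixel(x_east, y_north, cam_height, width, height):
    """Map a flat-ground point (metres from the camera) to panorama (u, v)."""
    azimuth = math.atan2(x_east, y_north)           # 0 = north, positive toward east
    distance = math.hypot(x_east, y_north)
    depression = math.atan2(cam_height, distance)   # angle below the horizon
    u = (azimuth / (2.0 * math.pi) + 0.5) * width   # image centre faces north
    v = (0.5 + depression / math.pi) * height       # horizon sits at height / 2
    return u % width, min(v, height - 1)


def warp_to_top_down(pano, cam_height, out_size, metres_per_pixel):
    """Build an out_size x out_size top-down view by sampling the panorama."""
    height, width = len(pano), len(pano[0])
    half = out_size / 2.0
    top_down = []
    for row in range(out_size):
        y_north = (half - row - 0.5) * metres_per_pixel   # row 0 is the north edge
        out_row = []
        for col in range(out_size):
            x_east = (col + 0.5 - half) * metres_per_pixel
            u, v = ground_to_pano_pixel(x_east, y_north, cam_height, width, height)
            out_row.append(pano[int(v)][int(u) % width])  # nearest-neighbor sample
        top_down.append(out_row)
    return top_down


# Toy panorama (assumed data): each pixel stores its own column index.
pano = [[col for col in range(8)] for _ in range(4)]
top_down = warp_to_top_down(pano, cam_height=1.6, out_size=4, metres_per_pixel=1.0)
```

The resulting top-down grid would then need registration against the aerial image (orientation, scale, extent) before any correlation-based comparison, as the abstract notes.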
Wide-Area Geolocalization with a Limited Field of View Camera in Challenging Urban Environments
Cross-view geolocalization, a supplement or replacement for GPS, localizes an
agent within a search area by matching ground-view images to overhead images.
Significant progress has been made assuming a panoramic ground camera.
Panoramic cameras' high complexity and cost make non-panoramic cameras more
widely applicable, but also more challenging since they yield less scene
overlap between ground and overhead images. This paper presents Restricted FOV
Wide-Area Geolocalization (ReWAG), a cross-view geolocalization approach that
combines a neural network and particle filter to globally localize a mobile
agent with only odometry and a non-panoramic camera. ReWAG creates pose-aware
embeddings and provides a strategy to incorporate particle pose into the
Siamese network, improving localization accuracy by a factor of 100 compared to
a vision transformer baseline. This extended work also presents ReWAG*, which
improves upon ReWAG's generalization ability in previously unseen environments.
ReWAG* repeatedly converges accurately on a dataset of images we have collected
in Boston with a 72 degree field of view (FOV) camera, a location and FOV that
ReWAG* was not trained on.
Comment: 10 pages, 16 figures. Extension of ICRA 2023 paper arXiv:2209.1185
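The combination of a learned similarity score with a particle filter, as described in this abstract, can be sketched in a few lines. This is a generic particle filter under simplifying assumptions and not ReWAG itself: particles carry only a 2D position (no heading), and the `similarity` callback is a hypothetical stand-in for comparing a ground-image embedding with the overhead embedding at a particle's pose. The demo replaces the network with a Gaussian score centred on the true position.

```python
import math
import random


def particle_filter_step(particles, odometry, similarity, rng, motion_noise=0.5):
    """One predict/weight/resample cycle over a list of (x, y) particles."""
    dx, dy = odometry
    # Predict: apply the odometry reading plus Gaussian motion noise.
    moved = [(x + dx + rng.gauss(0.0, motion_noise),
              y + dy + rng.gauss(0.0, motion_noise)) for x, y in particles]
    # Weight: score each particle with the (assumed) image-similarity function.
    weights = [similarity(p) for p in moved]
    if sum(weights) == 0.0:
        return moved  # no usable measurement, keep the predicted particles
    # Resample: draw particles in proportion to their weights.
    return rng.choices(moved, weights=weights, k=len(moved))


def run_demo(seed=0, steps=20, n_particles=500):
    """Track an agent moving east; return the final position error in metres."""
    rng = random.Random(seed)
    true_pos = (10.0, -5.0)
    particles = [(rng.uniform(-50.0, 50.0), rng.uniform(-50.0, 50.0))
                 for _ in range(n_particles)]
    for _ in range(steps):
        true_pos = (true_pos[0] + 1.0, true_pos[1])

        def similarity(p, target=true_pos):
            # Toy score standing in for an embedding comparison (assumption).
            d2 = (p[0] - target[0]) ** 2 + (p[1] - target[1]) ** 2
            return math.exp(-d2 / (2.0 * 5.0 ** 2))

        particles = particle_filter_step(particles, (1.0, 0.0), similarity, rng)
    mean_x = sum(p[0] for p in particles) / len(particles)
    mean_y = sum(p[1] for p in particles) / len(particles)
    return math.hypot(mean_x - true_pos[0], mean_y - true_pos[1])


error = run_demo()
```

Starting from particles spread uniformly over the search area, repeated steps concentrate them around poses whose predicted overhead view agrees with the camera, which is the sense in which the filter "converges".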
Image-based Geolocalization by Ground-to-2.5D Map Matching
We study the image-based geolocalization problem, aiming to localize
ground-view query images on cartographic maps. Current methods often utilize
cross-view localization techniques to match ground-view query images with 2D
maps. However, the performance of these methods is unsatisfactory due to
significant cross-view appearance differences. In this paper, we lift
cross-view matching to a 2.5D space, where heights of structures (e.g., trees
and buildings) provide geometric information to guide the cross-view matching.
We propose a new approach to learning representative embeddings from
multi-modal data. Specifically, we establish a projection relationship between
2.5D space and 2D aerial-view space. The projection is further used to combine
multi-modal features from the 2.5D and 2D maps using an effective
pixel-to-point fusion method. By encoding crucial geometric cues, our method
learns discriminative location embeddings for matching panoramic images and
maps. Additionally, we construct the first large-scale ground-to-2.5D map
geolocalization dataset to validate our method and facilitate future research.
Both single-image based and route based localization experiments are conducted
to test our method. Extensive experiments demonstrate that the proposed method
achieves significantly higher localization accuracy and faster convergence than
previous 2D map-based approaches.
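One way to picture the projection between 2.5D space and the 2D map, and the fusion of features across it, is the sketch below. It is an assumption-laden illustration rather than the paper's method: points are taken to be (x, y, height) triples, the projection is a plain orthographic drop onto a north-up raster whose origin is its top-left corner, and fusion is simple concatenation of each point's feature with the feature of the pixel it lands on. All names and values are hypothetical.

```python
import math


def project_to_pixel(x, y, origin, metres_per_pixel):
    """Orthographically project a map coordinate to (row, col) in the 2D raster.

    `origin` is the (x_min, y_max) map coordinate of the raster's top-left corner.
    """
    col = math.floor((x - origin[0]) / metres_per_pixel)
    row = math.floor((origin[1] - y) / metres_per_pixel)
    return row, col


def fuse_pixel_to_point(points, point_feats, map_feats, origin, metres_per_pixel):
    """Concatenate each point's feature with the 2D-map feature beneath it."""
    n_rows, n_cols = len(map_feats), len(map_feats[0])
    pixel_dim = len(map_feats[0][0])
    fused = []
    for (x, y, _height), feat in zip(points, point_feats):
        row, col = project_to_pixel(x, y, origin, metres_per_pixel)
        if 0 <= row < n_rows and 0 <= col < n_cols:
            pixel_feat = map_feats[row][col]
        else:
            pixel_feat = [0.0] * pixel_dim  # point falls outside the raster
        fused.append(list(feat) + list(pixel_feat))
    return fused


# Toy inputs (assumed): a 2x2 raster with 1-D features and three 2.5D points.
map_feats = [[[1.0], [2.0]],
             [[3.0], [4.0]]]
points = [(1.5, 1.5, 10.0), (0.5, 0.5, 3.0), (5.0, 5.0, 1.0)]
point_feats = [[0.1], [0.2], [0.3]]
fused = fuse_pixel_to_point(points, point_feats, map_feats,
                            origin=(0.0, 2.0), metres_per_pixel=1.0)
print(fused)
```

In this simplified form the height only travels with the point's own feature; a learned model would typically encode it in the point features before fusion.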