Is Geometry Enough for Matching in Visual Localization?
In this paper, we propose to go beyond the well-established approach to
vision-based localization that relies on visual descriptor matching between a
query image and a 3D point cloud. While matching keypoints via visual
descriptors makes localization highly accurate, it has significant storage
demands, raises privacy concerns, and requires updating the descriptors in the
long term. To elegantly address these practical challenges for large-scale
localization, we present GoMatch, an alternative to visual-based matching that
solely relies on geometric information for matching image keypoints to maps,
represented as sets of bearing vectors. Our novel bearing-vector
representation of 3D points significantly relieves the cross-modal challenge
in geometry-based matching that prevented prior work from tackling localization
in realistic environments. With additional careful architecture design, GoMatch
improves over prior geometric-based matching work with a reduction of
(10.67m, 95.7deg) and (1.43m, 34.7deg) in average median pose errors on
Cambridge Landmarks and 7-Scenes, while requiring as little as 1.5/1.7% of the
storage capacity of the best visual-based matching methods. This
confirms its potential and feasibility for real-world localization and opens
the door to future efforts in advancing city-scale visual localization methods
that do not require storing visual descriptors. Comment: ECCV 2022 Camera Ready
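The abstract does not spell out how bearing vectors are computed; below is a minimal sketch of the general idea, assuming a pinhole camera with intrinsics `K`. The function names and the use of NumPy are illustrative assumptions, not GoMatch's actual implementation.

```python
import numpy as np

def keypoint_bearings(keypoints, K):
    """Back-project 2D pixel keypoints to unit bearing vectors in the
    camera frame using the 3x3 intrinsic matrix K (hypothetical helper)."""
    pts_h = np.hstack([keypoints, np.ones((len(keypoints), 1))])  # homogeneous pixels
    rays = pts_h @ np.linalg.inv(K).T                             # rays in camera frame
    return rays / np.linalg.norm(rays, axis=1, keepdims=True)     # normalize to unit length

def point_bearings(points_3d, cam_center):
    """Represent 3D map points as unit direction vectors from a reference
    camera center -- note that no visual descriptors are stored."""
    dirs = points_3d - cam_center
    return dirs / np.linalg.norm(dirs, axis=1, keepdims=True)
```

Because both the query keypoints and the map points reduce to unit vectors, matching can operate on geometry alone, which is the source of the storage and privacy benefits the paper claims.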
You are here! Finding position and orientation on a 2D map from a single image: The Flatlandia localization problem and dataset
We introduce Flatlandia, a novel problem for visual localization of an image
from object detections composed of two specific tasks: i) Coarse Map
Localization: localizing a single image observing a set of objects with
respect to a 2D map of object landmarks; ii) Fine-grained 3DoF Localization:
estimating latitude, longitude, and orientation of the image within a 2D map.
Solutions for these new tasks exploit the wide availability of open urban maps
annotated with GPS locations of common objects (e.g., via surveying or
crowd-sourcing). Such
maps are also more storage-friendly than standard large-scale 3D models often
used in visual localization while additionally being privacy-preserving. As
existing datasets are unsuited for the proposed problem, we provide the
Flatlandia dataset, designed for 3DoF visual localization in multiple urban
settings and based on crowd-sourced data from five European cities. We use the
Flatlandia dataset to validate the complexity of the proposed tasks.
The Need for Inherently Privacy-Preserving Vision in Trustworthy Autonomous Systems
Vision is a popular and effective sensor for robotics from which we can
derive rich information about the environment: the geometry and semantics of
the scene, as well as the age, gender, identity, activity and even emotional
state of humans within that scene. This raises important questions about the
reach, lifespan, and potential misuse of this information. This paper is a call
to action to consider privacy in the context of robotic vision. We propose a
specific form of privacy preservation in which no images are captured or could be
reconstructed by an attacker even with full remote access. We present a set of
principles by which such systems can be designed, and through a case study in
localisation demonstrate in simulation a specific implementation that delivers
an important robotic capability in an inherently privacy-preserving manner.
This is a first step, and we hope to inspire future works that expand the range
of applications open to sighted robotic systems. Comment: 7 pages, 6 figures
Privacy-Preserving Visual Localization with Event Cameras
We present a robust, privacy-preserving visual localization algorithm using
event cameras. While event cameras can potentially enable robust localization
due to their high dynamic range and low motion blur, the sensors exhibit large
domain gaps that make it difficult to directly apply conventional image-based
localization algorithms. To mitigate the gap, we propose applying
event-to-image conversion prior to localization, which leads to stable
localization. From the privacy perspective, event cameras capture only a fraction
of visual information compared to normal cameras, and thus can naturally hide
sensitive visual details. To further enhance the privacy protection in our
event-based pipeline, we introduce privacy protection at two levels, namely
the sensor and network levels. Sensor-level protection aims at hiding facial
details with lightweight filtering, while network-level protection targets
hiding the entire user's view in private-scene applications using a novel
neural network inference pipeline. Both levels of protection involve
lightweight computation and incur only a small performance loss. We thus expect our method to serve as
a building block for practical location-based services using event cameras. The
code and dataset will be made public through the following link:
https://github.com/82magnolia/event_localization
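The abstract does not describe the specific event-to-image conversion used; a common baseline is to accumulate event polarities into a frame, sketched below. The function name and normalization scheme are assumptions for illustration, not the paper's actual (likely learned) converter.

```python
import numpy as np

def events_to_frame(events, height, width):
    """Accumulate a chunk of events (x, y, polarity in {-1, +1}) into a
    2D frame -- a simple stand-in for learned event-to-image conversion."""
    frame = np.zeros((height, width), dtype=np.float32)
    for x, y, p in events:
        frame[y, x] += p                      # sum signed polarities per pixel
    # rescale to [0, 1] for downstream image-based localization
    if frame.max() > frame.min():
        frame = (frame - frame.min()) / (frame.max() - frame.min())
    return frame
```

Note that such a frame only carries brightness-change information at edges, which is why event data naturally omits much of the fine visual detail a conventional camera would record.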