Leveraging Deep Visual Descriptors for Hierarchical Efficient Localization
Many robotics applications require precise pose estimates despite operating
in large and changing environments. This can be addressed by visual
localization, using a pre-computed 3D model of the surroundings. The pose
estimation then amounts to finding correspondences between 2D keypoints in a
query image and 3D points in the model using local descriptors. However,
computational power is often limited on robotic platforms, making this task
challenging in large-scale environments. Binary feature descriptors
significantly speed up this 2D-3D matching, and have become popular in the
robotics community, but also strongly impair the robustness to perceptual
aliasing and changes in viewpoint, illumination and scene structure. In this
work, we propose to leverage recent advances in deep learning to perform an
efficient hierarchical localization. We first localize at the map level using
learned image-wide global descriptors, and subsequently estimate a precise pose
from 2D-3D matches computed in the candidate places only. This restricts the local search and thus makes it possible to efficiently exploit powerful non-binary descriptors that are usually dismissed on resource-constrained devices. Our approach
results in state-of-the-art localization performance while running in real-time
on a popular mobile platform, enabling new prospects for robotics research.

Comment: CoRL 2018 camera-ready (fix typos and update citations)
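The coarse, map-level step of such a hierarchical pipeline amounts to a nearest-neighbour search over image-wide global descriptors; only the retrieved candidate places are then searched for 2D-3D matches with the expensive local descriptors. A minimal sketch of that retrieval step (the function name and toy descriptor values are illustrative assumptions, not from the paper):

```python
import numpy as np

def retrieve_candidates(query_desc, map_descs, k=5):
    """Coarse localization step: rank map images by global-descriptor
    cosine similarity and return the k best candidate places.

    query_desc: (D,) L2-normalized global descriptor of the query image.
    map_descs:  (N, D) L2-normalized descriptors of the map images.
    """
    sims = map_descs @ query_desc      # cosine similarity for unit vectors
    return np.argsort(-sims)[:k]

# toy example with hand-made 3-D "descriptors"
map_descs = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0],
                      [0.6, 0.8, 0.0]])
query = np.array([0.99, 0.10, 0.05])
query /= np.linalg.norm(query)
print(retrieve_candidates(query, map_descs, k=2))  # [0 3]
```

Precise pose estimation would then run 2D-3D matching only against the 3D points observed in these candidate places, which is what keeps the non-binary descriptors affordable.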
Training a Convolutional Neural Network for Appearance-Invariant Place Recognition
Place recognition is one of the most challenging problems in computer vision,
and has become a key part in mobile robotics and autonomous driving
applications for performing loop closure in visual SLAM systems. Moreover, the
difficulty of recognizing a revisited location increases with appearance
changes caused, for instance, by weather or illumination variations, which
hinders the long-term application of such algorithms in real environments. In
this paper we present a convolutional neural network (CNN), trained for the
first time with the purpose of recognizing revisited locations under severe
appearance changes, which maps images to a low dimensional space where
Euclidean distances represent place dissimilarity. In order for the network to
learn the desired invariances, we train it with triplets of images selected
from datasets which present a challenging variability in visual appearance. The
triplets are selected in such a way that two samples are from the same location
and the third one is taken from a different place. We validate our system
through extensive experimentation, where we demonstrate better performance than
state-of-the-art algorithms on a number of popular datasets.
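The triplet criterion described above, where Euclidean distance in the embedding space encodes place dissimilarity, is commonly written as a hinge loss. A minimal sketch under that assumption (function name, margin value, and toy vectors are illustrative, not the paper's exact formulation):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss over embedding distances: pull two views of the same
    place together and push a different place at least `margin` away."""
    d_pos = np.linalg.norm(anchor - positive)   # same-place distance
    d_neg = np.linalg.norm(anchor - negative)   # different-place distance
    return max(0.0, d_pos - d_neg + margin)

# toy 2-D embeddings
a = np.array([0.0, 0.0])
p = np.array([1.0, 0.0])          # same place, distance 1
n_far = np.array([3.0, 0.0])      # different place, distance 3
n_near = np.array([1.5, 0.0])     # different place, distance 1.5
print(triplet_loss(a, p, n_far))   # 0.0  (already separated by the margin)
print(triplet_loss(a, p, n_near))  # 0.5  (violates the margin)
```

Training on triplets drawn from visually challenging datasets is what forces the learned embedding to become invariant to weather and illumination changes rather than to scene content.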
Closed-Loop Learning of Visual Control Policies
In this paper we present a general, flexible framework for learning mappings
from images to actions by interacting with the environment. The basic idea is
to introduce a feature-based image classifier in front of a reinforcement
learning algorithm. The classifier partitions the visual space according to the
presence or absence of a few highly informative local descriptors that are
incrementally selected in a sequence of attempts to remove perceptual aliasing.
We also address the problem of fighting overfitting in such a greedy algorithm.
Finally, we show how high-level visual features can be generated when the power
of local descriptors is insufficient for completely disambiguating the aliased
states. This is done by building a hierarchy of composite features that consist
of recursive spatial combinations of visual features. We demonstrate the
efficacy of our algorithms by solving three visual navigation tasks and a
visual version of the classical Car on the Hill control problem.
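The partitioning idea, mapping an image to a discrete state through the presence or absence of the selected informative descriptors, might be sketched as follows (all names, the toy integer descriptors, and the matching predicate are hypothetical stand-ins for real local-feature matching):

```python
def perceptual_state(image_descriptors, selected_features, match):
    """Discretize the visual space: the state handed to the reinforcement
    learner is the presence/absence pattern of the incrementally
    selected informative descriptors."""
    return tuple(any(match(feat, d) for d in image_descriptors)
                 for feat in selected_features)

# toy example: descriptors are integers and "match" is plain equality
state = perceptual_state([3, 7], selected_features=[3, 5, 7],
                         match=lambda a, b: a == b)
print(state)  # (True, False, True)
```

Incrementally adding a feature splits every state in which that feature's presence differs, which is how the method removes perceptual aliasing one attempt at a time.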
DOOR-SLAM: Distributed, Online, and Outlier Resilient SLAM for Robotic Teams
To achieve collaborative tasks, robots in a team need to have a shared
understanding of the environment and their location within it. Distributed
Simultaneous Localization and Mapping (SLAM) offers a practical solution to
localize the robots without relying on an external positioning system (e.g.
GPS) and with minimal information exchange. Unfortunately, current distributed
SLAM systems are vulnerable to perception outliers and therefore tend to use
very conservative parameters for inter-robot place recognition. However, being
too conservative comes at the cost of rejecting many valid loop closure
candidates, which results in less accurate trajectory estimates. This paper
introduces DOOR-SLAM, a fully distributed SLAM system with an outlier rejection
mechanism that can work with less conservative parameters. DOOR-SLAM is based
on peer-to-peer communication and does not require full connectivity among the
robots. DOOR-SLAM includes two key modules: a pose graph optimizer combined
with a distributed pairwise consistent measurement set maximization algorithm
to reject spurious inter-robot loop closures; and a distributed SLAM front-end
that detects inter-robot loop closures without exchanging raw sensor data. The
system has been evaluated in simulations, benchmarking datasets, and field
experiments, including tests in GPS-denied subterranean environments. DOOR-SLAM
produces more inter-robot loop closures, successfully rejects outliers, and
results in accurate trajectory estimates, while requiring low communication
bandwidth. Full source code is available at
https://github.com/MISTLab/DOOR-SLAM.git.

Comment: 8 pages, 11 figures, 2 tables
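The outlier-rejection idea, retaining the largest set of mutually consistent inter-robot loop closures, can be viewed as a maximum-clique search in a pairwise-consistency graph. A brute-force sketch of that criterion (the real system uses an efficient distributed solver, and the scalar "measurements" and consistency check below are illustrative assumptions):

```python
from itertools import combinations

def max_consistent_set(measurements, consistent):
    """Return the largest subset of loop-closure measurements in which
    every pair passes the consistency check, i.e. a maximum clique in
    the pairwise-consistency graph (brute force, for illustration)."""
    n = len(measurements)
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if all(consistent(measurements[i], measurements[j])
                   for i, j in combinations(subset, 2)):
                return [measurements[i] for i in subset]
    return []

# toy example: scalars are "consistent" if they differ by at most 2;
# the outlier 10 is rejected while the mutually consistent 1, 2, 3 survive
print(max_consistent_set([1, 2, 3, 10], lambda a, b: abs(a - b) <= 2))
```

Because spurious loop closures rarely agree with each other up to odometry, this pairwise test lets the system accept many more candidates than a conservative per-measurement threshold would.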
Incremental topological mapping using omnidirectional vision
This paper presents an algorithm that builds topological maps using omnidirectional vision as the only sensor modality. Local features are extracted from images obtained in sequence, and are used both to cluster the images into nodes and to detect links between the nodes. The algorithm is incremental, reducing the computational requirements of the corresponding batch algorithm. Experimental results in a complex indoor environment show that the algorithm produces topologically correct maps, closing loops without suffering from perceptual aliasing or false links. Robustness to lighting variations was further demonstrated by building correct maps from multiple combined datasets collected over a period of two months.
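The incremental scheme above, assigning each new image to an existing node when it is similar enough and otherwise opening a new node linked to the previous one, can be sketched as follows (the scalar "images", similarity function, and threshold are illustrative stand-ins for local-feature matching):

```python
def incremental_topological_map(images, similarity, threshold):
    """Incrementally cluster a sequence of images into nodes and link
    consecutively visited nodes. Revisiting an existing node (a loop
    closure) adds no new node, only (at most) a new edge."""
    nodes = []        # one representative image per node
    edges = set()
    prev = None
    for img in images:
        match = next((i for i, rep in enumerate(nodes)
                      if similarity(img, rep) >= threshold), None)
        if match is None:                 # unfamiliar place: open a node
            nodes.append(img)
            match = len(nodes) - 1
        if prev is not None and prev != match:
            edges.add((min(prev, match), max(prev, match)))
        prev = match
    return nodes, edges

# toy run: scalar "images" with similarity = 1 - |a - b|; the last image
# closes a loop back to the first node instead of creating a false node
nodes, edges = incremental_topological_map(
    [0.0, 0.1, 1.0, 1.1, 0.05], lambda a, b: 1 - abs(a - b), 0.8)
print(len(nodes), edges)  # 2 {(0, 1)}
```

Processing each image as it arrives, rather than clustering the whole sequence at once, is what distinguishes the incremental algorithm from its batch counterpart.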
- …