Efficient Decentralized Visual Place Recognition From Full-Image Descriptors
In this paper, we discuss the adaptation of our decentralized place recognition method described in [1] to full-image descriptors. As we have shown, the key to making decentralized visual place recognition scalable lies in exploiting deterministic key assignment in a distributed key-value map. Through this, casting visual place recognition as a key-value lookup problem makes it possible to reduce bandwidth by up to a factor of n, the robot count. In [1], we exploited this for the bag-of-words method [3], [4]. Our way of casting bag-of-words, however, results in a complex decentralized system, which has inherently worse recall than its centralized counterpart. In this paper, we instead start from the recent full-image description method NetVLAD [5]. As we show, casting this to a key-value lookup problem can be achieved with k-means clustering, and results in a much simpler system than [1]. The resulting system still has some flaws, albeit of a completely different nature: it suffers when the environment seen during deployment follows a different distribution in feature space than the environment seen during training.

Comment: 3 pages, 4 figures. This is a self-published paper that accompanies our original work [1] as well as the ICRA 2017 Workshop on Multi-robot Perception-Driven Control and Planning [2].
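The key mechanism in this abstract, turning place recognition into a key-value lookup over a shared k-means codebook, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names, the modulo cluster-to-robot rule, and the distance threshold are assumptions made for brevity.

```python
import numpy as np

def responsible_robot(descriptor, centroids, num_robots):
    """Deterministically map a full-image (e.g., NetVLAD) descriptor to one robot."""
    # The nearest k-means centroid in descriptor space acts as the key...
    cluster_id = int(np.argmin(np.linalg.norm(centroids - descriptor, axis=1)))
    # ...and a fixed key-to-robot rule decides who stores and answers queries for it.
    return cluster_id % num_robots

class PlaceIndexShard:
    """Fraction of the distributed place database held by a single robot."""
    def __init__(self):
        self.entries = []  # (descriptor, owner_robot, frame_id)

    def insert(self, descriptor, owner, frame_id):
        self.entries.append((descriptor, owner, frame_id))

    def query(self, descriptor, threshold=0.1):
        # Return stored places within an (illustrative) distance threshold.
        return [(owner, fid) for d, owner, fid in self.entries
                if np.linalg.norm(d - descriptor) < threshold]
```

Because every robot evaluates the same centroid-to-robot rule, a query descriptor is sent to exactly one peer instead of being broadcast to all n robots, which is where the factor-of-n bandwidth reduction comes from.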
Data-Efficient Decentralized Visual SLAM
Decentralized visual simultaneous localization and mapping (SLAM) is a powerful tool for multi-robot applications in environments where absolute positioning systems are not available. Being visual, it relies on cameras: cheap, lightweight, and versatile sensors. Being decentralized, it does not rely on communication to a central ground station. In this work, we integrate state-of-the-art decentralized SLAM components into a new, complete decentralized visual SLAM system. To allow for data association and co-optimization, existing decentralized visual SLAM systems regularly exchange the full map data between all robots, incurring large data transfers at a complexity that scales quadratically with the robot count. In contrast, our method performs efficient data association in two stages: in the first stage, a compact full-image descriptor is deterministically sent to only one robot; in the second stage, which is only executed if the first stage succeeded, the data required for relative pose estimation is sent, again to only one robot. Thus, data association scales linearly with the robot count and uses highly compact place representations. For optimization, a state-of-the-art decentralized pose-graph optimization method is used. It exchanges a minimal amount of data, which scales linearly with trajectory overlap. We characterize the resulting system and identify bottlenecks in its components. The system is evaluated on publicly available data and we provide open access to the code.

Comment: 8 pages, submitted to ICRA 201
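A hedged sketch of the two-stage data association described above is shown below. Only the structure (compact descriptor first, geometry only after a hit, each time to a single deterministically chosen robot) follows the abstract; the message fields, the network interface, and the helper names are illustrative assumptions.

```python
import numpy as np

def responsible_robot(descriptor, centroids, num_robots):
    # Same deterministic centroid-to-robot rule as in the sketch further above.
    return int(np.argmin(np.linalg.norm(centroids - descriptor, axis=1))) % num_robots

def associate(frame, centroids, num_robots, my_id, network):
    # Stage 1: send only the compact full-image descriptor to one robot,
    # which replies with candidate place matches (possibly none).
    target = responsible_robot(frame.netvlad, centroids, num_robots)
    candidates = network.send_to(target, {"type": "query", "sender": my_id,
                                          "descriptor": frame.netvlad})
    if not candidates:
        return None  # no match: nothing further is transmitted

    # Stage 2: only on success, send the heavier data needed for relative
    # pose estimation (keypoints and local descriptors), again to one robot.
    reply = network.send_to(candidates[0]["owner"],
                            {"type": "geometry", "sender": my_id,
                             "frame_id": frame.id,
                             "keypoints": frame.keypoints,
                             "local_descriptors": frame.local_descriptors})
    return reply.get("relative_pose")  # inter-robot loop closure, if verified
```

Per query, the traffic is one descriptor plus, only on a hit, one geometric payload, each sent to a single peer, so the exchanged data grows linearly rather than quadratically with the number of robots.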
Leveraging Deep Visual Descriptors for Hierarchical Efficient Localization
Many robotics applications require precise pose estimates despite operating
in large and changing environments. This can be addressed by visual
localization, using a pre-computed 3D model of the surroundings. The pose
estimation then amounts to finding correspondences between 2D keypoints in a
query image and 3D points in the model using local descriptors. However,
computational power is often limited on robotic platforms, making this task
challenging in large-scale environments. Binary feature descriptors
significantly speed up this 2D-3D matching, and have become popular in the
robotics community, but also strongly impair the robustness to perceptual
aliasing and changes in viewpoint, illumination and scene structure. In this
work, we propose to leverage recent advances in deep learning to perform an
efficient hierarchical localization. We first localize at the map level using
learned image-wide global descriptors, and subsequently estimate a precise pose
from 2D-3D matches computed in the candidate places only. This restricts the local search and thus allows us to efficiently exploit powerful non-binary descriptors usually dismissed on resource-constrained devices. Our approach results in state-of-the-art localization performance while running in real time on a popular mobile platform, enabling new prospects for robotics research.

Comment: CoRL 2018 Camera-ready (fix typos and update citations)
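The coarse-to-fine pipeline described above can be sketched with off-the-shelf components. This is a simplified illustration using NumPy and OpenCV; the retrieval of five candidates, the brute-force matcher, the inlier thresholds, and the data layout are assumptions, not the paper's exact components.

```python
import numpy as np
import cv2

def hierarchical_localize(query_global, query_kpts, query_descs, db_globals, db_places, K):
    # Coarse step: retrieve candidate places with the image-wide global
    # descriptor (plain L2 nearest neighbours here).
    candidates = np.argsort(np.linalg.norm(db_globals - query_global, axis=1))[:5]

    # Fine step: 2D-3D matching restricted to the candidate places only,
    # followed by PnP + RANSAC for a precise 6-DoF pose.
    matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
    for idx in candidates:
        place = db_places[idx]  # dict with local "descriptors" and "points3d"
        matches = matcher.match(query_descs, place["descriptors"])
        if len(matches) < 12:
            continue
        pts2d = np.float32([query_kpts[m.queryIdx] for m in matches])
        pts3d = np.float32([place["points3d"][m.trainIdx] for m in matches])
        ok, rvec, tvec, inliers = cv2.solvePnPRansac(pts3d, pts2d, K, None)
        if ok and inliers is not None and len(inliers) >= 12:
            return rvec, tvec  # pose of the query camera relative to the map
    return None
```

Restricting the expensive 2D-3D matching to a handful of retrieved places is what lets the heavier, non-binary local descriptors stay affordable on a mobile platform.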
CAPRICORN: Communication Aware Place Recognition using Interpretable Constellations of Objects in Robot Networks
Using multiple robots for exploring and mapping environments can provide improved robustness and performance, but it can be difficult to implement. In particular, limited communication bandwidth is a considerable constraint when a robot needs to determine whether it has visited a location that was previously explored by another robot, as this requires robots to share descriptions of the places they have visited. One way to compress such a description is to use constellations: groups of 3D points that correspond to the estimated relative positions of a set of objects. Constellations maintain the same pattern from different viewpoints and can be robust to illumination changes or dynamic elements. We present a method to extract compact spatial and semantic descriptors of the objects in a scene from these constellations. We use this representation in a two-step decentralized loop closure verification: first, we distribute the compact semantic descriptors to determine which other robots might have seen scenes with similar objects; then we query the matching robots with the full constellation to validate the match using geometric information. The proposed method requires less memory, is more interpretable than global image descriptors, and could be useful for other tasks and interactions with the environment. We validate our system's performance on a TUM RGB-D SLAM sequence and show its benefits in terms of bandwidth requirements.

Comment: 8 pages, 6 figures, 1 table. 2020 IEEE International Conference on Robotics and Automation (ICRA)
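The two-step verification above can be illustrated with a toy implementation. The semantic descriptor (a normalized histogram of object classes) and the geometric test (comparison of sorted pairwise inter-object distances, which are viewpoint-invariant) are simplified stand-ins chosen for this sketch, not the paper's exact descriptors.

```python
import numpy as np

def semantic_descriptor(object_labels, num_classes):
    """Step 1 payload: a compact, normalized histogram of object classes."""
    hist = np.bincount(object_labels, minlength=num_classes).astype(float)
    return hist / max(hist.sum(), 1.0)

def semantic_similarity(h1, h2):
    """Histogram intersection: high only if two scenes contain similar objects."""
    return float(np.minimum(h1, h2).sum())

def pairwise_distances(points3d):
    d = np.linalg.norm(points3d[:, None, :] - points3d[None, :, :], axis=-1)
    return np.sort(d[np.triu_indices(len(points3d), k=1)])

def geometric_check(constellation_a, constellation_b, tol=0.2):
    """Step 2: verify the candidate with the full constellation. Sorted pairwise
    distances between objects are invariant to viewpoint (rotation/translation)."""
    if len(constellation_a) != len(constellation_b):
        return False
    return bool(np.allclose(pairwise_distances(constellation_a),
                            pairwise_distances(constellation_b), atol=tol))
```

Only robots whose semantic similarity exceeds a threshold are queried with the full constellation, so the cheap first step bounds how often the larger geometric payload has to be transmitted.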
DOOR-SLAM: Distributed, Online, and Outlier Resilient SLAM for Robotic Teams
To achieve collaborative tasks, robots in a team need to have a shared
understanding of the environment and their location within it. Distributed
Simultaneous Localization and Mapping (SLAM) offers a practical solution to
localize the robots without relying on an external positioning system (e.g.
GPS) and with minimal information exchange. Unfortunately, current distributed
SLAM systems are vulnerable to perception outliers and therefore tend to use
very conservative parameters for inter-robot place recognition. However, being
too conservative comes at the cost of rejecting many valid loop closure
candidates, which results in less accurate trajectory estimates. This paper
introduces DOOR-SLAM, a fully distributed SLAM system with an outlier rejection
mechanism that can work with less conservative parameters. DOOR-SLAM is based
on peer-to-peer communication and does not require full connectivity among the
robots. DOOR-SLAM includes two key modules: a pose graph optimizer combined
with a distributed pairwise consistent measurement set maximization algorithm
to reject spurious inter-robot loop closures; and a distributed SLAM front-end
that detects inter-robot loop closures without exchanging raw sensor data. The
system has been evaluated in simulations, benchmarking datasets, and field
experiments, including tests in GPS-denied subterranean environments. DOOR-SLAM
produces more inter-robot loop closures, successfully rejects outliers, and
results in accurate trajectory estimates, while requiring low communication
bandwidth. Full source code is available at
https://github.com/MISTLab/DOOR-SLAM.git.

Comment: 8 pages, 11 figures, 2 tables
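The pairwise-consistency idea behind the outlier rejection module described above can be sketched as follows. This is a simplified, centralized illustration: poses are 3x3 homogeneous 2D transforms, only the translation error of the closed cycle is checked, and the maximum-clique search uses networkx; none of these choices are claimed to match the paper's exact formulation.

```python
import numpy as np
import networkx as nx

def cycle_error(lc_i, lc_j, odom_a, odom_b):
    """Chain loop closure i, robot B's odometry, the inverse of loop closure j,
    and robot A's odometry; a consistent pair of measurements yields ~identity.
    lc["T"] maps robot B's pose frame to robot A's; odom_x(p, q) returns the
    transform from pose p to pose q along robot X's own trajectory."""
    loop = (lc_i["T"] @ odom_b(lc_i["ib"], lc_j["ib"])
            @ np.linalg.inv(lc_j["T"]) @ odom_a(lc_j["ia"], lc_i["ia"]))
    return np.linalg.norm(loop[:2, 2])  # translation error of the closed cycle

def select_consistent_loop_closures(loop_closures, odom_a, odom_b, thresh=0.5):
    """Keep the largest set of mutually (pairwise) consistent measurements."""
    if not loop_closures:
        return []
    g = nx.Graph()
    g.add_nodes_from(range(len(loop_closures)))
    for i in range(len(loop_closures)):
        for j in range(i + 1, len(loop_closures)):
            if cycle_error(loop_closures[i], loop_closures[j], odom_a, odom_b) < thresh:
                g.add_edge(i, j)
    # The maximum clique of the consistency graph is the accepted inlier set.
    return max(nx.find_cliques(g), key=len)
```

Spurious place matches rarely agree both with each other and with the robots' odometry, so they tend to fall outside the maximum clique, which is what allows the front-end to run with less conservative acceptance thresholds.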
Data Efficient Visual Place Recognition Using Extremely JPEG-Compressed Images
Visual Place Recognition (VPR) is the ability of a robotic platform to correctly interpret visual stimuli from its on-board cameras in order to determine whether it is currently located in a previously visited place, despite viewpoint, illumination and appearance changes. JPEG is a widely used image compression standard that can significantly reduce the size of an image at the cost of image clarity. For applications where several robotic platforms are deployed simultaneously, the gathered visual data must be transmitted between the robots. Hence, JPEG compression can be employed to drastically reduce the amount of data transmitted over a communication channel, as the limited bandwidth available for VPR can prove to be a challenging constraint. However, the effects of JPEG compression on the performance of current VPR techniques have not been previously studied. For this reason, this paper presents an in-depth study of JPEG compression in VPR-related scenarios. We use a selection of well-established VPR techniques on 8 datasets with various amounts of compression applied. We show that introducing compression drastically reduces VPR performance, especially at the higher end of the compression range. To overcome the negative effects of JPEG compression on VPR performance, we present a fine-tuned CNN optimized for JPEG-compressed data and show that it performs more consistently under the image degradations found in extremely compressed JPEG images.

Comment: 8 pages, 8 figures
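The kind of experiment described above, re-encoding dataset images at decreasing JPEG quality and recording both matching performance and transmitted bytes, can be sketched as follows. The descriptor function and the ground-truth convention (query i should match reference i) are placeholder assumptions to keep the example self-contained.

```python
import io
import numpy as np
from PIL import Image

def jpeg_recompress(image, quality):
    """Re-encode a PIL image at the given JPEG quality; return the image and its size in bytes."""
    buf = io.BytesIO()
    image.save(buf, format="JPEG", quality=quality)
    return Image.open(io.BytesIO(buf.getvalue())), buf.tell()

def sweep_compression(query_images, reference_descriptors, describe,
                      qualities=(90, 50, 20, 5)):
    """describe(image) -> 1-D descriptor; any VPR technique can be plugged in."""
    results = {}
    for q in qualities:
        correct, total_bytes = 0, 0
        for i, img in enumerate(query_images):
            compressed, nbytes = jpeg_recompress(img, q)
            total_bytes += nbytes
            d = describe(compressed)
            best = int(np.argmin(np.linalg.norm(reference_descriptors - d, axis=1)))
            correct += int(best == i)  # assumes query i matches reference i
        results[q] = {"recall_at_1": correct / len(query_images),
                      "bytes_transmitted": total_bytes}
    return results
```

Plotting recall against bytes transmitted for each quality setting makes the trade-off between bandwidth savings and VPR performance directly visible.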
Are State-of-the-art Visual Place Recognition Techniques any Good for Aerial Robotics?
Visual Place Recognition (VPR) has seen significant advances at the frontiers of matching performance and computational efficiency over the past few years. However, these evaluations are performed for ground-based mobile platforms and cannot be generalized to aerial platforms. The degree of viewpoint variation experienced by aerial robots is complex, and their processing power and on-board memory are limited by payload size and battery ratings. Therefore, in this paper, we collect state-of-the-art VPR techniques that have been previously evaluated for ground-based platforms and compare them on recently proposed aerial place recognition datasets with three prime focuses: a) matching performance, b) processing power consumption, and c) projected memory requirements. This gives a bird's-eye view of the applicability of contemporary VPR research to aerial robotics and lays down the nature of the challenges for aerial VPR.
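The three comparison axes above can be profiled with a small harness like the one below: recall@1 for matching performance, per-image encoding time as a proxy for processing cost, and descriptor storage as the projected map memory. The describe callback and the ground-truth convention (query i matches reference i) are assumptions for illustration only.

```python
import time
import numpy as np

def profile_vpr_technique(describe, query_images, reference_images):
    """describe(image) -> 1-D descriptor for any VPR technique under test."""
    ref_descs = np.stack([describe(img) for img in reference_images])
    correct, encode_time = 0, 0.0
    for i, img in enumerate(query_images):
        t0 = time.perf_counter()
        d = describe(img)
        encode_time += time.perf_counter() - t0
        correct += int(np.argmin(np.linalg.norm(ref_descs - d, axis=1)) == i)
    return {
        "recall_at_1": correct / len(query_images),                       # matching performance
        "encode_ms_per_image": 1000.0 * encode_time / len(query_images),  # processing-cost proxy
        "map_memory_kb": ref_descs.nbytes / 1024.0,                       # projected reference-map storage
    }
```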
Kimera-Multi: Robust, Distributed, Dense Metric-Semantic SLAM for Multi-Robot Systems
This paper presents Kimera-Multi, the first multi-robot system that (i) is robust and capable of identifying and rejecting incorrect inter- and intra-robot loop closures resulting from perceptual aliasing, (ii) is fully distributed and only relies on local (peer-to-peer) communication to achieve distributed localization and mapping, and (iii) builds a globally consistent metric-semantic 3D mesh model of the environment in real time, where faces of the mesh are annotated with semantic labels. Kimera-Multi is implemented by a team of robots equipped with visual-inertial sensors. Each robot builds a local trajectory estimate and a local mesh using Kimera. When communication is available, robots initiate a distributed place recognition and robust pose graph optimization protocol based on a novel distributed graduated non-convexity algorithm. The proposed protocol allows the robots to improve their local trajectory estimates by leveraging inter-robot loop closures while being robust to outliers. Finally, each robot uses its improved trajectory estimate to correct the local mesh using mesh deformation techniques.
We demonstrate Kimera-Multi in photo-realistic simulations, SLAM benchmarking datasets, and challenging outdoor datasets collected using ground robots. Both real and simulated experiments involve long trajectories (e.g., up to 800 meters per robot). The experiments show that Kimera-Multi (i) outperforms the state of the art in terms of robustness and accuracy, (ii) achieves estimation errors comparable to a centralized SLAM system while being fully distributed, (iii) is parsimonious in terms of communication bandwidth, (iv) produces accurate metric-semantic 3D meshes, and (v) is modular and can also be used for standard 3D reconstruction (i.e., without semantic labels) or for trajectory estimation (i.e., without reconstructing a 3D mesh).

Comment: Accepted by IEEE Transactions on Robotics (18 pages, 15 figures)
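The robust back-end above builds on graduated non-convexity (GNC). Below is a minimal, centralized sketch of a GNC loop with a truncated least-squares cost; the inner weighted solver is abstracted as a callback (in the paper it is a distributed pose graph optimizer), and the annealing factor and convergence test are illustrative choices rather than the authors' settings.

```python
import numpy as np

def gnc_tls(solve_weighted, num_measurements, eps, max_iters=100, mu_growth=1.4):
    """solve_weighted(weights) -> per-measurement residual norms after a
    weighted least-squares solve. eps is the inlier threshold of the truncated cost."""
    weights = np.ones(num_measurements)
    residuals = solve_weighted(weights)
    # Start from a heavily convexified surrogate of the truncated cost.
    mu = eps**2 / max(2.0 * float(np.max(residuals))**2 - eps**2, 1e-9)
    for _ in range(max_iters):
        residuals = solve_weighted(weights)
        r2 = np.maximum(residuals**2, 1e-12)
        # Closed-form weight update for the truncated-least-squares surrogate.
        new_weights = np.clip(np.sqrt(eps**2 * mu * (mu + 1.0) / r2) - mu, 0.0, 1.0)
        if np.allclose(new_weights, weights, atol=1e-3):
            break
        weights = new_weights
        mu *= mu_growth  # anneal back towards the original non-convex cost
    return weights  # ~1 for inlier loop closures, ~0 for rejected outliers
```

In the distributed setting, each call to solve_weighted corresponds to a round of distributed pose graph optimization, and the measurements whose weights end up near zero are the perceptual-aliasing loop closures the system discards.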