A Unified Framework for Mutual Improvement of SLAM and Semantic Segmentation
This paper presents a novel framework for simultaneously performing localization and segmentation, two of the most important vision-based tasks in robotics. While these tasks were previously considered to have different goals and techniques, we show that by making use of the intermediate results of the two modules, the performance of both can be enhanced at the same time. With the help of the segmentation result, our framework can handle both the instantaneous motion and the long-term changes of instances during localization; segmentation in turn benefits from the refined 3D pose information. We conduct experiments on various datasets and show that our framework effectively improves the precision and robustness of both tasks and outperforms existing localization and segmentation algorithms.
Comment: 7 pages, 5 figures. This work has been accepted by ICRA 2019. The demo video can be found at https://youtu.be/Bkt53dAehj
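The feedback loop described above can be summarized in pseudocode. The following is a minimal sketch, assuming hypothetical SlamModule and SegmentationModule interfaces (the paper does not specify these names); it shows only the bidirectional exchange of intermediate results, not the authors' actual implementation.

    import numpy as np

    class SlamModule:
        """Hypothetical stand-in for the localization module."""
        def track(self, frame, dynamic_mask):
            # Estimate the camera pose; features under dynamic_mask
            # would be ignored as belonging to moving instances.
            return np.eye(4)  # placeholder pose (4x4 SE(3) matrix)

    class SegmentationModule:
        """Hypothetical stand-in for the segmentation module."""
        def segment(self, frame, pose_prior):
            # Segment the frame; pose_prior lets previous labels be
            # reprojected into the current view as a geometric prior.
            return np.zeros(frame.shape[:2], dtype=bool)  # placeholder mask

    def process_sequence(frames):
        slam, seg = SlamModule(), SegmentationModule()
        dynamic_mask = None  # no segmentation information for the first frame
        poses = []
        for frame in frames:
            # Localization uses the latest segmentation to mask moving instances.
            pose = slam.track(frame, dynamic_mask)
            # Segmentation uses the refined pose as a geometric prior.
            dynamic_mask = seg.segment(frame, pose_prior=pose)
            poses.append(pose)
        return poses

    frames = [np.zeros((4, 6, 3)) for _ in range(3)]  # toy image sequence
    print(len(process_sequence(frames)))  # 3 poses, one per frame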
Dynamic Body VSLAM with Semantic Constraints
Image-based reconstruction of urban environments is a challenging problem that involves optimizing a large number of variables and is subject to several sources of error, such as the presence of dynamic objects. Since most large-scale approaches assume a static scene, dynamic objects are relegated to the noise-modeling section of such systems. This is an approach of convenience, since the RANSAC-based framework used to compute most multiview geometric quantities for static scenes naturally confines dynamic objects to the class of outlier measurements. However, reconstructing dynamic objects along with the static environment gives us a complete picture of an urban environment, an understanding that can then be used for important robotic tasks such as path planning for autonomous navigation and obstacle tracking and avoidance. In this paper, we propose a system for robust SLAM that works in both static and dynamic environments. To overcome the challenge of dynamic objects in the scene, we propose a new model that incorporates semantic constraints into the reconstruction algorithm. While some of these constraints are based on multi-layered dense CRFs trained over appearance as well as motion cues, other proposed constraints can be expressed as additional terms in the bundle adjustment optimization, which iteratively refines the 3D structure and the camera and object motion trajectories. We show results on the challenging KITTI urban dataset for the accuracy of motion segmentation and for the reconstruction of the trajectory and shape of moving objects relative to ground truth. We demonstrate a significant reduction in average relative error for moving-object trajectory reconstruction compared with state-of-the-art methods such as VISO 2, as well as with standard bundle adjustment algorithms.
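To illustrate the general idea of expressing a semantic constraint as an additional bundle adjustment term, here is a minimal sketch (not the paper's formulation): a single 3D point is optimized against its reprojections, with a hypothetical extra residual that softly pulls a point labeled "road" onto a known ground plane. The toy pinhole camera, the plane prior, and the weight w_sem are all assumptions for illustration.

    import numpy as np
    from scipy.optimize import least_squares

    def project(point, cam_t):
        # Toy pinhole camera at translation cam_t, identity rotation, f = 1.
        p = point - cam_t
        return p[:2] / p[2]

    def residuals(point, observations, cam_ts, plane_n, plane_d, w_sem):
        res = []
        for obs, cam_t in zip(observations, cam_ts):
            res.extend(project(point, cam_t) - obs)  # reprojection error
        # Semantic term: signed distance of the point to the labeled plane,
        # weighted by w_sem (how much we trust the semantic label).
        res.append(w_sem * (plane_n @ point + plane_d))
        return np.asarray(res)

    cam_ts = [np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])]
    true_point = np.array([0.5, -1.0, 4.0])
    observations = [project(true_point, t) for t in cam_ts]
    plane_n, plane_d = np.array([0.0, 1.0, 0.0]), 1.0  # ground plane y = -1

    sol = least_squares(residuals, x0=np.array([0.0, 0.0, 2.0]),
                        args=(observations, cam_ts, plane_n, plane_d, 0.5))
    print(sol.x)  # converges near true_point, consistent with the plane prior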
Visual SLAM in Changing Environments
This thesis investigates the problem of Visual Simultaneous Localization and Mapping (vSLAM) in changing environments. The vSLAM problem is to sequentially estimate the pose of a device with mounted cameras within a map generated from images taken with those cameras. vSLAM algorithms face two main challenges in changing environments: moving objects and temporal appearance changes. Moving objects cause problems in pose estimation if they are mistaken for static objects. They also cause problems for loop closure detection (LCD), the problem of detecting whether a previously visited place has been revisited: the same moving object observed in two different places may cause false loop closures to be detected. Temporal appearance changes, such as those brought about by time of day or weather, cause long-term data association errors for LCD, making it difficult to recognize previously visited places after their appearance has changed. Focus is placed on LCD, which turns out to be the part of vSLAM that changing environments affect the most. In addition, several techniques and algorithms for Visual Place Recognition (VPR) in challenging conditions that could be used in the context of LCD are surveyed, and the performance of two state-of-the-art VPR algorithms in changing environments is assessed in an experiment to measure their applicability for LCD. The most severe performance-degrading appearance changes are found to be those caused by changes in season and illumination. The survey identifies several algorithms and techniques that perform well in loop-closure-related tasks under specific environmental conditions. Finally, a limited experiment on the Nordland dataset suggests that the tested VPR algorithms are usable as-is, or can be modified, for long-term LCD. As part of the experiment, a simple new neighborhood consistency check was also developed, evaluated, and found to be effective at reducing the false positives output by the tested VPR algorithms.
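A neighborhood consistency check of this kind can be sketched as follows. This is an assumed implementation, not necessarily the one developed in the thesis: a candidate match between query frame i and database frame j is kept only if a majority of the temporally neighboring query frames also match database frames near j, which suppresses isolated false positives.

    def consistent_matches(best_match, radius=2, tolerance=3):
        """best_match[i] = database index that VPR ranked first for query i."""
        n = len(best_match)
        accepted = []
        for i in range(n):
            j = best_match[i]
            neighbors = [k for k in range(max(0, i - radius),
                                          min(n, i + radius + 1)) if k != i]
            # Count neighbors whose own best match lands near j, offset by
            # their temporal distance from i (assumes similar traversal speed).
            support = sum(abs(best_match[k] - (j + (k - i))) <= tolerance
                          for k in neighbors)
            if 2 * support >= len(neighbors):  # require majority agreement
                accepted.append((i, j))
        return accepted

    # Toy usage: queries 0-9 revisit database places 10-19, except query 7,
    # whose isolated match to place 40 is rejected by the check.
    matches = [10, 11, 12, 13, 14, 15, 16, 40, 18, 19]
    print(consistent_matches(matches))  # every pair except (7, 40)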
Exposing the Unseen: Exposure Time Emulation for Offline Benchmarking of Vision Algorithms
Visual Odometry (VO) is one of the fundamental tasks in computer vision for robotics. However, its performance is deeply affected by High Dynamic Range (HDR) scenes, which are omnipresent outdoors. While new Automatic-Exposure (AE) approaches to mitigate this have appeared, comparing them in a reproducible manner is problematic. This stems from the fact that the behavior of AE depends on the environment and affects the image acquisition process. Consequently, AE has traditionally only been benchmarked in an online manner, making the experiments non-reproducible. To solve this, we propose a new methodology based on an emulator that can generate images at any exposure time. It leverages BorealHDR, a unique multi-exposure stereo dataset collected over 8.4 km on 50 trajectories with challenging illumination conditions. Moreover, it contains pose ground truth for each image and a global 3D map based on lidar data. We show that, using these images acquired at different exposure times, we can emulate realistic images while keeping the Root-Mean-Square Error (RMSE) below 1.78 % compared to ground-truth images. To demonstrate the practicality of our approach for offline benchmarking, we compared three state-of-the-art AE algorithms on key elements of the Visual Simultaneous Localization and Mapping (VSLAM) pipeline against four baselines. Consequently, reproducible evaluation of AE is now possible, speeding up the development of future approaches. Our code and dataset are available online at this link: https://github.com/norlab-ulaval/BorealHDR
Comment: 6 pages, 6 figures, submitted to 2024 IEEE International Conference on Robotics and Automation (ICRA 2024).
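One plausible way to emulate an arbitrary exposure time from a multi-exposure bracket is sketched below; this is an assumption for illustration, and the BorealHDR emulator may work differently. Assuming an approximately linear radiometric response, pixel value scales with irradiance times exposure time, so we pick the bracketed capture closest to the target exposure and rescale it.

    import numpy as np

    def emulate_exposure(bracket, target_t):
        """bracket: dict {exposure_time_s: float image in [0, 1]}."""
        # Choose the capture with the smallest log-ratio to the target time,
        # which minimizes the rescaling factor and thus clipping artifacts.
        src_t = min(bracket, key=lambda t: abs(np.log(t / target_t)))
        scale = target_t / src_t
        return np.clip(bracket[src_t] * scale, 0.0, 1.0)

    # Toy usage with synthetic irradiance: three bracketed captures of the
    # same static scene, then emulation of an intermediate exposure.
    rng = np.random.default_rng(0)
    irradiance = rng.uniform(0.0, 2.0, size=(4, 4))
    bracket = {t: np.clip(irradiance * t, 0.0, 1.0) for t in (0.25, 0.5, 1.0)}
    emulated = emulate_exposure(bracket, target_t=0.7)
    reference = np.clip(irradiance * 0.7, 0.0, 1.0)
    rmse = np.sqrt(np.mean((emulated - reference) ** 2))
    print(f"RMSE vs. reference: {rmse:.4f}")  # near zero for a linear sensor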