Visual 3-D SLAM from UAVs
This paper presents, tests and discusses the implementation of Visual SLAM techniques on images taken from Unmanned Aerial Vehicles (UAVs) flying outdoors in partially structured environments. Each stage of the process is discussed with the aim of obtaining more accurate localization and mapping from UAV flights. First, the issues related to the visual features of objects in the scene, their distance to the UAV, and the image acquisition system and its calibration are evaluated. Further issues concern the image processing techniques, such as interest point detection, the matching procedure and the scaling factor. The whole system has been tested using the COLIBRI mini UAV in partially structured environments. The localization results, tested against the GPS information of the flights, show that Visual SLAM delivers reliable localization and mapping, making it suitable for some outdoor applications when flying UAVs.
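As a rough illustration of the interest point detection and matching stage the abstract refers to, the sketch below uses OpenCV with ORB and Lowe's ratio test; the paper's actual detector and matching parameters are not specified here, so these choices are assumptions.

```python
import cv2

def match_frames(img_a, img_b):
    """Detect interest points in two UAV frames and match their descriptors."""
    orb = cv2.ORB_create(nfeatures=1000)           # detector choice is illustrative
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)

    # Brute-force Hamming matching with a ratio test to reject ambiguous matches.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    matches = matcher.knnMatch(des_a, des_b, k=2)
    good = []
    for pair in matches:
        if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
            good.append(pair[0])
    return kp_a, kp_b, good
```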
PHROG: A Multimodal Feature for Place Recognition
Long-term place recognition in outdoor environments remains a challenge due to strong appearance changes in the environment. The problem becomes even more difficult when the matching between two scenes has to be made with information coming from different visual sources, particularly with different spectral ranges. For instance, an infrared camera is helpful for night vision in combination with a visible camera. In this paper, we focus on testing the usual feature point extractors under both constraints: repeatability across spectral ranges and long-term appearance. We develop a new feature extraction method dedicated to improving repeatability across spectral ranges. We conduct an evaluation of feature robustness on long-term datasets coming from different imaging sources (optics, sensor sizes and spectral ranges) with a Bag-of-Words approach. Our tests demonstrate that our method brings a significant improvement on the image retrieval issue in a visual place recognition context, particularly when images from various spectral ranges, such as infrared and visible, need to be associated: we evaluate our approach using visible, Near InfraRed (NIR), Short Wavelength InfraRed (SWIR) and Long Wavelength InfraRed (LWIR) imagery.
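A minimal sketch of the Bag-of-Words retrieval pipeline used for the evaluation follows; the vocabulary size and clustering method are assumptions, and the local descriptors stacked into the vocabulary would be PHROG features in the paper's setting.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_vocabulary(all_descriptors, n_words=256):
    """Cluster local descriptors from training images into visual words."""
    return KMeans(n_clusters=n_words, n_init=4, random_state=0).fit(
        all_descriptors.astype(np.float32))

def bow_histogram(descriptors, vocab):
    """L1-normalized word histogram; two images are compared by histogram distance."""
    words = vocab.predict(descriptors.astype(np.float32))
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)
```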
IMPACT ASSESSMENT OF IMAGE FEATURE EXTRACTORS ON THE PERFORMANCE OF SLAM SYSTEMS
This work evaluates the impact of image feature extractors on the performance of a visual SLAM method in terms of pose accuracy and computational requirements. In particular, S-PTAM (Stereo Parallel Tracking and Mapping) is considered as the visual SLAM framework, for which both the feature detector and the feature descriptor are parametrized. The evaluation was performed on a standard dataset with ground-truth information, using six feature detectors and four descriptors. The presented results indicate that the combination of the GFTT detector and the BRIEF descriptor provides the best trade-off between localization precision and computational requirements among the evaluated combinations.
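The best-performing combination reported above (GFTT keypoints described with BRIEF) can be reproduced in OpenCV as sketched below; BRIEF lives in the opencv-contrib `xfeatures2d` module, and the integration into S-PTAM itself is not shown.

```python
import cv2

gftt = cv2.GFTTDetector_create(maxCorners=1000)            # Good Features To Track detector
brief = cv2.xfeatures2d.BriefDescriptorExtractor_create()  # requires opencv-contrib

img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)
keypoints = gftt.detect(img, None)
keypoints, descriptors = brief.compute(img, keypoints)     # binary descriptors per keypoint
```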
InLoc: Indoor Visual Localization with Dense Matching and View Synthesis
We seek to predict the 6 degree-of-freedom (6DoF) pose of a query photograph with respect to a large indoor 3D map. The contributions of this work are three-fold. First, we develop a new large-scale visual localization method targeted for indoor environments. The method proceeds along three steps: (i) efficient retrieval of candidate poses that ensures scalability to large-scale environments, (ii) pose estimation using dense matching rather than local features to deal with textureless indoor scenes, and (iii) pose verification by virtual view synthesis to cope with significant changes in viewpoint, scene layout, and occluders. Second, we collect a new dataset with reference 6DoF poses for large-scale indoor localization. Query photographs are captured by mobile phones at a different time than the reference 3D map, thus presenting a realistic indoor localization scenario. Third, we demonstrate that our method significantly outperforms current state-of-the-art indoor localization approaches on this new challenging dataset.
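The control flow of the three-step pipeline can be summarized as in the skeleton below; the helper names and bodies are hypothetical placeholders mirroring the abstract, not InLoc's actual implementation.

```python
def retrieve_candidate_poses(query, indoor_map, top_k):
    return indoor_map["candidates"][:top_k]   # (i) coarse, scalable retrieval

def estimate_pose_dense(query, candidate):
    return candidate                          # (ii) dense matching, not sparse local features

def verification_score(query, pose, indoor_map):
    return 0.0                                # (iii) render a virtual view, compare to query

def localize(query, indoor_map):
    candidates = retrieve_candidate_poses(query, indoor_map, top_k=100)
    poses = [estimate_pose_dense(query, c) for c in candidates]
    return max(poses, key=lambda p: verification_score(query, p, indoor_map))
```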
Image features for visual teach-and-repeat navigation in changing environments
We present an evaluation of standard image features in the context of long-term visual teach-and-repeat navigation of mobile robots, where the environment exhibits significant changes in appearance caused by seasonal weather variations and daily illumination changes. We argue that for long-term autonomous navigation, the viewpoint, scale and rotation invariance of the standard feature extractors is less important than their robustness to mid- and long-term changes in environment appearance. Therefore, we focus our evaluation on the robustness of image registration to variable lighting and naturally occurring seasonal changes. We combine detection and description components of different image extractors and evaluate their performance on five datasets collected by mobile vehicles in three different outdoor environments over the course of one year. Moreover, we propose a trainable feature descriptor based on a combination of evolutionary algorithms and Binary Robust Independent Elementary Features, which we call GRIEF (Generated BRIEF). In terms of robustness to seasonal changes, the most promising results were achieved by the SpG/CNN and STAR/GRIEF features, the latter being slightly less robust but faster to calculate.
Image features and seasons revisited
We present an evaluation of standard image features in the context of long-term visual teach-and-repeat mobile robot navigation, where the environment exhibits significant changes in appearance caused by seasonal weather variations and daily illumination changes. We argue that in the given long-term scenario, the viewpoint, scale and rotation invariance of the standard feature extractors is less important than their robustness to mid- and long-term changes in environment appearance. Therefore, we focus our evaluation on the robustness of image registration to variable lighting and naturally occurring seasonal changes. We evaluate the image feature extractors on three datasets collected by mobile robots in two different outdoor environments over the course of one year. Based on this analysis, we propose a novel feature descriptor based on a combination of evolutionary algorithms and Binary Robust Independent Elementary Features, which we call GRIEF (Generated BRIEF). In terms of robustness to seasonal changes, the GRIEF feature descriptor outperforms the other ones while being computationally more efficient.
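The core GRIEF idea described in the two abstracts above, treating the BRIEF pixel-comparison pairs as a genome and evolving them for robustness to appearance change, can be sketched as follows; the patch size, mutation rate and fitness function here are assumptions, not the authors' exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
PATCH, N_TESTS = 31, 256                      # patch side length, descriptor length in bits

def random_pairs():
    """One (x1, y1, x2, y2) pixel-comparison test per descriptor bit."""
    return rng.integers(0, PATCH, size=(N_TESTS, 4))

def describe(patch, pairs):
    """Binary descriptor: intensity comparisons at the evolved test locations."""
    return (patch[pairs[:, 1], pairs[:, 0]] < patch[pairs[:, 3], pairs[:, 2]]).astype(np.uint8)

def mutate(pairs, n_mut=8):
    """Resample a few comparison tests to produce a child comparison set."""
    child = pairs.copy()
    idx = rng.choice(N_TESTS, n_mut, replace=False)
    child[idx] = rng.integers(0, PATCH, size=(n_mut, 4))
    return child

# Evolution loop (not shown): score each comparison set by registration success
# on cross-season training image pairs and keep mutants that beat their parent.
```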
LiDAR-Generated Images Derived Keypoints Assisted Point Cloud Registration Scheme in Odometry Estimation
Keypoint detection and description play a pivotal role in various robotics and autonomous applications, including visual odometry (VO), visual navigation, and Simultaneous Localization and Mapping (SLAM). While a myriad of keypoint detectors and descriptors have been extensively studied in conventional camera images, the effectiveness of these techniques in the context of LiDAR-generated images, i.e. reflectivity and range images, has not been assessed. These images have gained attention due to their resilience in adverse conditions such as rain or fog. Additionally, they contain significant textural information that supplements the geometric information provided by LiDAR point clouds in the point cloud registration phase, especially when relying solely on LiDAR sensors. This addresses the challenge of drift encountered in LiDAR Odometry (LO) in geometrically identical scenarios, or where not all of the raw point cloud is informative and some of it may even be misleading. This paper analyzes the applicability of conventional image keypoint extractors and descriptors on LiDAR-generated images via a comprehensive quantitative investigation. Moreover, we propose a novel approach to enhance the robustness and reliability of LO: after extracting keypoints, we downsample the point cloud accordingly and integrate it into the point cloud registration phase for odometry estimation. Our experiments demonstrate that the proposed approach has comparable accuracy but reduced computational overhead, a higher odometry publishing rate, and even superior performance in drift-prone scenarios compared to using the raw point cloud. This, in turn, lays a foundation for subsequent investigations into the integration of LiDAR-generated images with LO. Our code is available on GitHub: https://github.com/TIERS/ws-lidar-as-camera-odom
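A hedged sketch of the proposed keypoint-assisted downsampling follows: detect keypoints on the LiDAR reflectivity image, keep only the corresponding 3D points, and register the sparsified clouds. ORB and point-to-point ICP via Open3D are assumptions here; the authors' implementation is in the linked repository.

```python
import cv2
import numpy as np
import open3d as o3d

def keypoint_downsample(reflectivity_img, cloud_xyz):
    """cloud_xyz: (H, W, 3) point array organized like the LiDAR image."""
    orb = cv2.ORB_create(nfeatures=2000)       # detector choice is illustrative
    kps = orb.detect(reflectivity_img, None)
    pts = np.asarray([cloud_xyz[int(k.pt[1]), int(k.pt[0])] for k in kps],
                     dtype=np.float64)
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(pts)
    return pcd

def register(source, target, init=np.eye(4), max_dist=1.0):
    # ICP on the keypoint-sparsified clouds: far fewer points than the raw scan,
    # which is where the reduced computational overhead comes from.
    return o3d.pipelines.registration.registration_icp(
        source, target, max_dist, init,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
```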
Outdoor view recognition based on landmark grouping and logistic regression
Vision-based robot localization outdoors has remained more elusive than its indoor counterpart. Drastic illumination changes and the scarcity of suitable landmarks are the main difficulties. This paper attempts to surmount them by deviating from the main trend of using local features. Instead, a global descriptor called the landmark-view is defined, which aggregates the most visually salient landmarks present in each scene. Thus, landmark co-occurrence and the spatial and saliency relationships between landmarks are added to the single-landmark characterization, which is based on saliency and color distribution. A suitable framework to compare landmark-views is developed, and it is shown how this remarkably enhances recognition performance compared with single-landmark recognition. A view-matching model is constructed using logistic regression. Experimentation using 45 views acquired outdoors, containing 273 landmarks, yielded good recognition results: the overall percentage of correct view classification was 80.6%, indicating the adequacy of the approach.
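A minimal sketch of such a view-matching model follows: a logistic regression over similarity features computed between two landmark-views. The specific features (saliency, color-histogram and spatial-layout similarities) and the toy training rows are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# X: one row per view pair, e.g. [saliency_sim, color_hist_sim, layout_sim];
# y: 1 if both landmark-views show the same place, else 0. Toy values only.
X_train = np.array([[0.9, 0.8, 0.7], [0.2, 0.3, 0.1],
                    [0.8, 0.9, 0.6], [0.1, 0.2, 0.3]])
y_train = np.array([1, 0, 1, 0])

model = LogisticRegression().fit(X_train, y_train)
p_same_view = model.predict_proba(np.array([[0.85, 0.75, 0.65]]))[0, 1]
```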
LCrowdV: Generating Labeled Videos for Simulation-based Crowd Behavior Learning
We present a novel procedural framework to generate an arbitrary number of labeled crowd videos (LCrowdV). The resulting crowd video datasets are used to design accurate algorithms or training models for crowded scene understanding. Our overall approach is composed of two components: a procedural simulation framework for generating crowd movements and behaviors, and a procedural rendering framework for generating different videos or images. Each video or image is automatically labeled based on the environment, number of pedestrians, density, behavior, flow, lighting conditions, viewpoint, noise, etc. Furthermore, we can increase the realism by combining synthetically generated behaviors with real-world background videos. We demonstrate the benefits of LCrowdV over prior labeled crowd datasets by improving the accuracy of pedestrian detection and crowd behavior classification algorithms. LCrowdV will be released on the WWW.
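The automatic labeling described above amounts to sampling the generation parameters and storing them as ground truth for each rendered clip; the sketch below uses hypothetical parameter names, not LCrowdV's actual schema.

```python
import random

def sample_crowd_scenario(rng):
    """Draw one parameter set; it drives the simulation and becomes the clip's label."""
    return {
        "n_pedestrians": rng.randint(10, 500),
        "density": rng.choice(["low", "medium", "high"]),
        "behavior": rng.choice(["commuting", "gathering", "evacuating"]),
        "lighting": rng.choice(["day", "dusk", "night"]),
        "viewpoint_deg": rng.uniform(0.0, 90.0),   # camera elevation angle
        "noise": rng.uniform(0.0, 0.1),
    }

labels = [sample_crowd_scenario(random.Random(i)) for i in range(1000)]
```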