Semantic localization in the PCL library
The semantic localization problem in robotics consists in determining the place where a robot is located by means of semantic categories. The problem is usually addressed as a supervised classification process, where the input data correspond to robot perceptions and the classes to semantic categories, such as kitchen or corridor. In this paper we propose a framework, implemented in the PCL library, which provides a set of valuable tools to easily develop and evaluate semantic localization systems. The implementation includes the generation of 3D global descriptors following a Bag-of-Words (BoW) approach, which allows fixed-dimensionality descriptors to be generated from any combination of keypoint detector and feature extractor. The framework has been designed, structured and implemented to be easily extended with different keypoint detectors and feature extractors, as well as classification models. The proposed framework has also been used to evaluate the performance of a set of already implemented descriptors when used as input for a specific semantic localization system. The obtained results are discussed, paying special attention to the internal parameters of the BoW descriptor generation process. Moreover, we also review the combination of some keypoint detectors with different 3D descriptor generation techniques. This work was supported by grant DPI2013-40534-R of the Ministerio de Economía y Competitividad of the Spanish Government, supported with FEDER funds, and by the Consejería de Educación, Cultura y Deportes of the JCCM regional government through project PPII-2014-015-P. Jesús Martínez-Gómez was also funded by the JCCM grant POST2014/8171
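The Bag-of-Words step described in this abstract can be sketched as follows. This is an illustrative Python/NumPy reconstruction of the general technique, not the PCL implementation: local features of arbitrary count are quantized against a learned codebook, yielding one fixed-dimensionality global descriptor per point cloud. The feature dimensionality (33, as in FPFH-like descriptors) and codebook size are example values.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_codebook(training_features, n_words=8, seed=0):
    """Cluster local feature vectors into n_words visual words."""
    return KMeans(n_clusters=n_words, n_init=10, random_state=seed).fit(training_features)

def bow_descriptor(features, codebook):
    """L1-normalized histogram of visual-word assignments: a global
    descriptor whose length is fixed by the codebook, not by the
    number of keypoints detected in the cloud."""
    words = codebook.predict(features)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)
train = rng.normal(size=(200, 33))    # stand-in for local 3D features (e.g. 33-D, FPFH-like)
codebook = build_codebook(train)
desc = bow_descriptor(rng.normal(size=(57, 33)), codebook)  # 57 keypoints in this cloud
```

Any detector/extractor pair that yields per-keypoint feature vectors can feed this pipeline, which is what makes the fixed-length descriptor usable as input to a standard classifier.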
OLT: A Toolkit for Object Labeling Applied to Robotic RGB-D Datasets
In this work we present the Object Labeling Toolkit (OLT), a set of software components publicly available for helping in the management and labeling of sequential RGB-D observations collected by a mobile robot. Such a robot can be equipped with an arbitrary number of RGB-D devices, possibly integrating other sensors (e.g. odometry, 2D laser scanners, etc.). OLT first merges the robot observations to generate a 3D reconstruction of the scene, from which object segmentation and labeling is conveniently accomplished. The annotated labels are automatically propagated by the toolkit to each RGB-D observation in the collected sequence, providing a dense labeling of both intensity and depth images. The resulting object labels can be exploited for many robotic-oriented applications, including high-level decision making, semantic mapping, or contextual object recognition. Software components within OLT are highly customizable and expandable, facilitating the integration of already-developed algorithms. To illustrate the toolkit's suitability, we describe its application to robotic RGB-D sequences taken in a home environment. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. Spanish grant program FPU-MICINN 2010 and the Spanish projects TAROTH: New developments toward a Robot at Home (DPI2011-25483) and PROMOVE: Advances in mobile robotics for promoting independent life of elders (DPI2014-55826-R).
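The label-propagation idea in this abstract, i.e. carrying annotations from the reconstructed 3D scene back to every RGB-D frame, can be sketched as below. This is a minimal NumPy illustration under a pinhole camera model, not OLT's actual code; `project_labels` and all its parameters are hypothetical names introduced here.

```python
import numpy as np

def project_labels(points_world, labels, T_cam_world, K, shape):
    """Project labeled 3D points into one camera frame; return a label
    image (0 = unlabeled), keeping the nearest point per pixel."""
    h, w = shape
    # world -> camera coordinates
    pts = (T_cam_world[:3, :3] @ points_world.T + T_cam_world[:3, 3:4]).T
    label_img = np.zeros((h, w), dtype=np.int32)
    depth = np.full((h, w), np.inf)
    in_front = pts[:, 2] > 0
    for p, lab in zip(pts[in_front], labels[in_front]):
        # pinhole projection to pixel coordinates
        u = int(round(K[0, 0] * p[0] / p[2] + K[0, 2]))
        v = int(round(K[1, 1] * p[1] / p[2] + K[1, 2]))
        # z-buffering: closer points overwrite farther ones
        if 0 <= u < w and 0 <= v < h and p[2] < depth[v, u]:
            depth[v, u] = p[2]
            label_img[v, u] = lab
    return label_img
```

Running this per frame, with each frame's estimated pose, is one way a single 3D annotation can yield dense per-pixel labels across a whole sequence.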
RGB-D datasets using Microsoft Kinect or similar sensors: a survey
RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information about an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, which are of great importance for benchmarking the state-of-the-art. In this paper, we systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset, and compare the popularity and the difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms.
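The color/depth pairing that this survey covers is typically exploited by back-projecting the depth image into a 3D point cloud with the standard pinhole model. Below is an illustrative NumPy sketch; the intrinsics in the test are made-up example values, not those of any particular Kinect model.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (in meters) into an (H*W, 3) array of
    3D points using pinhole intrinsics (fx, fy, cx, cy)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel grids
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)
```

Pairing each 3D point with the RGB value at the same pixel is what gives RGB-D datasets their combined appearance-plus-geometry representation.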
DeepICP: An End-to-End Deep Neural Network for 3D Point Cloud Registration
We present DeepICP - a novel end-to-end learning-based 3D point cloud registration framework that achieves comparable registration accuracy to prior state-of-the-art geometric methods. Different from other keypoint-based methods, where a RANSAC procedure is usually needed, we make use of various deep neural network structures to establish an end-to-end trainable network. Our keypoint detector is trained through this end-to-end structure, enabling the system to avoid the interference of dynamic objects and leverage sufficiently salient features on stationary objects, and, as a result, achieve high robustness. Rather than searching for corresponding points among existing points, our key contribution is that we generate them based on learned matching probabilities among a group of candidates, which can boost the registration accuracy. Our loss function incorporates both local similarity and global geometric constraints to ensure that all of the above network designs converge in the right direction. We comprehensively validate the effectiveness of our approach using both the KITTI dataset and the Apollo-SouthBay dataset. Results demonstrate that our method achieves comparable or better performance than the state-of-the-art geometry-based methods. Detailed ablation and visualization analyses are included to further illustrate the behavior and insights of our network. The low registration error and high robustness of our method make it attractive for the many applications that rely on point cloud registration. Comment: 10 pages, 6 figures, 3 tables; typos corrected, experimental results updated; accepted by ICCV 201
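For context, the classical geometric baseline that DeepICP is compared against is the iterative closest point (ICP) loop: alternate nearest-neighbor correspondence search with a closed-form least-squares rigid alignment. The sketch below is a toy point-to-point ICP in NumPy (Kabsch/SVD alignment, brute-force correspondences, no RANSAC or outlier handling), not the paper's method.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with dst ~= R @ src + t
    (Kabsch algorithm via SVD of the cross-covariance)."""
    cs, cd = src.mean(0), dst.mean(0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:      # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def icp(src, dst, iters=20):
    """Point-to-point ICP: re-estimate correspondences and the rigid
    transform until src is aligned onto dst."""
    cur = src.copy()
    for _ in range(iters):
        # brute-force nearest-neighbor correspondences
        idx = np.argmin(((cur[:, None] - dst[None]) ** 2).sum(-1), axis=1)
        R, t = best_rigid_transform(cur, dst[idx])
        cur = cur @ R.T + t
    return cur
```

The hard correspondence search (`argmin`) is exactly the step DeepICP replaces with points generated from learned matching probabilities over candidate sets.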