239 research outputs found

    Active SLAM: A Review On Last Decade

    This article presents a comprehensive review of the Active Simultaneous Localization and Mapping (A-SLAM) research conducted over the past decade. It explores the formulation, applications, and methodologies employed in A-SLAM, particularly in trajectory generation and control-action selection, drawing on concepts from Information Theory (IT) and the Theory of Optimal Experimental Design (TOED). This review includes both qualitative and quantitative analyses of various approaches, deployment scenarios, configurations, path-planning methods, and utility functions within A-SLAM research. Furthermore, this article introduces a novel analysis of Active Collaborative SLAM (AC-SLAM), focusing on collaborative aspects within SLAM systems. It includes a thorough examination of collaborative parameters and approaches, supported by both qualitative and statistical assessments. This study also identifies limitations in the existing literature and suggests potential avenues for future research. This survey serves as a valuable resource for researchers seeking insights into A-SLAM methods and techniques, offering a current overview of A-SLAM formulation. Comment: 34 pages, 8 figures, 6 tables
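    Utility functions built on TOED typically reduce a pose or map covariance matrix to a single scalar that a planner can compare across candidate actions. The following is a minimal sketch of two classical criteria, A-optimality and D-optimality, assuming a small symmetric positive-definite covariance given as nested lists. The function names are illustrative and are not taken from the review.

```python
def a_optimality(cov):
    """A-optimality: trace(cov) / n, the average variance."""
    n = len(cov)
    return sum(cov[i][i] for i in range(n)) / n


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]  # work on a copy
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0  # singular matrix
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def d_optimality(cov):
    """D-optimality: det(cov) ** (1 / n), the geometric mean of the eigenvalues.

    Assumes cov is positive definite, so the determinant is positive.
    """
    n = len(cov)
    return determinant(cov) ** (1.0 / n)


if __name__ == "__main__":
    # Hypothetical 2x2 covariances predicted for two candidate actions.
    candidates = {
        "action_a": [[4.0, 0.0], [0.0, 1.0]],
        "action_b": [[2.0, 1.0], [1.0, 2.0]],
    }
    for name, cov in candidates.items():
        print(name, a_optimality(cov), d_optimality(cov))
```

    Lower values indicate less expected uncertainty, so a planner under these assumptions would prefer the candidate with the smaller score.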

    Unsupervised Visual Odometry and Action Integration for PointGoal Navigation in Indoor Environment

    PointGoal navigation in indoor environments is a fundamental task for personal robots: navigating to a specified point. Recent studies have solved this task with a near-perfect success rate in photo-realistic simulated environments, under the assumptions of noiseless actuation and, most importantly, perfect localization with GPS and compass sensors. However, an accurate GPS signal is difficult to obtain in real indoor environments. To improve PointGoal navigation accuracy without a GPS signal, we use visual odometry (VO) and propose a novel action integration module (AIM) trained in an unsupervised manner. Specifically, unsupervised VO computes the relative pose of the agent from the re-projection error of two adjacent frames, and then replaces the accurate GPS signal with path integration. The pseudo position estimated by VO is used to train the action integration module, which helps the agent update its internal perception of location and improves the navigation success rate. The training and inference processes use only RGB, depth, collision and self-action information. The experiments show that the proposed system achieves satisfactory results and outperforms partially supervised learning algorithms on the popular Gibson dataset. Comment: 12 pages, 6 figures
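    Path integration, as used above to stand in for GPS and compass readings, amounts to chaining relative pose estimates into a global pose. The sketch below assumes planar motion with poses given as (x, y, heading) and relative motions expressed in the agent's current frame. The function names and the 2D simplification are assumptions for illustration, not the paper's implementation.

```python
import math


def wrap_angle(angle):
    """Wrap an angle to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def integrate_pose(pose, delta):
    """Compose a global pose with a relative motion given in the agent frame."""
    x, y, theta = pose
    dx, dy, dtheta = delta
    x_new = x + math.cos(theta) * dx - math.sin(theta) * dy
    y_new = y + math.sin(theta) * dx + math.cos(theta) * dy
    return (x_new, y_new, wrap_angle(theta + dtheta))


def path_integration(deltas, start=(0.0, 0.0, 0.0)):
    """Accumulate a sequence of relative motions into one global pose."""
    pose = start
    for delta in deltas:
        pose = integrate_pose(pose, delta)
    return pose


def goal_in_agent_frame(pose, goal_xy):
    """Distance and bearing to the goal, the signal GPS plus compass would give."""
    x, y, theta = pose
    gx, gy = goal_xy
    distance = math.hypot(gx - x, gy - y)
    bearing = wrap_angle(math.atan2(gy - y, gx - x) - theta)
    return distance, bearing


if __name__ == "__main__":
    # Hypothetical VO output: move 1 m forward, turn left 90 degrees, move 1 m forward.
    vo_deltas = [(1.0, 0.0, 0.0), (0.0, 0.0, math.pi / 2), (1.0, 0.0, 0.0)]
    pose = path_integration(vo_deltas)
    print(pose, goal_in_agent_frame(pose, (1.0, 2.0)))
```

    Because each step builds on the previous estimate, errors in the relative poses accumulate over the episode, which is why the quality of the VO estimates matters for long trajectories.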

    Beyond sight : an approach for visual semantic navigation of mobile robots in an indoor environment

    Advisor: Eduardo Todt. Dissertation (master's), Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defense: Curitiba, 22/02/2021. Includes references: p. 134-146. Area of concentration: Computer Science.
    Abstract: With the rise of automation, unmanned vehicles have become a hot topic, both as commercial products and as a subject of scientific research. They form a multidisciplinary field of robotics that encompasses embedded systems, control theory, path planning, Simultaneous Localization and Mapping (SLAM), scene reconstruction, and pattern recognition. In this work, we present exploratory research into how sensor data fusion and state-of-the-art machine learning algorithms can perform the Embodied Artificial Intelligence (E-AI) task called Visual Semantic Navigation, also known as Object-Goal Navigation (ObjectNav): autonomous navigation that uses egocentric visual observations to reach an object belonging to the target semantic class without prior knowledge of the environment. To perform experiments, we propose an embodiment named VRIBot. The robot was modeled so that it can be easily simulated, and the experiments are reproducible without the physical robot. Three different pipelines, EXchangeable, AUTOcrat, and BEyond, were proposed and evaluated. Our approach, named BEyond, ranked 5th out of 12 on the val_mini set of the Habitat-Challenge 2020 ObjectNav when compared to other results reported on the competition's leaderboard. Our results show that data fusion combined with machine learning algorithms is a promising approach to the semantic navigation problem. Keywords: Visual-semantic-navigation. Deep-Learning. SLAM. Autonomous-navigation. Semantic-segmentation.

    Information-driven navigation

    In recent years, we have witnessed impressive progress in the accuracy and robustness of Visual Odometry (VO) and Simultaneous Localization and Mapping (SLAM). This boost in performance has enabled the first commercial implementations related to augmented reality (AR), virtual reality (VR) and robotics. In this thesis, we developed new probabilistic methods to further improve the accuracy, robustness and efficiency of VO and SLAM. The contributions of our work are published in three main papers and complemented by the release of SID-SLAM, the software containing all our contributions, and the challenging Minimal Texture dataset. Our first contribution is an information-theoretic approach to point selection for direct and/or feature-based RGB-D VO/SLAM. The aim is to select only the most informative measurements, in order to reduce the size of the optimization problem with minimal impact on accuracy. Our experimental results show that our novel criterion allows us to reduce the number of tracked points to as few as 24, achieving state-of-the-art accuracy while reducing the computational demand by up to 10x. Better uncertainty models for visual measurements will improve the accuracy of multi-view structure and motion and will lead to more realistic uncertainty estimates of the VO/SLAM states. We derived a novel model for multi-view residual covariances based on perspective deformation, which has become a crucial element of our information-driven approach. Visual odometry and SLAM systems are typically divided in the literature into two categories, feature-based and direct methods, depending on the type of residuals that are minimized. We combined our two previous contributions in the formulation and implementation of SID-SLAM, the first full semi-direct RGB-D SLAM system that tightly integrates features and direct methods and uses them interchangeably within a complete information-driven pipeline. Moreover, we recorded Minimal Texture, an RGB-D dataset with conceptually simple but challenging content and accurate ground truth, to facilitate state-of-the-art research on semi-direct SLAM.

    DOT: Dynamic Object Tracking for Visual SLAM

    In this paper we present DOT (Dynamic Object Tracking), a front end that, added to existing SLAM systems, can significantly improve their robustness and accuracy in highly dynamic environments. DOT combines instance segmentation and multi-view geometry to generate masks for dynamic objects, so that SLAM systems based on rigid scene models can exclude those image areas from their optimizations. To determine which objects are actually moving, DOT first segments instances of potentially dynamic objects and then, using the estimated camera motion, tracks those objects by minimizing the photometric reprojection error. This short-term tracking improves the accuracy of the segmentation with respect to other approaches. In the end, masks are generated only for objects that are actually moving. We have evaluated DOT with ORB-SLAM 2 on three public datasets. Our results show that our approach significantly improves the accuracy and robustness of ORB-SLAM 2, especially in highly dynamic scenes.

    Probabilistic Global Scale Estimation for MonoSLAM Based on Generic Object Detection

    This paper proposes a novel method to estimate the global scale of a 3D reconstructed model within a Kalman filtering-based monocular SLAM algorithm. Our Bayesian framework integrates height priors over detected objects belonging to a set of broad predefined classes, building on recent advances in fast generic object detection. Each observation is produced from a single frame, so no data association process across video frames is needed. This is because we associate the height priors with the image region sizes at image locations where map feature projections fall within the object detection regions. We present very promising results obtained in several experiments with different object classes. Comment: Int. Workshop on Visual Odometry, CVPR (July 2017)
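    The Bayesian idea of refining a scale estimate with class height priors can be illustrated with a scalar Kalman update. This is a deliberately simplified sketch, not the paper's measurement model: it assumes the object's height in unscaled map units is already known, whereas the paper relates priors to image region sizes and map feature projections. All names and numbers are hypothetical.

```python
def scale_measurement(prior_height_m, prior_std_m, map_height):
    """Turn a class height prior into a scale observation and its variance.

    map_height is the object's height in unscaled map units (assumed known).
    """
    z = prior_height_m / map_height
    variance = (prior_std_m / map_height) ** 2
    return z, variance


def kalman_scale_update(scale, scale_var, z, z_var):
    """Scalar Kalman update with an identity measurement model."""
    gain = scale_var / (scale_var + z_var)
    new_scale = scale + gain * (z - scale)
    new_var = (1.0 - gain) * scale_var
    return new_scale, new_var


if __name__ == "__main__":
    # Hypothetical start: scale 1.0 metres per map unit, variance 1.0.
    scale, scale_var = 1.0, 1.0
    # Hypothetical detections: (prior mean height, prior std, height in map units).
    detections = [(1.7, 0.1, 0.85), (0.9, 0.2, 0.45)]
    for prior_mean, prior_std, map_height in detections:
        z, z_var = scale_measurement(prior_mean, prior_std, map_height)
        scale, scale_var = kalman_scale_update(scale, scale_var, z, z_var)
        print(scale, scale_var)
```

    Each detection is fused independently of the others, which mirrors the abstract's point that single-frame observations remove the need for data association across frames.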

    Real-time computation of distance to dynamic obstacles with multiple depth sensors

    We present an efficient method to evaluate distances between dynamic obstacles and a number of points of interest (e.g., placed on the links of a robot) when using multiple depth cameras. We introduce a depth-space oriented discretization of the Cartesian space that best represents the workspace monitored by a depth camera, including occluded points. A depth grid map can be initialized offline from the arrangement of the multiple depth cameras, and its particular search characteristics allow the information from the multiple sensors to be fused online in a very simple and fast way. The real-time performance of the proposed approach is shown by means of collision avoidance experiments in which two Kinect sensors monitor a human-robot coexistence task.
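    For orientation, the baseline computation that such a method accelerates can be sketched as follows: back-project obstacle pixels from one depth image into Cartesian space with a pinhole model, then take the minimum distance to a point of interest. This is the plain Cartesian approach for a single camera, not the paper's depth grid map, and the intrinsics and pixel values are hypothetical.

```python
import math


def back_project(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) with depth into the camera frame."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def min_distance(point, obstacle_pixels, fx, fy, cx, cy):
    """Smallest Euclidean distance from a point of interest to any obstacle pixel.

    point is given in the camera frame; obstacle_pixels holds (u, v, depth) tuples.
    """
    best = math.inf
    for u, v, depth in obstacle_pixels:
        obstacle = back_project(u, v, depth, fx, fy, cx, cy)
        best = min(best, math.dist(point, obstacle))
    return best


if __name__ == "__main__":
    # Hypothetical intrinsics for a 640x480 depth camera.
    fx = fy = 500.0
    cx, cy = 320.0, 240.0
    obstacle_pixels = [(320.0, 240.0, 2.0), (320.0, 240.0, 1.5), (820.0, 240.0, 1.0)]
    control_point = (0.0, 0.0, 1.0)  # e.g. a point on a robot link
    print(min_distance(control_point, obstacle_pixels, fx, fy, cx, cy))
```

    The cost of this brute-force loop grows with the number of obstacle pixels and points of interest, which is the motivation for a precomputed, depth-space oriented structure.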