9 research outputs found

    RGBDTAM: A Cost-Effective and Accurate RGB-D Tracking and Mapping System

    Full text link
    Simultaneous Localization and Mapping using RGB-D cameras has been a fertile research topic in the last decade, due to the suitability of such sensors for indoor robotics. In this paper we propose a direct RGB-D SLAM algorithm with state-of-the-art accuracy and robustness at low cost. Our experiments on the TUM RGB-D dataset [34] show better accuracy and robustness, running in real time on a CPU, than direct RGB-D SLAM systems that rely on a GPU. Our approach has two key ingredients. First, the combination of a semi-dense photometric error and a dense geometric error for pose tracking (see Figure 1), which we demonstrate to be the most accurate alternative. Second, a model of the multi-view constraints and their errors in the mapping and tracking threads, which adds information over other approaches. We release an open-source implementation of our approach 1. The reader is referred to a video with our results 2 for a more illustrative visualization of its performance.
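
    As a rough illustration of the combined error described above, the sketch below stacks a photometric residual evaluated on high-gradient pixels with a geometric residual on the reprojected depth. This is not the authors' implementation: for brevity both terms are evaluated on the same semi-dense pixel set (the paper uses a dense geometric term), and the pinhole model, variable names and weights are assumptions.

```python
# Hypothetical sketch of a combined photometric + geometric residual for
# frame-to-keyframe pose tracking (not the RGBDTAM code).
import numpy as np

def combined_residual(pose, kf_gray, kf_depth, cur_gray, cur_depth,
                      high_grad_px, K, w_photo=1.0, w_geom=0.5):
    """pose: 4x4 keyframe-to-current transform; K: 3x3 intrinsics;
    high_grad_px: (u, v) pixels with strong image gradient in the keyframe."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    r_photo, r_geom = [], []
    for (u, v) in high_grad_px:
        z = kf_depth[v, u]
        if z <= 0:
            continue
        p = np.array([(u - cx) * z / fx, (v - cy) * z / fy, z, 1.0])
        q = pose @ p                                   # point in current frame
        u2 = fx * q[0] / q[2] + cx
        v2 = fy * q[1] / q[2] + cy
        if 0 <= int(u2) < cur_gray.shape[1] and 0 <= int(v2) < cur_gray.shape[0]:
            # photometric term: intensity constancy between keyframe and current
            r_photo.append(cur_gray[int(v2), int(u2)] - kf_gray[v, u])
            # geometric term: predicted depth vs. measured depth at reprojection
            r_geom.append(cur_depth[int(v2), int(u2)] - q[2])
    return np.concatenate([w_photo * np.array(r_photo),
                           w_geom * np.array(r_geom)])
```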

    Simultaneous super-resolution, tracking and mapping

    Get PDF
    This paper proposes a new visual SLAM technique that not only integrates 6DOF pose and dense structure but also simultaneously integrates the color information contained in the images over time. This involves developing an inverse model for creating a super-resolution map from many low-resolution images. Contrary to classic super-resolution techniques, this is achieved here by taking into account full 3D translation and rotation within a dense localisation and mapping framework. This not only accounts for the full range of image deformations but also enables a novel criterion for combining the low-resolution images, based on the difference in resolution between images in 6D space. Several results are given showing that this technique runs in real time (30 Hz) and is able to map large-scale environments in high resolution whilst simultaneously improving the accuracy and robustness of the tracking.
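
    A minimal sketch of the kind of resolution-aware fusion the abstract describes, under assumptions of our own (single-channel map, scalar per-frame footprint ratio, hypothetical function names); the paper's actual inverse model and combination criterion are not reproduced here.

```python
# Illustrative only: accumulate low-resolution observations into a
# higher-resolution keyframe map, weighting each frame by how finely it
# sampled the surface relative to a map pixel.
import numpy as np

def fuse_observation(sr_map, sr_weight, warped_obs, valid_mask, scale_ratio):
    """sr_map, sr_weight: running super-resolved intensity map and its weights
    (single channel for brevity; apply per channel for color).
    warped_obs: low-res frame warped/upsampled into the map's frame.
    scale_ratio: footprint of one observation pixel relative to one map pixel
    (a scalar per frame here; > 1 means the camera viewed the surface closer)."""
    w = float(np.clip(scale_ratio, 0.0, 4.0))          # favour closer, sharper views
    sr_map[valid_mask] = (sr_weight[valid_mask] * sr_map[valid_mask]
                          + w * warped_obs[valid_mask]) / (sr_weight[valid_mask] + w)
    sr_weight[valid_mask] += w
    return sr_map, sr_weight
```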

    Understanding Everyday Hands in Action from RGB-D Images

    Get PDF
    We analyze functional manipulations of handheld objects, formalizing the problem as one of fine-grained grasp classification. To do so, we make use of a recently developed fine-grained taxonomy of human-object grasps. We introduce a large dataset of 12,000 RGB-D images covering 71 everyday grasps in natural interactions. Our dataset differs from past work (typically addressed from a robotics perspective) in its scale, diversity, and combination of RGB and depth data. From a computer-vision perspective, our dataset allows for exploration of contact and force prediction (crucial concepts in functional grasp analysis) from perceptual cues. We present extensive experimental results with state-of-the-art baselines, illustrating the role of segmentation, object context, and 3D understanding in functional grasp analysis. We demonstrate a near 2X improvement over prior work and a naive deep baseline, while pointing out important directions for improvement.

    RGB-D Odometry and SLAM

    Full text link
    The emergence of modern RGB-D sensors has had a significant impact on many application fields, including robotics, augmented reality (AR) and 3D scanning. They are low-cost, low-power and compact alternatives to traditional range sensors such as LiDAR. Moreover, unlike RGB cameras, RGB-D sensors provide the additional depth information that removes the need for frame-by-frame triangulation for 3D scene reconstruction. These merits have made them very popular in mobile robotics and AR, where it is of great interest to estimate ego-motion and 3D scene structure. Such spatial understanding can enable robots to navigate autonomously without collisions and allow users to insert virtual entities consistent with the image stream. In this chapter, we review common formulations of odometry and Simultaneous Localization and Mapping (known by its acronym SLAM) using RGB-D stream input. The two topics are closely related, as the former aims to track the incremental camera motion with respect to a local map of the scene, and the latter to jointly estimate the camera trajectory and the global map with consistency. In both cases, the standard approaches minimize a cost function using nonlinear optimization techniques. This chapter consists of three main parts. In the first part, we introduce the basic concepts of odometry and SLAM and motivate the use of RGB-D sensors. We also give mathematical preliminaries relevant to most odometry and SLAM algorithms. In the second part, we detail the three main components of SLAM systems: camera pose tracking, scene mapping and loop closing. For each component, we describe different approaches proposed in the literature. In the final part, we provide a brief discussion of advanced research topics with references to the state of the art. (Comment: this is the pre-submission version of the manuscript that was later edited and published as a chapter in RGB-D Image Analysis and Processing.)
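
    As a reminder of the nonlinear optimization the chapter refers to, here is a minimal, generic Gauss-Newton sketch; the residual and Jacobian are placeholders, and real RGB-D tracking parameterizes the pose on SE(3) and updates it with a retraction rather than plain subtraction.

```python
# Generic damped Gauss-Newton loop for minimizing 0.5 * ||r(x)||^2.
import numpy as np

def gauss_newton(x0, residual_fn, jacobian_fn, iters=10, damping=1e-6):
    """x0: initial parameters (e.g. a 6-vector pose twist).
    residual_fn(x): stacked residual vector; jacobian_fn(x): its Jacobian."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r = residual_fn(x)
        J = jacobian_fn(x)
        H = J.T @ J + damping * np.eye(x.size)   # damped normal equations
        g = J.T @ r
        dx = np.linalg.solve(H, g)
        x = x - dx                               # for SE(3), use a retraction
        if np.linalg.norm(dx) < 1e-8:
            break
    return x
```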

    Egocentric real-time workspace monitoring using an RGB-D camera

    Get PDF
    We describe an integrated system for personal workspace monitoring based around an RGB-D sensor. The approach is egocentric, facilitating full flexibility, and operates in real time, providing object detection and recognition, and 3D trajectory estimation, whilst the user undertakes tasks in the workspace. A prototype on-body system developed in the context of workflow analysis for industrial manipulation and assembly tasks is described. The system is evaluated on two tasks with multiple users, and the results indicate that the method is effective, giving good accuracy.

    Recognition of egocentric actions from a helmet-mounted RGB-D camera

    Get PDF
    In this project, an action recognition system has been implemented using a camera that the user wears attached to a helmet. The aim is to analyze the recognition capability of an RGB-D camera carried by the user (that is, a "wearable" camera). The longer-term goal is to recognize the actions of an individual from a first-person view, for later analysis in different applications, ranging from instruction-guidance systems for carrying out a complicated task, to assistance for the visually impaired, to monitoring a person's activity for health or rehabilitation reasons. This project falls within the research lines of the Robótica, Percepción y Tiempo Real group at the Universidad de Zaragoza. The tasks carried out in this project were the following: as part of the problem analysis, a study was made of RGB-D images and the information they provide. Different segmentation options for these images were also studied, to improve the extraction of information, together with possible descriptors to represent and compress that information. Finally, two of the classifiers most widely used in computer vision for recognition tasks were studied. In addition, a reference labelling of several image sequences was produced for use in the experiments. To analyze the performance of the recognizer, a set of comparative experiments across the different description and classification options was designed and carried out, and their results were analyzed and documented. In these experiments, we tested classification at different levels using the descriptors studied and found that classification at the basic level of manipulation versus non-manipulation works very well, but that the descriptors studied do not provide enough information for a finer-grained classification. We also analyzed in detail how much the different descriptors, and possible combinations of them, contribute.
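
    The abstract does not name the descriptors or classifiers studied; purely as an illustration of the per-frame manipulation vs. non-manipulation classification it describes, here is a sketch using random stand-in descriptors and a linear SVM (scikit-learn assumed).

```python
# Illustrative only: binary classification of per-frame descriptors.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))      # stand-in for per-frame RGB-D descriptors
y = rng.integers(0, 2, size=200)    # 0 = no manipulation, 1 = manipulation

clf = LinearSVC(C=1.0, max_iter=5000)
scores = cross_val_score(clf, X, y, cv=5)
print("cross-validated accuracy: %.2f +/- %.2f" % (scores.mean(), scores.std()))
```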

    Computer Vision Algorithms for Mobile Camera Applications

    Get PDF
    Wearable and mobile sensors have found widespread use in recent years due to their ever-decreasing cost, ease of deployment and use, and ability to provide continuous monitoring, as opposed to sensors installed at fixed locations. Since many smartphones are now equipped with a variety of sensors, including accelerometer, gyroscope, magnetometer, microphone and camera, it has become more feasible to develop algorithms for activity monitoring, guidance and navigation of unmanned vehicles, autonomous driving and driver assistance, by using data from one or more of these sensors. In this thesis, we focus on multiple mobile camera applications and present lightweight algorithms suitable for embedded mobile platforms. The mobile camera scenarios presented in the thesis are: (i) activity detection and step counting from wearable cameras, (ii) door detection for indoor navigation of unmanned vehicles, and (iii) traffic sign detection from vehicle-mounted cameras. First, we present a fall detection and activity classification system developed for the embedded smart camera platform CITRIC. In our system, the camera platform is worn by the subject, as opposed to static sensors installed at fixed locations in certain rooms; therefore, monitoring is not limited to confined areas and extends to wherever the subject may travel, indoors and outdoors. Next, we present a real-time smartphone-based fall detection system, wherein we implement camera- and accelerometer-based fall detection on a Samsung Galaxy S™ 4. We fuse these two sensor modalities to obtain a more robust fall detection system. Then, we introduce a fall detection algorithm with autonomous thresholding using relative entropy within the class of Ali-Silvey distance measures. As another wearable camera application, we present a footstep counting algorithm using a smartphone camera. This algorithm provides a more accurate step count than using only accelerometer data in smartphones and smartwatches at various body locations. As a second mobile camera scenario, we study autonomous indoor navigation of unmanned vehicles. A novel approach is proposed to autonomously detect and verify doorway openings by using the Google Project Tango™ platform. The third mobile camera scenario involves vehicle-mounted cameras. More specifically, we focus on traffic sign detection from lower-resolution and noisy videos captured by vehicle-mounted cameras. We present a new method for accurate traffic sign detection, incorporating Aggregate Channel Features and Chain Code Histograms, with the goal of providing much faster training and testing, and comparable or better performance with respect to deep neural network approaches, without requiring specialized processors. The proposed computer vision algorithms provide promising results for various useful applications despite the limited energy and processing capabilities of mobile devices.
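
    As a hedged sketch of one of the features mentioned above, the code below computes a generic 8-direction Freeman Chain Code Histogram over an ordered contour; it is not the thesis implementation, and the contour extraction and the combination with Aggregate Channel Features are omitted.

```python
# Generic Chain Code Histogram over an ordered shape boundary.
import numpy as np

# Freeman directions: E, NE, N, NW, W, SW, S, SE (image coordinates, y down)
DIRS = {(1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
        (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7}

def chain_code_histogram(contour):
    """contour: ordered list of (x, y) boundary points of a shape.
    Returns a normalized 8-bin histogram of step directions."""
    hist = np.zeros(8)
    for (x0, y0), (x1, y1) in zip(contour, contour[1:]):
        step = (int(np.sign(x1 - x0)), int(np.sign(y1 - y0)))
        if step in DIRS:
            hist[DIRS[step]] += 1
    total = hist.sum()
    return hist / total if total > 0 else hist

# Example: a small square traced clockwise yields equal mass in 4 directions
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1), (0, 0)]
print(chain_code_histogram(square))
```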

    AN INTEGRATED AUGMENTED REALITY METHOD TO ASSEMBLY SIMULATION AND GUIDANCE

    Get PDF
    Ph.D. thesis (Doctor of Philosophy)