    Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age

    Simultaneous Localization and Mapping (SLAM) consists in the concurrent construction of a model of the environment (the map) and the estimation of the state of the robot moving within it. The SLAM community has made astonishing progress over the last 30 years, enabling large-scale real-world applications and witnessing a steady transition of this technology to industry. We survey the current state of SLAM. We start by presenting what is now the de-facto standard formulation for SLAM. We then review related work, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers. This paper simultaneously serves as a position paper and a tutorial for users of SLAM. By looking at the published research with a critical eye, we delineate open challenges and new research issues that still deserve careful scientific investigation. The paper also contains the authors' take on two questions that often animate discussions during robotics conferences: Do robots need SLAM? and Is SLAM solved?

    Data-Driven Shape Analysis and Processing

    Data-driven methods play an increasingly important role in discovering geometric, structural, and semantic relationships between 3D shapes in collections, and applying this analysis to support intelligent modeling, editing, and visualization of geometric data. In contrast to traditional approaches, a key feature of data-driven approaches is that they aggregate information from a collection of shapes to improve the analysis and processing of individual shapes. In addition, they are able to learn models that reason about properties and relationships of shapes without relying on hard-coded rules or explicitly programmed instructions. We provide an overview of the main concepts and components of these techniques, and discuss their application to shape classification, segmentation, matching, reconstruction, modeling and exploration, as well as scene analysis and synthesis, through reviewing the literature and relating the existing works with both qualitative and numerical comparisons. We conclude our report with ideas that can inspire future research in data-driven shape analysis and processing. Comment: 10 pages, 19 figures

    A factorization approach to inertial affine structure from motion

    We consider the problem of reconstructing a 3-D scene from a moving camera with a high frame rate using the affine projection model. This problem is traditionally known as Affine Structure from Motion (Affine SfM), and can be solved using an elegant low-rank factorization formulation. In this paper, we assume that an accelerometer and a gyro are rigidly mounted with the camera, so that synchronized linear acceleration and angular velocity measurements are available together with the image measurements. We extend the standard Affine SfM algorithm to integrate these measurements through the use of image derivatives.
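
    The low-rank factorization mentioned in this abstract can be illustrated with a minimal sketch of the standard (vision-only) affine factorization: point tracks are stacked into a 2F x P measurement matrix, centered, and truncated to rank 3 with an SVD. This is not the paper's inertial extension. The function name `affine_sfm_factorize`, the matrix layout, and the synthetic data are assumptions made for illustration, and the result is only defined up to an invertible 3x3 affine ambiguity.

```python
import numpy as np


def affine_sfm_factorize(W):
    """Rank-3 factorization of a (2F, P) measurement matrix (hypothetical helper).

    W stacks the image x/y coordinates of P tracked points over F frames.
    Returns M (2F, 3), S (3, P) and t (2F, 1) such that W is approximately
    M @ S + t, up to an affine ambiguity in M and S.
    """
    # Under an affine camera, the per-row mean absorbs the translation.
    t = W.mean(axis=1, keepdims=True)
    W_centered = W - t

    # The centered matrix has rank at most 3 in the noise-free case.
    U, s, Vt = np.linalg.svd(W_centered, full_matrices=False)
    sqrt_s = np.sqrt(s[:3])
    M = U[:, :3] * sqrt_s          # motion (camera) factor
    S = sqrt_s[:, None] * Vt[:3]   # structure (3-D shape) factor
    return M, S, t


# Synthetic, noise-free example: 5 frames, 20 points (assumed sizes).
rng = np.random.default_rng(0)
S_true = rng.standard_normal((3, 20))
M_true = rng.standard_normal((2 * 5, 3))
t_true = rng.standard_normal((2 * 5, 1))
W = M_true @ S_true + t_true

M, S, t = affine_sfm_factorize(W)
residual = np.linalg.norm(W - (M @ S + t))
```

    In this noise-free setting the residual should be at the level of floating-point round-off; with real tracks the truncated SVD gives the best rank-3 fit in the least-squares sense.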

    Grasping unknown objects in clutter by superquadric representation

    In this paper, a quick and efficient method is presented for grasping unknown objects in clutter. The grasping method relies on real-time superquadric (SQ) representation of partial-view objects and incomplete object modelling, which is well suited to unknown symmetric objects in cluttered scenarios, followed by optimized antipodal grasping. The incomplete object models are processed through a mirroring algorithm that assumes symmetry, first to create an approximate complete model and then to fit an SQ representation. The grasping algorithm is designed for maximum force balance and stability, taking advantage of the quick retrieval of dimension and surface curvature information from the SQ parameters. The pose of the SQs with respect to the direction of gravity is calculated and used, together with the parameters of the SQs and the specification of the gripper, to select the best direction of approach and contact points. The SQ fitting method has been tested on custom datasets containing objects in isolation as well as in clutter. The grasping algorithm is evaluated on a PR2 robot and real-time results are presented. Initial results indicate that, although the method is based on simplistic shape information, it outperforms other learning-based grasping algorithms that also work in clutter in terms of time-efficiency and accuracy. Peer reviewed. Postprint (author's final draft).
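
    To make the role of the SQ parameters concrete, here is a minimal sketch of the standard superquadric inside-outside function, which is the usual way a superquadric is written in terms of three scale parameters and two shape exponents. It is not the authors' fitting or grasping code. The function name `superquadric_inside_outside` and the argument layout are assumptions, and the query point is assumed to be expressed in the superquadric's own local frame.

```python
def superquadric_inside_outside(point, scales, exponents):
    """Evaluate the superquadric inside-outside function (hypothetical helper).

    point:     (x, y, z) in the superquadric's local frame (assumed).
    scales:    (a1, a2, a3), the semi-axis lengths along x, y, z.
    exponents: (e1, e2), the shape exponents; e1 = e2 = 1 gives an ellipsoid.

    The value is below 1 inside the surface, equal to 1 on it, and above 1 outside.
    """
    x, y, z = point
    a1, a2, a3 = scales
    e1, e2 = exponents
    xy_term = (abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)) ** (e2 / e1)
    z_term = abs(z / a3) ** (2.0 / e1)
    return xy_term + z_term


# Ellipsoid with semi-axes 1, 2, 3 (illustrative values).
scales = (1.0, 2.0, 3.0)
exponents = (1.0, 1.0)
f_center = superquadric_inside_outside((0.0, 0.0, 0.0), scales, exponents)
f_surface = superquadric_inside_outside((0.0, 2.0, 0.0), scales, exponents)
f_outside = superquadric_inside_outside((2.0, 0.0, 0.0), scales, exponents)
```

    The scales give the object dimensions directly, and the exponents control how rounded or box-like the shape is, which is why dimension and curvature information can be read off the parameters quickly.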

    Deformable and articulated 3D reconstruction from monocular video sequences

    PhD thesis. This thesis addresses the problem of deformable and articulated structure from motion from monocular uncalibrated video sequences. Structure from motion is defined as the problem of recovering information about the 3D structure of scenes imaged by a camera in a video sequence. Our study aims at the challenging problem of non-rigid shapes (e.g. a beating heart or a smiling face). Non-rigid structures appear constantly in our everyday life: think of a bicep curling, a torso twisting or a smiling face. Our research seeks a general method to perform 3D shape recovery purely from data, without having to rely on a pre-computed model or training data. Open problems in the field are the difficulty of the non-linear estimation, the lack of a real-time system, large amounts of missing data in real-world video sequences, measurement noise and strong deformations. Solving these problems would take us far beyond the current state of the art in non-rigid structure from motion. This dissertation presents our contributions in the field of non-rigid structure from motion, detailing a novel algorithm that enforces the exact metric structure of the problem at each step of the minimisation by projecting the motion matrices onto the correct deformable or articulated metric motion manifolds, respectively. An important advantage of this new algorithm is its ability to handle missing data, which becomes crucial when dealing with real video sequences. We present a generic bilinear estimation framework, which improves convergence and makes use of the manifold constraints. Finally, we demonstrate a sequential, frame-by-frame estimation algorithm, which provides a 3D model and camera parameters for each video frame, while simultaneously building a model of object deformation.

    RGB-D datasets using Microsoft Kinect or similar sensors: a survey

    RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information about an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high-quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image and video datasets dedicated to various applications have become available, and these are of great importance for benchmarking the state of the art. In this paper, we systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset, and compare the popularity and the difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms.

    Monocular SLAM for deformable scenarios

    The problem of localizing a sensor in an uncertain map that is estimated at the same time is known as Simultaneous Localization and Mapping (SLAM). It is a challenging problem comparable to the chicken-and-egg paradigm: to localize the sensor we need to know the map, but to build the map we need the position of the sensor. When a visual sensor such as a camera is used, the problem is called Visual SLAM or VSLAM. Visual sensors for SLAM are divided into those that provide depth information (e.g. RGB-D cameras or stereo rigs) and those that do not (e.g. monocular cameras or event cameras). In this thesis we focus our research on SLAM with monocular cameras. Owing to the lack of depth perception, monocular SLAM is intrinsically harder than SLAM with depth sensors. State-of-the-art work in monocular VSLAM has normally assumed that the scene remains rigid throughout the sequence, which is a feasible assumption for industrial and urban environments. The rigidity assumption provides sufficient constraints for the problem and allows a reliable map to be reconstructed after processing several images. In recent years, interest in SLAM has reached medical areas, where SLAM algorithms could help guide the surgeon or localize the position of a robot. However, unlike industrial or urban scenarios, in sequences inside the body everything can eventually deform, so the rigidity assumption becomes invalid in practice and, by extension, so do monocular SLAM algorithms. Our goal is therefore to push the limits of SLAM algorithms and conceive the first monocular SLAM system able to cope with scene deformation. Current SLAM systems compute the camera position and the map structure in two concurrent threads: localization and mapping. Localization processes each image to locate the sensor continuously, whereas mapping builds the map of the scene. We adopt this structure and conceive both deformable localization and deformable mapping, which are now able to recover the scene even under deformation. Our first contribution is deformable localization. Deformable localization uses the map structure to recover the camera pose from a single image. Simultaneously, as the map deforms during the sequence, it also recovers the map deformation for each frame. We propose two families of deformable localization. In the first deformable localization algorithm, we assume that all points are embedded in a surface called the template. We can recover the deformation of the surface thanks to a global deformation model that allows the most likely deformation of the object to be estimated. With our second deformable localization algorithm, we show that it is possible to recover the map deformation without a global deformation model, by representing the map as individual surfels. Our experimental results showed that, by recovering the map deformation, both methods outperform rigid methods in both robustness and accuracy. Our second contribution is the conception of deformable mapping. It is the back-end of the SLAM algorithm: it processes a batch of images to recover the map structure for all the images and to grow the map by assembling partial observations of it. Deformable localization and deformable mapping run in parallel and together make up the first deformable monocular SLAM system: DefSLAM. An extended evaluation of our method showed, on both laboratory-controlled sequences and medical sequences, that it successfully processes sequences in which current monocular SLAM systems fail. Our third contribution is two methods to exploit photometric information in deformable monocular SLAM. On the one hand, SD-DefSLAM leverages semi-direct matching to obtain much more reliable matching of map points in new images; as a consequence, it was shown to be more robust and stable on medical sequences. On the other hand, we propose a Direct and Sparse Deformable Localization method in which we use a direct photometric error to track the deformation of a map modeled as a set of disconnected 3D surfels. We can recover the deformation of multiple disconnected surfaces, non-isometric deformations, or surfaces with a changing topology.