36 research outputs found

    Semantic 3D Occupancy Mapping through Efficient High Order CRFs

    Full text link
    Semantic 3D mapping can be used for many applications, such as robot navigation and virtual interaction. In recent years, there has been great progress in semantic segmentation and geometric 3D mapping. However, it is still challenging to combine these two tasks for accurate and large-scale semantic mapping from images. In this paper, we propose an incremental and (near) real-time semantic mapping system. A 3D scrolling occupancy grid map is built to represent the world; it is memory- and computation-efficient and remains bounded for large-scale environments. We utilize the CNN segmentation as a prior prediction and further optimize the 3D grid labels through a novel CRF model. Superpixels are utilized to enforce smoothness and form robust P^N high-order potentials. An efficient mean-field inference is developed for the graph optimization. We evaluate our system on the KITTI dataset and improve the segmentation accuracy by 10% over existing systems. Comment: IROS 2017
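
    The optimisation step can be pictured with a toy routine. Below is a minimal sketch of plain pairwise mean-field inference over grid cells, assuming a CNN unary term and a Potts-style compatibility matrix; the paper's robust P^N superpixel potentials and scrolling grid are not modelled, and all names and parameters here are illustrative, not the authors' implementation.

```python
import numpy as np

def mean_field_update(unary, neighbors, compat, n_iters=5):
    """Naive mean-field inference for a pairwise CRF on a grid.

    unary:     (N, L) negative log-probabilities from a CNN prior
    neighbors: list where neighbors[i] is an index array of the
               grid cells adjacent to cell i
    compat:    (L, L) label-compatibility (e.g. Potts) matrix
    """
    q = np.exp(-unary)
    q /= q.sum(axis=1, keepdims=True)              # initialise from the CNN prior
    for _ in range(n_iters):
        msg = np.zeros_like(q)
        for i, nbrs in enumerate(neighbors):
            msg[i] = q[nbrs].sum(axis=0) @ compat  # pairwise message passing
        q = np.exp(-unary - msg)
        q /= q.sum(axis=1, keepdims=True)          # renormalise per cell
    return q.argmax(axis=1)                        # MAP label per grid cell
```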

    Enhancing 3D Autonomous Navigation Through Obstacle Fields: Homogeneous Localisation and Mapping, with Obstacle-Aware Trajectory Optimisation

    Get PDF
    Small flying robots have numerous potential applications, from quadrotors for search and rescue, infrastructure inspection and package delivery, to free-flying satellites for assistance activities inside a space station. To enable these applications, a key challenge is autonomous navigation in 3D, near obstacles, on a power-, mass- and computation-constrained platform. This challenge requires a robot to perform localisation, mapping, dynamics-aware trajectory planning and control. The current state of the art uses separate algorithms for each component; here, the aim is a more homogeneous approach in the search for improved efficiencies and capabilities. First, an algorithm is described to perform Simultaneous Localisation And Mapping (SLAM) with a physical 3D map representation that can also be used to represent obstacles for trajectory planning: Non-Uniform Rational B-Spline (NURBS) surfaces. Termed NURBSLAM, this algorithm is shown to combine the typically separate tasks of localisation and obstacle mapping. Second, a trajectory optimisation algorithm is presented that produces dynamically optimal trajectories with direct consideration of obstacles, providing a middle ground between path planners and trajectory smoothers. Called the Admissible Subspace TRajectory Optimiser (ASTRO), the algorithm can produce trajectories that are easier to track than the state of the art for flight near obstacles, as shown in flight tests with quadrotors. For quadrotors to track trajectories, a critical component is the differential flatness transformation that links the position and attitude controllers. Existing singularities in this transformation are analysed, solutions are proposed, and these are then demonstrated in flight tests. Finally, NURBSLAM and ASTRO are brought together as a combined system and tested against the state of the art in a novel simulation environment, proving the concept that a single 3D representation can be used for localisation, mapping and planning.
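
    The flatness transformation mentioned above can be sketched compactly. The snippet below is the standard textbook mapping from a desired acceleration and yaw to a quadrotor thrust magnitude and attitude, assuming a z-up world frame, with guards at the two classical singular configurations; it is a generic illustration, not the thesis's proposed singularity handling.

```python
import numpy as np

GRAVITY = np.array([0.0, 0.0, 9.81])  # z-up world frame assumed

def flat_outputs_to_attitude(accel, yaw):
    """Differential-flatness map: desired acceleration + yaw ->
    (thrust magnitude, desired body-to-world rotation matrix)."""
    thrust_vec = accel + GRAVITY            # required specific thrust, world frame
    thrust = np.linalg.norm(thrust_vec)
    if thrust < 1e-6:                       # free fall: body z-axis undefined
        raise ValueError("free-fall singularity: thrust direction undefined")
    z_b = thrust_vec / thrust               # body z-axis aligns with thrust
    x_c = np.array([np.cos(yaw), np.sin(yaw), 0.0])  # yaw heading reference
    y_b = np.cross(z_b, x_c)
    if np.linalg.norm(y_b) < 1e-6:          # thrust parallel to heading reference
        raise ValueError("singularity: yaw reference degenerate")
    y_b /= np.linalg.norm(y_b)
    x_b = np.cross(y_b, z_b)
    return thrust, np.column_stack([x_b, y_b, z_b])
```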

    Semantic Mapping of Road Scenes

    Get PDF
    The problem of understanding road scenes has been at the forefront of the computer vision community for the last couple of years. It enables autonomous systems to navigate and understand the surroundings in which they operate. It involves reconstructing the scene and estimating the objects present in it, such as ‘vehicles’, ‘road’, ‘pavements’ and ‘buildings’. This thesis focusses on these aspects and proposes solutions to address them. First, we propose a solution to generate a dense semantic map from multiple street-level images. This map can be imagined as the bird’s-eye view of the region, with associated semantic labels, for tens of kilometres of street-level data. We generate the overhead semantic view from street-level images, in contrast to existing approaches that use satellite/overhead imagery for classification of urban regions, allowing us to produce a detailed semantic map for a large-scale urban area. Then we describe a method to perform large-scale dense 3D reconstruction of road scenes with associated semantic labels. Our method fuses depth maps, generated from stereo pairs across time, into a global 3D volume in an online fashion, in order to accommodate arbitrarily long image sequences. The object class labels estimated from the street-level stereo image sequence are used to annotate the reconstructed volume. We then exploit the scene structure in object class labelling by performing inference over a meshed representation of the scene. Labelling over the mesh solves two issues. Firstly, images often contain redundant information, with multiple images describing the same scene; solving these images separately is slow, whereas our method is approximately an order of magnitude faster in the inference stage than inference in the image domain. Secondly, multiple images of the same scene often result in inconsistent labelling; by solving a single mesh, we remove this inconsistency across images. Our mesh-based labelling also takes into account the object layout in the scene, which is often ambiguous in the image domain, thereby increasing the accuracy of object labelling. Finally, we perform labelling and structure computation through a hierarchical robust P^N Markov Random Field defined on voxels and super-voxels given by an octree. This allows us to infer the 3D structure and the object-class labels in a principled manner, through bounded approximate minimisation of a well-defined and well-studied energy functional. In this thesis, we also introduce two object-labelled datasets created from real-world data. The 15-kilometre Yotta Labelled dataset consists of 8,000 images per camera view of the roadways of the United Kingdom, with a subset annotated with object class labels; the second dataset comprises ground-truth object labels for the publicly available KITTI dataset. Both datasets are publicly available, and we hope they will be helpful to the vision research community.
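
    To see why labelling a single mesh removes cross-view inconsistency, consider a minimal evidence-pooling sketch: each mesh face accumulates the class scores of every pixel that observes it, so all views vote on one label. This is only a baseline illustration under the assumption of precomputed face-to-pixel visibility; the thesis performs proper MRF inference over the mesh, and all names below are hypothetical.

```python
import numpy as np

def fuse_labels_on_mesh(face_visibility, image_probs, n_classes):
    """Pool per-image class evidence onto mesh faces.

    face_visibility: list of (image_id, pixel_index) observations per
                     face, obtained by projecting the mesh into each view
    image_probs:     dict image_id -> (n_pixels, n_classes) softmax scores
    """
    votes = np.zeros((len(face_visibility), n_classes))
    for f, observations in enumerate(face_visibility):
        for img_id, pix in observations:
            votes[f] += image_probs[img_id][pix]  # sum evidence across views
    return votes.argmax(axis=1)                   # one consistent label per face
```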

    Visual Perception For Robotic Spatial Understanding

    Get PDF
    Humans understand the world through vision without much effort. We perceive the structure, objects, and people in the environment and pay little direct attention to most of it, until it becomes useful. Intelligent systems, especially mobile robots, have no such biologically engineered vision mechanism to take for granted. Instead, we must devise algorithmic methods for taking raw sensor data and converting it to something useful very quickly. Vision is such a necessary part of building a robot, or any intelligent system meant to interact with the world, that it is somewhat surprising we don't have off-the-shelf libraries for this capability. Why is this? The simple answer is that the problem is extremely difficult. There has been progress, but the current state of the art is impressive and depressing at the same time. We now have neural networks that can recognize many objects in 2D images, in some cases performing better than a human. Some algorithms can also provide bounding boxes or pixel-level masks to localize objects. We have visual odometry and mapping algorithms that can build reasonably detailed maps over long distances, given the right hardware and conditions. On the other hand, we have robots with many sensors and no efficient way to compute their relative extrinsic poses for integrating the data in a single frame. The same networks that produce good object segmentations and labels on a controlled benchmark still miss obvious objects in the real world and have no mechanism for learning on the fly while the robot is exploring. Finally, while we can detect pose for very specific objects, we don't yet have a mechanism that detects pose in a way that generalizes well over categories or that can describe new objects efficiently. We contribute algorithms in four of the areas mentioned above. First, we describe a practical and effective system for calibrating many sensors on a robot with up to three different modalities. Second, we present our approach to visual odometry and mapping, which exploits the unique capabilities of RGB-D sensors to efficiently build detailed representations of an environment. Third, we describe a 3D over-segmentation technique that utilizes the models and ego-motion output of the previous step to generate temporally consistent segmentations under camera motion. Finally, we develop a synthesized dataset of chair objects with part labels and investigate the influence of parts on RGB-D based object pose recognition using a novel network architecture we call PartNet.
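
    As a concrete flavour of the RGB-D processing underlying the odometry and mapping step, the sketch below back-projects a depth image into a camera-frame point cloud using a standard pinhole model. This is generic boilerplate under assumed intrinsics and array layout, not the dissertation's actual pipeline.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image into a 3D point cloud in the camera
    frame, the usual first step of RGB-D odometry and mapping.

    depth: (H, W) metric depth in metres (0 where invalid)
    fx, fy, cx, cy: pinhole camera intrinsics
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx             # invert pinhole model: u = fx*x/z + cx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]         # drop invalid (zero-depth) pixels
```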

    Long-term localization of unmanned aerial vehicles based on 3D environment perception

    Get PDF
    Unmanned Aerial Vehicles (UAVs) are currently used in countless civil and commercial applications, and the trend is rising. Outdoor obstacle-free operation based on the Global Positioning System (GPS) can generally be assumed thanks to the availability of mature commercial products. However, some applications require their use in confined spaces or indoors, where GPS signals are not available. In order to allow the safe introduction of autonomous aerial robots in GPS-denied areas, there is still a need for improved reliability in several key technologies, such as localization, obstacle avoidance and planning, to ensure robust operation.
Existing approaches for autonomous navigation in GPS-denied areas are not robust enough when it comes to aerial robots, or fail in long-term operation. This dissertation handles the localization problem, proposing a methodology suitable for aerial robots moving in a three-dimensional (3D) environment using a combination of measurements from a variety of on-board sensors. We have focused on fusing three types of sensor data: images and 3D point clouds acquired from stereo or structured-light cameras, inertial information from an on-board Inertial Measurement Unit (IMU), and distance measurements to several Ultra Wide-Band (UWB) radio beacons installed in the environment. The overall approach makes use of a 3D map of the environment, for which a mapping method that exploits the synergies between point clouds and radio-based sensing is also presented, so that the whole methodology can be used in any given scenario. The main contributions of this dissertation focus on a thoughtful combination of technologies to achieve robust, reliable and computationally efficient long-term localization of UAVs in indoor environments. This work has been validated and demonstrated over the past four years in the context of different research projects related to the localization and state estimation of aerial robots in GPS-denied areas, in particular the European Robotics Challenges (EuRoC) project, in which the author participated in the competition among top research institutions in Europe. Experimental results demonstrate the feasibility of our full approach, both in accuracy and in computational efficiency, tested through real indoor flights and validated with data from a motion capture system.
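
    One ingredient of the sensor fusion can be illustrated in isolation: a range-only position fix from the UWB beacons, solved as a nonlinear least-squares problem. This is a minimal sketch assuming known beacon positions and synchronised range measurements; in the dissertation these ranges are fused with visual and inertial data rather than solved standalone.

```python
import numpy as np
from scipy.optimize import least_squares

def uwb_position_fix(beacons, ranges, x0):
    """Least-squares 3D position from UWB range measurements.

    beacons: (M, 3) known beacon positions in the map frame
    ranges:  (M,)   measured distances to each beacon
    x0:      (3,)   initial guess (e.g. the previous estimate)
    """
    def residuals(x):
        # predicted minus measured distance to each beacon
        return np.linalg.norm(beacons - x, axis=1) - ranges
    return least_squares(residuals, x0).x
```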

    Safe and accurate MAV control, navigation and manipulation

    Get PDF
    This work focuses on the problem of precise, aggressive and safe Micro Aerial Vehicle (MAV) navigation, as well as deployment in applications that require physical interaction with the environment. To address these issues, we propose three different MAV model-based control algorithms that rely on the concept of receding horizon control. As a starting point, we present a computationally cheap algorithm which utilizes an approximate linear model of the system around hover and is thus most accurate for slow reference maneuvers. Aiming to overcome the limitations of the linear model parameterisation, we present an extension to the first controller which relies on the true nonlinear dynamics of the system. This approach, even though computationally more intensive, ensures that the control model is always valid and allows tracking of full-state aggressive trajectories. The last controller addresses the topic of aerial manipulation, in which the versatility of aerial vehicles is combined with the manipulation capabilities of robotic arms. The proposed method relies on the formulation of a hybrid nonlinear MAV-arm model which also takes into account the effects of contact with the environment. Finally, in order to enable safe operation despite the potential loss of an actuator, we propose a supervisory algorithm which estimates the health status of each motor. We further showcase how this can be used in conjunction with the nonlinear controllers described above for fault-tolerant MAV flight. While all the developed algorithms are formulated and tested using our specific MAV platforms (underactuated hexacopters for the free-flight experiments and a hexacopter-delta arm system for the manipulation experiments), we further discuss how they can be applied to other underactuated/overactuated MAVs and robotic arm platforms. The same applies to the fault-tolerant control, where we discuss different stabilisation techniques depending on the capabilities of the available hardware. Even though the primary focus of this work is on feedback control, we thoroughly describe the custom hardware platforms used for the experimental evaluation, the state estimation algorithms which provide the basis for control, and the parameter identification required for the formulation of the various control models. We showcase all the developed algorithms in experimental scenarios designed to highlight the corresponding strengths and weaknesses, and show that the proposed methods can run in real time on commercially available hardware.
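
    The receding horizon idea behind all three controllers can be shown in its simplest, unconstrained linear form: optimise an input sequence over a finite horizon, apply only the first input, then re-solve at the next step. The sketch below is that textbook version, assuming a discrete-time linear model around hover; the thesis's nonlinear and hybrid formulations, constraints and solvers are not represented.

```python
import numpy as np

def linear_mpc_step(A, B, x0, x_ref, horizon, r_weight=1e-2):
    """One receding-horizon step for an unconstrained linear model
    x_{k+1} = A x_k + B u_k, minimising
    sum_k ||x_k - x_ref||^2 + r ||u_k||^2 over the horizon."""
    n, m = B.shape
    # Condense the dynamics: stacked states X = F U + G x0
    F = np.zeros((horizon * n, horizon * m))
    G = np.zeros((horizon * n, n))
    Ak = np.eye(n)
    for k in range(horizon):
        G[k*n:(k+1)*n] = Ak @ A                      # A^(k+1)
        for j in range(k + 1):
            F[k*n:(k+1)*n, j*m:(j+1)*m] = (
                np.linalg.matrix_power(A, k - j) @ B)
        Ak = Ak @ A
    # Normal equations for the optimal input sequence
    target = np.tile(x_ref, horizon) - G @ x0
    H = F.T @ F + r_weight * np.eye(horizon * m)
    u = np.linalg.solve(H, F.T @ target)
    return u[:m]       # receding horizon: apply only the first input
```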