69 research outputs found

    Single and multiple stereo view navigation for planetary rovers

    Get PDF
    © Cranfield UniversityThis thesis deals with the challenge of autonomous navigation of the ExoMars rover. The absence of global positioning systems (GPS) in space, added to the limitations of wheel odometry makes autonomous navigation based on these two techniques - as done in the literature - an inviable solution and necessitates the use of other approaches. That, among other reasons, motivates this work to use solely visual data to solve the robot’s Egomotion problem. The homogeneity of Mars’ terrain makes the robustness of the low level image processing technique a critical requirement. In the first part of the thesis, novel solutions are presented to tackle this specific problem. Detection of robust features against illumination changes and unique matching and association of features is a sought after capability. A solution for robustness of features against illumination variation is proposed combining Harris corner detection together with moment image representation. Whereas the first provides a technique for efficient feature detection, the moment images add the necessary brightness invariance. Moreover, a bucketing strategy is used to guarantee that features are homogeneously distributed within the images. Then, the addition of local feature descriptors guarantees the unique identification of image cues. In the second part, reliable and precise motion estimation for the Mars’s robot is studied. A number of successful approaches are thoroughly analysed. Visual Simultaneous Localisation And Mapping (VSLAM) is investigated, proposing enhancements and integrating it with the robust feature methodology. Then, linear and nonlinear optimisation techniques are explored. Alternative photogrammetry reprojection concepts are tested. Lastly, data fusion techniques are proposed to deal with the integration of multiple stereo view data. Our robust visual scheme allows good feature repeatability. Because of this, dimensionality reduction of the feature data can be used without compromising the overall performance of the proposed solutions for motion estimation. Also, the developed Egomotion techniques have been extensively validated using both simulated and real data collected at ESA-ESTEC facilities. Multiple stereo view solutions for robot motion estimation are introduced, presenting interesting benefits. The obtained results prove the innovative methods presented here to be accurate and reliable approaches capable to solve the Egomotion problem in a Mars environment

    Robust airborne 3D visual simultaneous localisation and mapping

    Get PDF
    The aim of this thesis is to present robust solutions to technical problems of airborne three-dimensional (3D) Visual Simultaneous Localisation And Mapping (VSLAM). These solutions are developed based on a stereovision system available onboard Unmanned Aerial Vehicles (UAVs). The proposed airborne VSLAM enables unmanned aerial vehicles to construct a reliable map of an unknown environment and localise themselves within this map without any user intervention. Current research challenges related to Airborne VSLAM include the visual processing through invariant feature detectors/descriptors, efficient mapping of large environments and cooperative navigation and mapping of complex environments. Most of these challenges require scalable representations, robust data association algorithms, consistent estimation techniques, and fusion of different sensor modalities. To deal with these challenges, seven Chapters are presented in this thesis as follows: Chapter 1 introduces UAVs, definitions, current challenges and different applications. Next, in Chapter 2 we present the main sensors used by UAVs during navigation. Chapter 3 presents an important task for autonomous navigation which is UAV localisation. In this chapter, some robust and optimal approaches for data fusion are proposed with performance analysis. After that, UAV map building is presented in Chapter 4. This latter is divided into three parts. In the first part, a new imaging alternative technique is proposed to extract and match a suitable number of invariant features. The second part presents an image mosaicing algorithm followed by a super-resolution approach. In the third part, we propose a new feature detector and descriptor that is fast, robust and detect suitable number of features to solve the VSLAM problem. A complete Airborne Visual Simultaneous Localisation and Mapping (VSLAM) solution based on a stereovision system is presented in Chapter (5). Robust data association filters with consistency and observability analysis are presented in this chapter as well. The proposed algorithm is validated with loop closing detection and map management using experimental data. The airborne VSLAM is extended then to the multiple UAVs case in Chapter (6). This chapter presents two architectures of cooperation: a Centralised and a Decentralised. The former provides optimal precision in terms of UAV positions and constructed map while the latter is more suitable for real time and embedded system applications. Finally, conclusions and future works are presented in Chapter (7).EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    An Improved Extended Information Filter SLAM Algorithm Based on Omnidirectional Vision

    Get PDF
    In the SLAM application, omnidirectional vision extracts wide scale information and more features from environments. Traditional algorithms bring enormous computational complexity to omnidirectional vision SLAM. An improved extended information filter SLAM algorithm based on omnidirectional vision is presented in this paper. Based on the analysis of structure a characteristics of the information matrix, this algorithm improves computational efficiency. Considering the characteristics of omnidirectional images, an improved sparsification rule is also proposed. The sparse observation information has been utilized and the strongest global correlation has been maintained. So the accuracy of the estimated result is ensured by using proper sparsification of the information matrix. Then, through the error analysis, the error caused by sparsification can be eliminated by a relocation method. The results of experiments show that this method makes full use of the characteristic of repeated observations for landmarks in omnidirectional vision and maintains great efficiency and high reliability in mapping and localization

    Mapping and Semantic Perception for Service Robotics

    Get PDF
    Para realizar una tarea, los robots deben ser capaces de ubicarse en el entorno. Si un robot no sabe dónde se encuentra, es imposible que sea capaz de desplazarse para alcanzar el objetivo de su tarea. La localización y construcción de mapas simultánea, llamado SLAM, es un problema estudiado en la literatura que ofrece una solución a este problema. El objetivo de esta tesis es desarrollar técnicas que permitan a un robot comprender el entorno mediante la incorporación de información semántica. Esta información también proporcionará una mejora en la localización y navegación de las plataformas robóticas. Además, también demostramos cómo un robot con capacidades limitadas puede construir de forma fiable y eficiente los mapas semánticos necesarios para realizar sus tareas cotidianas.El sistema de construcción de mapas presentado tiene las siguientes características: En el lado de la construcción de mapas proponemos la externalización de cálculos costosos a un servidor en nube. Además, proponemos métodos para registrar información semántica relevante con respecto a los mapas geométricos estimados. En cuanto a la reutilización de los mapas construidos, proponemos un método que combina la construcción de mapas con la navegación de un robot para explorar mejor un entorno y disponer de un mapa semántico con los objetos relevantes para una misión determinada.En primer lugar, desarrollamos un algoritmo semántico de SLAM visual que se fusiona los puntos estimados en el mapa, carentes de sentido, con objetos conocidos. Utilizamos un sistema monocular de SLAM basado en un EKF (Filtro Extendido de Kalman) centrado principalmente en la construcción de mapas geométricos compuestos únicamente por puntos o bordes; pero sin ningún significado o contenido semántico asociado. El mapa no anotado se construye utilizando sólo la información extraída de una secuencia de imágenes monoculares. La parte semántica o anotada del mapa -los objetos- se estiman utilizando la información de la secuencia de imágenes y los modelos de objetos precalculados. Como segundo paso, mejoramos el método de SLAM presentado anteriormente mediante el diseño y la implementación de un método distribuido. La optimización de mapas y el almacenamiento se realiza como un servicio en la nube, mientras que el cliente con poca necesidad de computo, se ejecuta en un equipo local ubicado en el robot y realiza el cálculo de la trayectoria de la cámara. Los ordenadores con los que está equipado el robot se liberan de la mayor parte de los cálculos y el único requisito adicional es una conexión a Internet.El siguiente paso es explotar la información semántica que somos capaces de generar para ver cómo mejorar la navegación de un robot. La contribución en esta tesis se centra en la detección 3D y en el diseño e implementación de un sistema de construcción de mapas semántico.A continuación, diseñamos e implementamos un sistema de SLAM visual capaz de funcionar con robustez en entornos poblados debido a que los robots de servicio trabajan en espacios compartidos con personas. El sistema presentado es capaz de enmascarar las zonas de imagen ocupadas por las personas, lo que aumenta la robustez, la reubicación, la precisión y la reutilización del mapa geométrico. Además, calcula la trayectoria completa de cada persona detectada con respecto al mapa global de la escena, independientemente de la ubicación de la cámara cuando la persona fue detectada.Por último, centramos nuestra investigación en aplicaciones de rescate y seguridad. Desplegamos un equipo de robots en entornos que plantean múltiples retos que implican la planificación de tareas, la planificación del movimiento, la localización y construcción de mapas, la navegación segura, la coordinación y las comunicaciones entre todos los robots. La arquitectura propuesta integra todas las funcionalidades mencionadas, asi como varios aspectos de investigación novedosos para lograr una exploración real, como son: localización basada en características semánticas-topológicas, planificación de despliegue en términos de las características semánticas aprendidas y reconocidas, y construcción de mapas.In order to perform a task, robots need to be able to locate themselves in the environment. If a robot does not know where it is, it is impossible for it to move, reach its goal and complete the task. Simultaneous Localization and Mapping, known as SLAM, is a problem extensively studied in the literature for enabling robots to locate themselves in unknown environments. The goal of this thesis is to develop and describe techniques to allow a service robot to understand the environment by incorporating semantic information. This information will also provide an improvement in the localization and navigation of robotic platforms. In addition, we also demonstrate how a simple robot can reliably and efficiently build the semantic maps needed to perform its quotidian tasks. The mapping system as built has the following features. On the map building side we propose the externalization of expensive computations to a cloud server. Additionally, we propose methods to register relevant semantic information with respect to the estimated geometrical maps. Regarding the reuse of the maps built, we propose a method that combines map building with robot navigation to better explore a room in order to obtain a semantic map with the relevant objects for a given mission. Firstly, we develop a semantic Visual SLAM algorithm that merges traditional with known objects in the estimated map. We use a monocular EKF (Extended Kalman Filter) SLAM system that has mainly been focused on producing geometric maps composed simply of points or edges but without any associated meaning or semantic content. The non-annotated map is built using only the information extracted from an image sequence. The semantic or annotated parts of the map –the objects– are estimated using the information in the image sequence and the precomputed object models. As a second step we improve the EKF SLAM presented previously by designing and implementing a visual SLAM system based on a distributed framework. The expensive map optimization and storage is allocated as a service in the Cloud, while a light camera tracking client runs on a local computer. The robot’s onboard computers are freed from most of the computation, the only extra requirement being an internet connection. The next step is to exploit the semantic information that we are able to generate to see how to improve the navigation of a robot. The contribution of this thesis is focused on 3D sensing which we use to design and implement a semantic mapping system. We then design and implement a visual SLAM system able to perform robustly in populated environments due to service robots work in environments where people are present. The system is able to mask the image regions occupied by people out of the rigid SLAM pipeline, which boosts the robustness, the relocation, the accuracy and the reusability of the geometrical map. In addition, it estimates the full trajectory of each detected person with respect to the scene global map, irrespective of the location of the moving camera at the point when the people were imaged. Finally, we focus our research on rescue and security applications. The deployment of a multirobot team in confined environments poses multiple challenges that involve task planning, motion planning, localization and mapping, safe navigation, coordination and communications among all the robots. The architecture integrates, jointly with all the above-mentioned functionalities, several novel features to achieve real exploration: localization based on semantic-topological features, deployment planning in terms of the semantic features learned and recognized, and map building.<br /

    Deep Reinforcement Learning for Robotic Tasks: Manipulation and Sensor Odometry

    Get PDF
    Research in robotics has frequently focused on artificial intelligence (AI). To increase the effectiveness of the learning process for the robot, numerous studies have been carried out. To be more effective, robots must be able to learn effectively in a shorter amount of time and with fewer resources. It has been established that reinforcement learning (RL) is efficient for aiding a robot's learning. In this dissertation, we proposed and optimized RL algorithms to ensure that our robots learn well. Research into driverless or self-driving automobiles has exploded in the last few years. A precise estimation of the vehicle's motion is crucial for higher levels of autonomous driving functionality. Recent research has been done on the development of sensors to improve the localization accuracy of these vehicles. Recent sensor odometry research suggests that Lidar Monocular Visual Odometry (LIMO) can be beneficial for determining odometry. However, the LIMO algorithm has a considerable number of errors when compared to ground truth, which motivates us to investigate ways to make it far more accurate. We intend to use a Genetic Algorithm (GA) in our dissertation to improve LIMO's performance. Robotic manipulator research has also been popular and has room for development, which piqued our interest. As a result, we researched robotic manipulators and applied GA to Deep Deterministic Policy Gradient (DDPG) and Hindsight Experience Replay (HER) (GA+DDPG+HER). Finally, we kept researching DDPG and created an algorithm named AACHER. AACHER uses HER and many independent instances of actors and critics from the DDPG to increase a robot's learning effectiveness. AACHER is used to evaluate the results in both custom and existing robot environments.In the first part of our research, we discuss the LIMO algorithm, an odometry estimation technique that employs a camera and a Lidar for visual localization by tracking features from their measurements. LIMO can estimate sensor motion via Bundle Adjustment based on reliable keyframes. LIMO employs weights of the vegetative landmarks and semantic labeling to reject outliers. LIMO, like many other odometry estimating methods, has the issue of having a lot of hyperparameters that need to be manually modified in response to dynamic changes in the environment to reduce translational errors. The GA has been proven to be useful in determining near-optimal values of learning hyperparameters. In our study, we present and propose the application of the GA to maximize the performance of LIMO's localization and motion estimates by optimizing its hyperparameters. We test our approach using the well-known KITTI dataset and demonstrate how the GA supports LIMO to lower translation errors in various datasets. Our second contribution includes the use of RL. Robots using RL can select actions based on a reward function. On the other hand, the choice of values for the learning algorithm's hyperparameters could have a big impact on the entire learning process. We used GA to find the hyperparameters for the Deep Deterministic Policy Gradient (DDPG) and Hindsight Experience Replay (HER). We proposed the algorithm GA+DDPG+HER to optimize learning hyperparameters and applied it to the robotic manipulation tasks of FetchReach, FetchSlide, FetchPush, FetchPick\&Place, and DoorOpening. With only a few modifications, our proposed GA+DDPG+HER was also used in the AuboReach environment. Compared to the original algorithm (DDPG+HER), our experiments show that our approach (GA+DDPG+HER) yields noticeably better results and is substantially faster. In the final part of our dissertation, we were motivated to use and improve DDPG. Many simulated continuous control problems have shown promising results for the DDPG, a unique Deep Reinforcement Learning (DRL) technique. DDPG has two parts: Actor learning and Critic learning. The performance of the DDPG technique is therefore relatively sensitive and unstable because actor and critic learning is a considerable contributor to the robot’s total learning. Our dissertation suggests a multi-actor-critic DDPG for reliable actor-critic learning as an improved DDPG to further enhance the performance and stability of DDPG. This multi-actor-critic DDPG is further combined with HER, called AACHER. The average value of numerous actors/critics is used to replace the single actor/critic in the traditional DDPG approach for improved resistance when one actor/critic performs poorly. Numerous independent actors and critics can also learn from the environment in general. In all the actor/critic number combinations that are evaluated, AACHER performs better than DDPG+HER

    From images to augmented 3D models: improved visual SLAM and augmented point cloud modeling

    Get PDF
    This thesis investigates into the problem of using monocular image sequences to generate augmented models. The problem is decomposed to two subproblems: monocular visual simultaneously localization and mapping (VSLAM), and the point cloud data modeling. Accordingly, the thesis comprises two major parts. The First part, including Chapters 2, 3 and 4, aims to leverage the system observability theories to improve the VSLAM accuracy. In Chapter 2, a piece-wise linear system is developed to model VSLAM, and two necessary conditions are proved to make the VSLAM completely observable. Based on the First condition, an instantaneous condition for complete observability, the "Optimally Observable and Minimal Cardinality (OOMC) VSLAM" is presented in Chapter 3. The OOMC algorithm selects the feature subset of minimal required cardinality to form the strongest observable VSLAM subsystem. The select feature subset is further used to improve the data association in VSLAM. Based on the second condition, a temporal condition for complete observability, the "Good Features (GF) to Track for VSLAM" is presented in Chapter 4. The GF algorithm ranks the individual features according to their contributions to system observability. Benchmarking experiments of both OOMC and GF algorithms demonstrate improvements in VSLAM performance. The second part, including Chapters 5 and 6, aims to solve the PCD modeling problem in a geometry-driven manner. Chapter 5 presents an algorithm to model PCDs with planar patches via a sparsity-inducing optimization. Chapter 6 extends the PCD modeling to quadratic surface primitives based models. A method is further developed to retrieve the high-level semantic information of the model components. Evaluation on the PCDs generated from VSLAM demonstrates the effectiveness of these geometry-driven PCD modeling approaches.Ph.D

    Medical SLAM in an autonomous robotic system

    Get PDF
    One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft-tissues. This information is prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon’s navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This thesis addresses the ambitious goal of achieving surgical autonomy, through the study of the anatomical environment by Initially studying the technology present and what is needed to analyze the scene: vision sensors. A novel endoscope for autonomous surgical task execution is presented in the first part of this thesis. Which combines a standard stereo camera with a depth sensor. This solution introduces several key advantages, such as the possibility of reconstructing the 3D at a greater distance than traditional endoscopes. Then the problem of hand-eye calibration is tackled, which unites the vision system and the robot in a single reference system. Increasing the accuracy in the surgical work plan. In the second part of the thesis the problem of the 3D reconstruction and the algorithms currently in use were addressed. In MIS, simultaneous localization and mapping (SLAM) can be used to localize the pose of the endoscopic camera and build ta 3D model of the tissue surface. Another key element for MIS is to have real-time knowledge of the pose of surgical tools with respect to the surgical camera and underlying anatomy. Starting from the ORB-SLAM algorithm we have modified the architecture to make it usable in an anatomical environment by adding the registration of the pre-operative information of the intervention to the map obtained from the SLAM. Once it has been proven that the slam algorithm is usable in an anatomical environment, it has been improved by adding semantic segmentation to be able to distinguish dynamic features from static ones. All the results in this thesis are validated on training setups, which mimics some of the challenges of real surgery and on setups that simulate the human body within Autonomous Robotic Surgery (ARS) and Smart Autonomous Robotic Assistant Surgeon (SARAS) projects
    • …
    corecore