
    Robust convex optimisation techniques for autonomous vehicle vision-based navigation

    This thesis investigates new convex optimisation techniques for motion and pose estimation. Numerous computer vision problems can be formulated as optimisation problems. These optimisation problems are generally solved via linear techniques using the singular value decomposition, or by iterative methods under an L2-norm minimisation. Linear techniques have the advantage of offering a closed-form solution that is simple to implement; the quantity being minimised, however, is not geometrically or statistically meaningful. Conversely, L2 algorithms rely on iterative estimation, where a cost function is minimised using algorithms such as Levenberg-Marquardt, Gauss-Newton, gradient descent or conjugate gradient. The cost functions involved are geometrically interpretable and can be statistically optimal under an assumption of Gaussian noise. However, in addition to their sensitivity to initial conditions, these algorithms are often slow and run a high risk of becoming trapped in a local minimum or producing infeasible solutions, even for small noise levels. In light of the above, this thesis focuses on developing new techniques, within a convex optimisation framework, for finding globally optimal solutions. Convex optimisation techniques for motion estimation have recently revealed considerable advantages: convex optimisation guarantees a global minimum, and the cost function is geometrically meaningful. Moreover, robust optimisation is a recent approach to optimisation under uncertain data. In recent years the need to cope with uncertain data has become especially acute, particularly where real-world applications are concerned. In such circumstances, robust optimisation aims to recover an optimal solution whose feasibility is guaranteed for any realisation of the uncertain data. Although many researchers avoid uncertainty, owing to the added complexity of constructing a robust optimisation model and to a lack of knowledge about the nature of these uncertainties and especially their propagation, this thesis investigates robust convex optimisation for the motion estimation problem while estimating the uncertainties at every step. First, a solution using convex optimisation coupled with the recursive least squares (RLS) algorithm and the robust H∞ filter is developed for motion estimation. In another solution, uncertainties and their propagation are incorporated in a robust L∞ convex optimisation framework for monocular visual motion estimation; here, robust least squares is combined with a second-order cone program (SOCP). A technique to improve the accuracy and robustness of the fundamental matrix is also investigated: it uses the covariance intersection approach to fuse feature-location uncertainties, which leads to more consistent motion estimates. Loop-closure detection is crucial in improving the robustness of navigation algorithms: in practice, after long navigation in an unknown environment, detecting that a vehicle is in a previously visited location gives the opportunity to increase the accuracy and consistency of the estimate. In this context, we have developed an efficient appearance-based method for visual loop-closure detection based on the combination of a Gaussian mixture model with the KD-tree data structure. Deploying this technique for loop-closure detection, a robust L∞ convex pose-graph optimisation solution for monocular motion estimation of unmanned aerial vehicles (UAVs) is introduced as well.
In the literature, most proposed solutions formulate pose-graph optimisation as a least-squares problem, minimising a cost function with iterative methods. In this work, robust convex optimisation under the L∞ norm is adopted, which efficiently corrects the UAV’s pose after loop-closure detection. To round out the work in this thesis, a system for cooperative monocular visual motion estimation with multiple aerial vehicles is proposed. The cooperative motion estimation employs state-of-the-art approaches for optimisation, individual motion estimation and registration. Three-view geometry algorithms in a convex optimisation framework are deployed on board the monocular vision system of each vehicle. In addition, vehicle-to-vehicle relative pose estimation is performed with a novel robust registration solution in a global optimisation framework. In parallel, and as a complementary solution for the relative pose, a robust non-linear H∞ solution is designed to fuse measurements from the UAVs’ on-board inertial sensors with the visual estimates. The proposed contributions have been exhaustively evaluated over a number of real-image data experiments in the laboratory, using monocular vision systems and range imaging devices. In this thesis, we propose several solutions towards the goal of robust visual motion estimation using convex optimisation. We show that the convex optimisation framework may be extended to include uncertainty information to achieve robust and optimal solutions, and we observe that convex optimisation is a practical and very appealing alternative to linear techniques and iterative methods.
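The L∞ formulations adopted throughout the thesis have a convenient computational form: minimising the maximum residual of a linearised measurement model is a second-order cone program with a guaranteed global optimum. As a hedged illustration only, the following CVXPY sketch solves such a problem on synthetic data; the linear model `A`, `b` and its dimensions are assumptions for demonstration, not the thesis's actual motion-estimation formulation.

```python
# A minimal sketch: L-infinity residual minimisation cast as an SOCP.
# Synthetic stand-in for a linearised motion-estimation problem.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 6))              # assumed measurement matrix
b = A @ rng.standard_normal(6) + 0.01 * rng.standard_normal(30)

x = cp.Variable(6)
t = cp.Variable(nonneg=True)
# Each |a_i^T x - b_i| <= t is a second-order cone constraint;
# minimising t gives the globally optimal L-infinity estimate.
problem = cp.Problem(cp.Minimize(t), [cp.abs(A @ x - b) <= t])
problem.solve()
print("optimal max residual:", t.value)
```

Unlike Levenberg-Marquardt or Gauss-Newton on the same model, this solve needs no initial guess and cannot stall in a local minimum, which is the property the thesis exploits.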

    CAPRICORN: Communication Aware Place Recognition using Interpretable Constellations of Objects in Robot Networks

    Using multiple robots for exploring and mapping environments can provide improved robustness and performance, but it can be difficult to implement. In particular, limited communication bandwidth is a considerable constraint when a robot needs to determine whether it has visited a location that was previously explored by another robot, as this requires robots to share descriptions of the places they have visited. One way to compress these descriptions is to use constellations: groups of 3D points that correspond to the estimated relative positions of a set of objects. Constellations maintain the same pattern from different viewpoints and can be robust to illumination changes or dynamic elements. We present a method to extract from these constellations compact spatial and semantic descriptors of the objects in a scene. We use this representation in a two-step decentralized loop-closure verification: first, we distribute the compact semantic descriptors to determine which other robots might have seen scenes with similar objects; then we query matching robots with the full constellation to validate the match using geometric information. The proposed method requires less memory, is more interpretable than global image descriptors, and could be useful for other tasks and interactions with the environment. We validate our system's performance on a TUM RGB-D SLAM sequence and show its benefits in terms of bandwidth requirements.

Comment: 8 pages, 6 figures, 1 table. 2020 IEEE International Conference on Robotics and Automation (ICRA).
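The two-step verification is easy to caricature. In the hedged Python sketch below, the class-count descriptor, histogram-intersection score, and sorted-pairwise-distance geometric check are generic stand-ins chosen for brevity, not the descriptors actually used in the paper.

```python
# Step 1: tiny semantic descriptor (object class counts) for broadcast.
# Step 2: geometric check on the full constellation for candidates only.
from collections import Counter
import numpy as np

def semantic_descriptor(object_labels):
    # Compact payload sent to other robots (assumption: class histogram).
    return Counter(object_labels)

def semantic_similarity(d1, d2):
    # Histogram intersection, normalised to [0, 1].
    inter = sum(min(d1[k], d2[k]) for k in set(d1) | set(d2))
    return inter / max(sum(d1.values()), sum(d2.values()))

def geometric_match(pts_a, pts_b, atol=0.2):
    # Sorted pairwise distances between object positions are invariant
    # to viewpoint (rigid motion), so two views of one place agree.
    def signature(p):
        p = np.asarray(p, dtype=float)
        d = np.linalg.norm(p[:, None, :] - p[None, :, :], axis=-1)
        return np.sort(d[np.triu_indices(len(p), k=1)])
    sa, sb = signature(pts_a), signature(pts_b)
    return len(sa) == len(sb) and np.allclose(sa, sb, atol=atol)
```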

    Machine Learning for Multi-Robot Semantic Simultaneous Localization and Mapping

    Automation and robotics are becoming more and more common in our daily lives, with many possible applications. Deploying robots in the world can extend what humans are capable of doing, and can save us from dangerous and strenuous tasks. For robots to be safely sent out into our real world, and into new unknown environments, one key capability they need is to perceive their environment, and in particular to localize themselves with respect to their surroundings. To be truly deployable anywhere, robots should be able to do so relying only on their own sensors, the most commonly used being cameras.
One way to generate such an estimate is by using a simultaneous localization and mapping (SLAM) algorithm, in which the robot concurrently builds a map of its environment and estimates its state within it. Single-robot SLAM has been extensively researched and is now considered a mature field. However, using a team of robots can provide several benefits in terms of robustness, efficiency, and performance for many tasks. In this case, multi-robot SLAM algorithms are required to allow each robot to benefit from the whole team’s experience. Multi-robot SLAM can build on top of single-robot SLAM solutions, but requires adaptations and faces additional computation and communication constraints. One particular challenge that arises in multi-robot SLAM is the need for robots to find inter-robot loop closures: relationships between the trajectories of different robots that can be found when they visit the same place. Two categories of approaches are possible to detect inter-robot loop closures. In indirect methods, robots communicate to find out whether they have mapped the same area, and then attempt to find loop closures using data gathered by each robot in the place that was jointly visited. In direct methods, robots rely directly on data gathered from their sensors to estimate the loop closures. Each approach has its own benefits and challenges, with indirect methods being more popular in recent works. This thesis builds on recent computer vision advancements to present contributions to each category of approaches for inter-robot loop closure detection. A first approach is presented for indirect loop closure detection in a team of fully connected robots. It relies on constellations, a compact semantic representation of the environment based on the objects in it. Descriptors and comparison methods for constellations are designed to robustly recognize places based on their constellation with minimal data exchange. These are used in a decentralized place recognition mechanism that scales as the size of the team increases. The proposed method performs comparably to state-of-the-art solutions in terms of recognition performance and the data exchange required, while being more meaningful and interpretable.
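To make the data-exchange argument concrete, here is a hedged sketch of the two-phase budget: every robot broadcasts only a small descriptor, and a full constellation goes only to the best-matching peers. The byte sizes and similarity threshold are illustrative assumptions.

```python
# Two-phase decentralized place recognition: cheap broadcast, targeted query.
import numpy as np

DESCRIPTOR_BYTES = 64        # assumption: compact semantic descriptor
CONSTELLATION_BYTES = 4096   # assumption: full 3D constellation payload

def candidates(my_desc, peer_descs, min_similarity=0.6):
    # peer_descs: {robot_id: unit-norm np.ndarray}; cosine similarity.
    return [rid for rid, d in peer_descs.items()
            if float(my_desc @ d) >= min_similarity]

def bytes_exchanged(n_robots, n_candidates):
    # Descriptor to every peer, full constellation only to candidates.
    return n_robots * DESCRIPTOR_BYTES + n_candidates * CONSTELLATION_BYTES

# With 10 peers and 1 candidate: 640 + 4096 bytes, versus 40960 bytes
# if full constellations were always exchanged.
```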

    Dense Visual Simultaneous Localisation and Mapping in Collaborative and Outdoor Scenarios

    Dense visual simultaneous localisation and mapping (SLAM) systems can produce 3D reconstructions that are digital facsimiles of the physical space they describe. Systems that can produce dense maps with this level of fidelity in real time provide foundational spatial reasoning capabilities for many downstream tasks in autonomous robotics. Over the past 15 years, mapping small-scale indoor environments, such as desks and buildings, with a single slow-moving, hand-held sensor has been one of the central focuses of dense visual SLAM research. However, most dense visual SLAM systems exhibit a number of limitations which mean they cannot be directly applied in collaborative or outdoor settings. The contribution of this thesis is to address these limitations through the development of new systems and algorithms for collaborative dense mapping, efficient dense alternation, and outdoor operation with fast camera motion and wide field of view (FOV) cameras. We use ElasticFusion, a state-of-the-art dense SLAM system, as our starting point, and each of these contributions is implemented as a novel extension to that system. We first present a collaborative dense SLAM system that allows a number of cameras starting with unknown initial relative positions to maintain local maps with the original ElasticFusion algorithm. Visual place recognition across local maps results in constraints that allow the maps to be aligned into a common global reference frame, facilitating collaborative mapping and the tracking of multiple cameras within a shared map. Within dense-alternation-based SLAM systems, the standard approach is to fuse every frame into the dense model without considering whether the information contained within the frame is already captured by the dense map and therefore redundant. As the number of cameras or the scale of the map increases, this approach becomes inefficient. In our second contribution, we address this inefficiency by introducing a novel information-theoretic approach to keyframe selection that allows the system to avoid processing redundant information. We implement the procedure within ElasticFusion, demonstrating a marked reduction in the number of frames required by the system to estimate an accurate, denoised surface reconstruction. Before dense SLAM techniques can be applied in outdoor scenarios, we must first address their reliance on active depth cameras and their unsuitability for fast camera motion. In our third contribution we present an outdoor dense SLAM system. The system overcomes the need for an active sensor by employing neural-network-based depth inference to predict the geometry of the scene as it appears in each image. To address the issue of camera tracking during fast motion we employ a hybrid architecture, combining elements of both dense and sparse SLAM systems to perform camera tracking and to achieve globally consistent dense mapping. Automotive applications present a particularly important setting for dense visual SLAM systems. Such applications are characterised by their use of wide-FOV cameras and are therefore not accurately modelled by the standard pinhole camera model. The fourth contribution of this thesis is to extend the above hybrid sparse-dense monocular SLAM system to cater for large-FOV fisheye imagery. This is achieved by reformulating the mapping pipeline in terms of the Kannala-Brandt fisheye camera model.
To estimate depth, we introduce a new version of the PackNet depth estimation neural network (Guizilini et al., 2020) adapted for fisheye inputs. To demonstrate the effectiveness of our contributions, we present experimental results computed by processing the synthetic ICL-NUIM dataset of Handa et al. (2014) as well as the real-world TUM RGB-D dataset of Sturm et al. (2012). For outdoor SLAM we show the results of our system processing the autonomous driving KITTI and KITTI-360 datasets of Geiger et al. (2012a) and Liao et al. (2021), respectively.
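The information-theoretic keyframe selection can be caricatured in a few lines: fuse a frame only if the dense model does not already predict what the frame observes. In this hedged sketch, depth-prediction disagreement stands in as a proxy for information gain; the noise model and threshold are assumptions, not the criterion derived in the thesis.

```python
# Fuse a frame only when it carries information the model lacks.
import numpy as np

def novelty(depth_obs, depth_pred, sigma=0.05):
    # Fraction of pixels whose observed depth the current dense model
    # fails to explain within 3-sigma (proxy for information gain).
    valid = np.isfinite(depth_obs) & np.isfinite(depth_pred)
    if not valid.any():
        return 1.0
    return float(np.mean(np.abs(depth_obs[valid] - depth_pred[valid]) > 3 * sigma))

def should_fuse(depth_obs, depth_pred, threshold=0.1):
    # Redundant frames (low novelty) are skipped, saving fusion work.
    return novelty(depth_obs, depth_pred) > threshold
```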

    Cooperative Navigation for Low-bandwidth Mobile Acoustic Networks.

    This thesis reports on the design and validation of estimation and planning algorithms for underwater vehicle cooperative localization. While attitude and depth are easily instrumented with bounded error, autonomous underwater vehicles (AUVs) have no internal sensor that directly observes XY position. The global positioning system (GPS) and other radio-based navigation techniques are not available because of the strong attenuation of electromagnetic signals in seawater. The navigation algorithms presented herein fuse local body-frame rate and attitude measurements with range observations between vehicles within a decentralized architecture. The acoustic communication channel is both unreliable and low-bandwidth, precluding many state-of-the-art terrestrial cooperative navigation algorithms. We exploit the underlying structure of a post-process centralized estimator in order to derive two real-time decentralized estimation frameworks. First, the origin state method enables a client vehicle to exactly reproduce the corresponding centralized estimate within a server-to-client vehicle network. Second, a graph-based navigation framework produces an approximate reconstruction of the centralized estimate onboard each vehicle. Finally, we present a method to plan a locally optimal server path to localize a client vehicle along a desired nominal trajectory. The planning algorithm introduces a probabilistic channel model into prior Gaussian belief-space planning frameworks. In summary, cooperative localization reduces XY position error growth within underwater vehicle networks. Moreover, these methods remove the reliance on static beacon networks, which do not scale to large vehicle networks and limit the range of operations. Each proposed localization algorithm was validated in full-scale AUV field trials. The planning framework was evaluated through numerical simulation.

PhD thesis, Mechanical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/113428/1/jmwalls_1.pd
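Range-only cooperative corrections of this kind are commonly realised as an extended Kalman filter update; the hedged sketch below applies a single acoustic range from a server vehicle at a broadcast position to a client's state. The state layout and noise values are illustrative assumptions, not the origin-state or graph-based estimators developed in the thesis.

```python
# One EKF update of a client AUV state from a range to a server vehicle.
import numpy as np

def range_update(x, P, server_xy, r_meas, r_sigma=1.0):
    # x: state with XY in x[0:2]; P: covariance; server_xy: known position.
    dx = x[:2] - server_xy
    r_pred = np.linalg.norm(dx)
    H = np.zeros((1, len(x)))
    H[0, :2] = dx / max(r_pred, 1e-6)        # Jacobian of range w.r.t. XY
    S = float(H @ P @ H.T) + r_sigma**2      # innovation variance
    K = (P @ H.T) / S                        # Kalman gain, shape (n, 1)
    x_new = x + K[:, 0] * (r_meas - r_pred)  # state correction
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new
```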

    Toward autonomous harbor surveillance

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Includes bibliographical references (p. 105-113).

In this thesis we address the problem of drift-free navigation for underwater vehicles performing harbor surveillance and ship hull inspection. Maintaining accurate localization for the duration of a mission is important for a variety of tasks, such as planning the vehicle trajectory and ensuring coverage of the area to be inspected. Our approach uses only onboard sensors in a simultaneous localization and mapping setting, and removes the need for any external infrastructure such as acoustic beacons. We extract dense features from a forward-looking imaging sonar and apply pairwise registration between sonar frames. The registrations are combined with onboard velocity, attitude and acceleration sensors to obtain an improved estimate of the vehicle trajectory. In addition, an architecture for persistent mapping is proposed, with the intention of handling long-term operations and repetitive surveillance tasks. The proposed architecture is flexible and supports different types of vehicles and mapping methods; its design is demonstrated with an implementation of some of the system's key features. Methods for re-localization are also considered. Finally, results from several experiments that demonstrate drift-free navigation in various underwater environments are presented.

by Hordur Johannsson. S.M.
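Once correspondences between two sonar frames are available, the pairwise registration described here reduces to a closed-form rigid alignment. The following is a hedged 2D Procrustes (SVD) sketch; it assumes matched feature points and stands in for, rather than reproduces, the thesis's sonar registration method.

```python
# Closed-form 2D rigid alignment of matched sonar features (Kabsch/SVD).
import numpy as np

def register_2d(src, dst):
    # src, dst: (N, 2) arrays of matched points; returns R (2x2), t (2,)
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)        # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t
```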

    An Outlook into the Future of Egocentric Vision

    What will the future be? We wonder! In this survey, we explore the gap between current research in egocentric vision and the ever-anticipated future, where wearable computing, with outward-facing cameras and digital overlays, is expected to be integrated into our everyday lives. To understand this gap, the article starts by envisaging the future through character-based stories, showcasing through examples the limitations of current technology. We then provide a mapping between this future and previously defined research tasks. For each task, we survey its seminal works, current state-of-the-art methodologies and available datasets, then reflect on the shortcomings that limit its applicability to future research. Note that this survey focuses on software models for egocentric vision, independent of any specific hardware. The paper concludes with recommendations for areas of immediate exploration so as to unlock our path to future always-on, personalised and life-enhancing egocentric vision.

Comment: We invite comments, suggestions and corrections here: https://openreview.net/forum?id=V3974SUk1

    Collaborative Appearance-Based Place Recognition and Improving Place Recognition Using Detection of Dynamic Objects

    This dissertation makes contributions to the problem of long-term appearance-based place recognition. We present a framework for place recognition in a collaborative scheme and a method to reduce the impact of dynamic objects on place representations. We demonstrate our findings using a state-of-the-art place recognition approach. We begin in Part I by describing the general problem of place recognition and its importance in applications where accurate localization is crucial. We discuss feature detection and description and also explain the functioning of several place recognition frameworks. In Part II, we present a novel framework for collaboration between agents from a pure appearance-based place recognition perspective. Using this framework, multiple agents can efficiently share partial or complete knowledge about places and benefit from their teamwork. This collaborative framework allows agents with limited storage and memory capacity to become useful in environment exploration tasks (for instance, by enabling remote recognition); includes procedures to manage an agent’s memory load and to distribute knowledge of places across agents; allows the reuse of knowledge from one agent to another; and increases tolerance for the failure of individual agents. Part II also defines metrics which allow us to measure the performance of a system that uses the collaborative framework. Finally, in Part III, we present an innovative method to improve the recognition of places in environments densely populated by dynamic objects. We demonstrate that we can improve recognition performance in these environments by incorporating high-level information from dynamic objects. Tests conducted using a synthetic dataset show the benefits of our approach. The proposed method allows the system to significantly improve recognition performance on the photo-realistic dataset while reducing storage requirements, using up to 23.7 percent less storage space than the state-of-the-art approach that we have extended; smaller representations also reduce the time required to match places. In Part III, we also formulate the concept of a valid place representation and determine the quality of an observation based on the dynamic objects present in the agent’s view. Of course, recognition systems that are sensitive to dynamic objects incur additional computational costs to recognize those objects. We show that this additional cost is outweighed by the benefits that incorporating dynamic object detection brings to the place recognition pipeline. Our findings can be used in many applications, including navigation applications, e.g. assisting visually impaired individuals with navigating indoors, or autonomous vehicles.
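As a hedged illustration of the Part III idea, the sketch below discards features that fall on detected dynamic objects before a place representation is built, and rejects views that dynamic objects dominate. The box format, thresholds, and validity rule are assumptions, not the dissertation's exact formulation.

```python
# Keep only features on static structure; judge whether a view is a
# valid place representation by its dynamic-object coverage.
import numpy as np

def filter_static(keypoints, descriptors, dyn_boxes):
    # keypoints: list of (x, y); descriptors: (N, D) array;
    # dyn_boxes: iterable of (x0, y0, x1, y1) dynamic-object boxes.
    keep = [i for i, (x, y) in enumerate(keypoints)
            if not any(x0 <= x <= x1 and y0 <= y <= y1
                       for x0, y0, x1, y1 in dyn_boxes)]
    return [keypoints[i] for i in keep], descriptors[keep]

def is_valid_view(image_area, dyn_boxes, max_dynamic_fraction=0.5):
    # Naive coverage estimate (ignores box overlap, an assumption).
    covered = sum((x1 - x0) * (y1 - y0) for x0, y0, x1, y1 in dyn_boxes)
    return covered / image_area <= max_dynamic_fraction
```

Discarding dynamic-object features is also what shrinks the stored representations, consistent with the storage savings reported above.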

    Mapping of complex marine environments using an unmanned surface craft

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 185-199).

Recent technology has combined accurate GPS localization with mapping to build 3D maps in a diverse range of terrestrial environments, but the mapping of marine environments lags behind. This is particularly true in shallow water and coastal areas with man-made structures such as bridges, piers, and marinas, which can pose formidable challenges to autonomous underwater vehicle (AUV) operations. In this thesis, we propose a new approach for mapping shallow-water marine environments, combining data from both above and below the water in a robust probabilistic state estimation framework. The ability to rapidly acquire detailed maps of these environments would have many applications, including surveillance, environmental monitoring, forensic search, and disaster recovery. Whereas most recent AUV mapping research has been limited to open waters, far from man-made surface structures, in our work we focus on complex shallow-water environments, such as rivers and harbors, where man-made structures block GPS signals and pose hazards to navigation. Our goal is to enable an autonomous surface craft to combine data from the heterogeneous environments above and below the water surface - as if the water were drained, and we had a complete integrated model of the marine environment, with full visibility. To tackle this problem, we propose a new framework for 3D SLAM in marine environments that combines data obtained concurrently from above and below the water in a robust probabilistic state estimation framework. Our work makes systems, algorithmic, and experimental contributions to perceptual robotics for the marine environment. We have created a novel Autonomous Surface Vehicle (ASV), equipped with substantial onboard computation and an extensive sensor suite that includes three SICK lidars, a Blueview MB2250 imaging sonar, a Doppler Velocity Log, and an integrated global positioning system/inertial measurement unit (GPS/IMU) device. The data from these sensors are processed in a hybrid metric/topological SLAM state estimation framework. A key challenge to mapping is extracting effective constraints from 3D lidar data despite GPS loss and reacquisition. This was achieved by developing a GPS trust engine that uses a semi-supervised learning classifier to ascertain the validity of GPS information for different segments of the vehicle trajectory. This eliminates the troublesome effects of multipath on the vehicle trajectory estimate, and provides cues for submap decomposition. Localization from lidar point clouds is performed using octrees combined with Iterative Closest Point (ICP) matching, which provides constraints between submaps both within and across different mapping sessions. Submap positions are optimized via least-squares optimization of the graph of constraints to achieve global alignment. The global vehicle trajectory is used for subsea sonar bathymetric map generation and for mesh reconstruction from lidar data for 3D visualization of above-water structures. We present experimental results in the vicinity of several structures spanning or along the Charles River between Boston and Cambridge, MA. The Harvard and Longfellow Bridges, three sailing pavilions and a yacht club provide structures of interest, having both extensive superstructure and subsurface foundations.
To quantitatively assess the mapping error, we compare against a georeferenced model of the Harvard Bridge built using blueprints from the Library of Congress. Our results demonstrate the potential of this new approach to achieve robust and efficient model capture for complex shallow-water marine environments. Future work aims to incorporate autonomy for path planning over a region of interest, while performing collision avoidance, to enable fully autonomous surveys that achieve full sensor coverage of a complete marine environment.

by Jacques Chadwick Leedekerken. Ph.D.
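The GPS trust engine is a learned classifier in the thesis; as a hedged stand-in, the sketch below gates GPS fixes with simple quality checks and a chi-square innovation test. All thresholds and inputs here are assumptions chosen to illustrate the gating role the trust engine plays.

```python
# Decide whether a GPS fix should constrain the trajectory estimate.
import numpy as np

def gps_trusted(gps_xy, pred_xy, pred_cov, n_sats, hdop,
                gate=9.21, min_sats=6, max_hdop=2.0):
    # Cheap receiver-quality checks first (multipath often degrades both).
    if n_sats < min_sats or hdop > max_hdop:
        return False
    # Chi-square gate (2 dof, 99%) on the innovation against the
    # dead-reckoned prediction: inconsistent fixes are rejected.
    innov = np.asarray(gps_xy, float) - np.asarray(pred_xy, float)
    m2 = float(innov @ np.linalg.solve(pred_cov, innov))
    return m2 < gate
```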

    User-oriented markerless augmented reality framework based on 3D reconstruction and loop closure detection

    An augmented reality (AR) system needs to track the user view to perform accurate augmentation registration. The present research proposes a conceptual markerless, natural-feature-based AR framework whose process is divided into two stages: an offline database training session for the application developers, and an online AR tracking and display session for the final users. In the offline session, two types of 3D reconstruction application, RGBD-SLAM and SfM, are integrated into the development framework for building the reference template of a target environment. The performance and applicable conditions of these two methods are presented in this thesis, so that application developers can choose which method suits their development needs. A general development user interface is provided to the developer for interaction, including a simple GUI tool for augmentation configuration. The proposal also applies a bag-of-words strategy to enable rapid "loop-closure detection" in the online session, efficiently querying the application user view against the trained database to locate the user pose. The rendering and display of the augmentation is currently implemented within an OpenGL window, one outcome of the research that merits detailed future investigation and development.
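A bag-of-words loop-closure query of the kind used in the online session can be sketched generically: cluster local feature descriptors into a visual vocabulary offline, then score each live view against the database. The vocabulary size, scikit-learn KMeans clustering, and cosine scoring below are illustrative assumptions, not the thesis's implementation.

```python
# Offline: build a visual vocabulary. Online: retrieve the best match.
import numpy as np
from sklearn.cluster import KMeans

def build_vocab(training_descriptors, k=500):
    # training_descriptors: (M, D) local features from the offline stage.
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit(training_descriptors)

def bow_vector(vocab, descriptors):
    # Quantise descriptors to visual words; L2-normalised histogram.
    words = vocab.predict(descriptors)
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    n = np.linalg.norm(hist)
    return hist / n if n else hist

def best_match(query_vec, database_vecs):
    # database_vecs: (N, k) matrix of stored views; cosine similarity.
    scores = database_vecs @ query_vec
    return int(np.argmax(scores)), float(np.max(scores))
```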