250 research outputs found

    Kimera2: Robust and Accurate Metric-Semantic SLAM in the Real World

    We present improvements to Kimera, an open-source metric-semantic visual-inertial SLAM library. In particular, we enhance Kimera-VIO, the visual-inertial odometry pipeline powering Kimera, to support better feature tracking, more efficient keyframe selection, and various input modalities (e.g., monocular, stereo, and RGB-D images, as well as wheel odometry). Additionally, Kimera-RPGO and Kimera-PGMO, Kimera's pose-graph optimization backends, are updated to support modern outlier rejection methods, specifically Graduated Non-Convexity, for improved robustness to spurious loop closures. These new features are evaluated extensively on a variety of simulated and real robotic platforms, including drones, quadrupeds, wheeled robots, and simulated self-driving cars. We present comparisons against several state-of-the-art visual-inertial SLAM pipelines and discuss strengths and weaknesses of the new release of Kimera. The newly added features have been released open-source at https://github.com/MIT-SPARK/Kimera. Comment: Presented at ISER 202
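The abstract above names Graduated Non-Convexity (GNC) as the outlier rejection method. As a rough illustration of the idea only, the sketch below applies GNC with a Geman-McClure surrogate cost to a toy scalar estimation problem rather than to a pose graph. The function name `gnc_gm_mean`, the threshold `c`, and the test data are assumptions made for this example and are not taken from Kimera.

```python
def gnc_gm_mean(measurements, c=1.0, mu_factor=1.4, max_iters=100):
    """Robustly estimate a scalar x from measurements z_i = x + noise.

    Toy illustration of GNC with a Geman-McClure surrogate cost.
    `c` is the largest residual still treated as an inlier (assumed value).
    """
    n = len(measurements)
    # Start from the non-robust least-squares solution (the plain mean).
    x = sum(measurements) / n
    r_max_sq = max((z - x) ** 2 for z in measurements)
    # A large mu makes the surrogate cost close to quadratic (convex).
    mu = max(1.0, 2.0 * r_max_sq / c ** 2)
    weights = [1.0] * n
    for _ in range(max_iters):
        # Weight update: closed form for the Geman-McClure surrogate.
        weights = [
            (mu * c ** 2 / ((z - x) ** 2 + mu * c ** 2)) ** 2
            for z in measurements
        ]
        # Variable update: weighted least squares with the new weights.
        x = sum(w * z for w, z in zip(weights, measurements)) / sum(weights)
        if mu <= 1.0:
            break  # mu = 1 recovers the original Geman-McClure cost
        # Gradually reintroduce non-convexity.
        mu = max(1.0, mu / mu_factor)
    return x, weights


# Five measurements near 1.0 plus two gross outliers.
data = [1.0, 1.1, 0.9, 1.05, 0.95, 10.0, 12.0]
estimate, final_weights = gnc_gm_mean(data, c=0.5)
```

In a pose-graph setting the weighted least-squares step would be replaced by a weighted nonlinear pose-graph solve, and the weights would be attached to loop-closure factors; that part is not shown here.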

    Integration of a minimalistic set of sensors for mapping and localization of agricultural robots

    Robots have recently become ubiquitous in many aspects of daily life. In the home there are vacuuming, mopping, and lawn-mowing robots. Swarms of robots have been used in Amazon warehouses for several years. Autonomous cars, despite being set back by several safety issues, are undeniably becoming the standard of the automobile industry. Beyond commercial applications, robots can perform tasks such as inspecting hazardous sites and taking part in search-and-rescue missions. Regardless of the end-user application, autonomy plays a crucial role in modern robots. The essential capabilities required for autonomous operation are mapping, localization, and navigation. The goal of this thesis is to develop a new approach to mapping, localization, and navigation for autonomous robots in agriculture. This type of environment poses unique challenges, such as repetitive patterns and large-scale environments with sparse features, in contrast to urban scenarios, which offer an abundance of good features such as pavements, buildings, road lanes, and traffic signs. In outdoor agricultural environments, a robot can rely on a Global Navigation Satellite System (GNSS) to determine its whereabouts, but this limits the robot's activities to areas with accessible GNSS signal and fails in indoor environments. In that case, exteroceptive sensors such as cameras (RGB, depth, thermal), laser scanners, and Light Detection and Ranging (LiDAR), and proprioceptive sensors such as an Inertial Measurement Unit (IMU) and wheel encoders, can be fused to better estimate the robot's state. Generic approaches that combine several different sensors often yield superior estimation results, but they are not always optimal in terms of cost-effectiveness, modularity, reusability, and interchangeability.
For agricultural robots, robustness in long-term operation is as important as cost-effectiveness for mass production. We tackle this challenge by exploring and selectively using a handful of sensors, such as RGB-D cameras, LiDAR, and IMU, for representative agricultural environments. The sensor fusion algorithms provide high precision and robustness for mapping and localization while remaining cost-effective by employing only the sensors necessary for the task at hand. In this thesis, we extend LiDAR mapping and localization methods designed for urban scenarios to cope with agricultural environments, where slopes, vegetation, and trees cause traditional approaches to fail. Our mapping method substantially reduces the memory footprint of the stored map, which is important for large-scale farms. We show how to handle the localization problem in dynamic, growing strawberry polytunnels by using only a stereo visual-inertial (VI) sensor and a depth sensor to extract and track invariant features. This eliminates the need for remapping to deal with dynamic scenes. To demonstrate the minimalistic requirements for autonomous agricultural robots, we also show the ability to autonomously traverse between rows in the difficult environment of a zigzag-like polytunnel using only a laser scanner. Furthermore, we present an autonomous navigation capability that uses only a camera, without explicitly performing mapping or localization. Finally, our mapping and localization methods are generic and platform-agnostic, so they can be applied to different types of agricultural robots. All contributions presented in this thesis have been tested and validated on real robots in real agricultural environments. All approaches have been published in or submitted to peer-reviewed conference papers and journal articles.

    Simultaneous Localization and Mapping (SLAM) for Autonomous Driving: Concept and Analysis

    The Simultaneous Localization and Mapping (SLAM) technique has achieved astonishing progress over the last few decades and has generated considerable interest in the autonomous driving community. With its conceptual roots in navigation and mapping, SLAM outperforms some traditional positioning and localization techniques since it can support more reliable and robust localization, planning, and control to meet some key criteria for autonomous driving. In this study the authors first give an overview of the different SLAM implementation approaches and then discuss the applications of SLAM for autonomous driving with respect to different driving scenarios, vehicle system components, and the characteristics of the SLAM approaches. The authors then discuss some challenging issues and current solutions when applying SLAM for autonomous driving. Quantitative quality-analysis methods for evaluating the characteristics and performance of SLAM systems and for monitoring the risk in SLAM estimation are reviewed. In addition, this study describes a real-world road test that demonstrates a multi-sensor-based modernized SLAM procedure for autonomous driving. The numerical results show that a high-precision 3D point cloud map can be generated by the SLAM procedure with the integration of Lidar and GNSS/INS. An online localization solution with four to five cm accuracy can be achieved based on this pre-generated map and online Lidar scan matching with a tightly fused inertial system.

    Kimera: from SLAM to Spatial Perception with 3D Dynamic Scene Graphs

    Humans are able to form a complex mental model of the environment they move in. This mental model captures geometric and semantic aspects of the scene, describes the environment at multiple levels of abstraction (e.g., objects, rooms, buildings), and includes static and dynamic entities and their relations (e.g., a person is in a room at a given time). In contrast, current robots' internal representations still provide a partial and fragmented understanding of the environment, either in the form of a sparse or dense set of geometric primitives (e.g., points, lines, planes, voxels) or as a collection of objects. This paper attempts to reduce the gap between robot and human perception by introducing a novel representation, a 3D Dynamic Scene Graph (DSG), that seamlessly captures metric and semantic aspects of a dynamic environment. A DSG is a layered graph where nodes represent spatial concepts at different levels of abstraction, and edges represent spatio-temporal relations among nodes. Our second contribution is Kimera, the first fully automatic method to build a DSG from visual-inertial data. Kimera includes state-of-the-art techniques for visual-inertial SLAM, metric-semantic 3D reconstruction, object localization, human pose and shape estimation, and scene parsing. Our third contribution is a comprehensive evaluation of Kimera in real-life datasets and photo-realistic simulations, including a newly released dataset, uHumans2, which simulates a collection of crowded indoor and outdoor scenes. Our evaluation shows that Kimera achieves state-of-the-art performance in visual-inertial SLAM, estimates an accurate 3D metric-semantic mesh model in real-time, and builds a DSG of a complex indoor environment with tens of objects and humans in minutes. Our final contribution shows how to use a DSG for real-time hierarchical semantic path-planning. The core modules in Kimera are open-source. Comment: 34 pages, 25 figures, 9 tables.
arXiv admin note: text overlap with arXiv:2002.0628
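The abstract above describes a DSG as a layered graph whose nodes are spatial concepts and whose edges are spatio-temporal relations. The sketch below is one minimal way to hold such a structure in memory. The class names, layer names, relation labels, and the example scene are all hypothetical and are not Kimera's actual data structures or API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    node_id: str
    layer: str  # hypothetical layer names, e.g. "object", "room", "building"
    attributes: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    relation: str  # hypothetical labels, e.g. "contains", "is_in"
    timestamp: Optional[float] = None  # set only for time-dependent relations


class LayeredSceneGraph:
    """Toy layered graph: nodes grouped by layer, edges carry relations."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, layer, **attributes):
        self.nodes[node_id] = Node(node_id, layer, attributes)

    def add_edge(self, source, target, relation, timestamp=None):
        # Refuse edges that reference nodes not yet in the graph.
        for node_id in (source, target):
            if node_id not in self.nodes:
                raise KeyError(f"unknown node: {node_id}")
        self.edges.append(Edge(source, target, relation, timestamp))

    def layer(self, name):
        """Return ids of all nodes in the given layer, in insertion order."""
        return [n.node_id for n in self.nodes.values() if n.layer == name]

    def targets(self, source, relation):
        """Return ids reachable from `source` via edges with `relation`."""
        return [
            e.target
            for e in self.edges
            if e.source == source and e.relation == relation
        ]


# Example scene: a building contains a room, the room contains a chair,
# and a person is in the room at time t = 12.5 s.
graph = LayeredSceneGraph()
graph.add_node("building_1", "building")
graph.add_node("room_1", "room")
graph.add_node("chair_1", "object", label="chair")
graph.add_node("person_1", "agent")
graph.add_edge("building_1", "room_1", "contains")
graph.add_edge("room_1", "chair_1", "contains")
graph.add_edge("person_1", "room_1", "is_in", timestamp=12.5)
```

Keeping the layer as a node attribute, rather than using one container per layer, makes cross-layer edges such as room-to-object containment no different from any other edge.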

    Stereo Visual Odometry and Semantics based Localization of Aerial Robots in Indoor Environments

    In this paper we propose a particle filter localization approach, based on stereo visual odometry (VO) and semantic information from indoor environments, for mini-aerial robots. The prediction stage of the particle filter is performed using the 3D pose of the aerial robot estimated by the stereo VO algorithm. This predicted 3D pose is updated using inertial as well as semantic measurements. The algorithm processes semantic measurements in two phases. First, a pre-trained deep learning (DL) based object detector is used for real-time object detection in the RGB spectrum. Second, from the corresponding 3D point clouds of the detected objects, we segment their dominant horizontal plane and estimate their relative position, also augmenting a prior map with new detections. The augmented map is then used to obtain a drift-free pose estimate of the aerial robot. We validate our approach in several real flight experiments where we compare it against ground truth and a state-of-the-art visual SLAM approach.
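The predict/update cycle described in the abstract above can be sketched in a heavily simplified form. The version below assumes a 2D position-only state, a VO increment expressed in the map frame, and a single mapped landmark observed as a relative offset. The paper itself works with 3D poses, inertial measurements, and plane segmentation, none of which appear here, and all function names and noise values are hypothetical.

```python
import math
import random


def predict(particles, vo_delta, noise_std, rng):
    """Propagate each particle by the VO increment plus Gaussian noise."""
    dx, dy = vo_delta
    return [
        (x + dx + rng.gauss(0.0, noise_std), y + dy + rng.gauss(0.0, noise_std))
        for x, y in particles
    ]


def update_weights(particles, landmark, measured_offset, meas_std):
    """Weight particles by how well they explain the landmark observation.

    `measured_offset` is the landmark position relative to the robot,
    assumed here to be expressed in map-aligned axes.
    """
    weights = []
    for x, y in particles:
        ex = (landmark[0] - x) - measured_offset[0]
        ey = (landmark[1] - y) - measured_offset[1]
        weights.append(math.exp(-(ex * ex + ey * ey) / (2.0 * meas_std ** 2)))
    total = sum(weights)
    if total == 0.0:
        # All likelihoods underflowed: fall back to uniform weights.
        return [1.0 / len(particles)] * len(particles)
    return [w / total for w in weights]


def estimate(particles, weights):
    """Weighted mean of the particle positions."""
    x = sum(w * p[0] for w, p in zip(weights, particles))
    y = sum(w * p[1] for w, p in zip(weights, particles))
    return x, y


def resample(particles, weights, rng):
    """Multinomial resampling: draw particles in proportion to weight."""
    return rng.choices(particles, weights=weights, k=len(particles))


rng = random.Random(0)
# Initial belief: 500 particles scattered around the origin.
particles = [(rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)) for _ in range(500)]
# VO reports a 1 m move along x; a mapped object at (3, 2) is then
# observed 2 m ahead and 2 m to the side of the robot.
particles = predict(particles, vo_delta=(1.0, 0.0), noise_std=0.05, rng=rng)
weights = update_weights(particles, landmark=(3.0, 2.0),
                         measured_offset=(2.0, 2.0), meas_std=0.2)
pose = estimate(particles, weights)
particles = resample(particles, weights, rng)
```

Because the landmark position comes from a map rather than from integrated motion, the update step pulls the estimate back toward a map-consistent position, which is the mechanism by which this kind of filter limits odometry drift.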

    A survey on real-time 3D scene reconstruction with SLAM methods in embedded systems

    3D reconstruction within simultaneous localization and mapping (SLAM) is an important topic for transport systems such as drones, service robots, and mobile AR/VR devices. Compared to a point cloud representation, 3D reconstruction based on meshes and voxels is particularly useful for high-level functions, like obstacle avoidance or interaction with the physical environment. This article reviews the implementation of a visual-based 3D scene reconstruction pipeline on resource-constrained hardware platforms. Real-time performance, memory management, and low power consumption are critical for embedded systems. A conventional SLAM pipeline from sensors to 3D reconstruction is described, including the potential use of deep learning. The implementation of advanced functions with limited resources is detailed. Recent systems propose the embedded implementation of 3D reconstruction methods with different granularities. The trade-off between required accuracy and resource consumption for real-time localization and reconstruction is one of the open research questions identified and discussed in this paper.

    Semantic Visual Localization

    Robust visual localization under a wide range of viewing conditions is a fundamental problem in computer vision. Handling the difficult cases of this problem is not only very challenging but also of high practical relevance, e.g., in the context of life-long localization for augmented reality or autonomous robots. In this paper, we propose a novel approach based on a joint 3D geometric and semantic understanding of the world, enabling it to succeed under conditions where previous approaches failed. Our method leverages a novel generative model for descriptor learning, trained on semantic scene completion as an auxiliary task. The resulting 3D descriptors are robust to missing observations by encoding high-level 3D geometric and semantic information. Experiments on several challenging large-scale localization datasets demonstrate reliable localization under extreme viewpoint, illumination, and geometry changes.

    Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age

    Simultaneous Localization and Mapping (SLAM) consists of the concurrent construction of a model of the environment (the map) and the estimation of the state of the robot moving within it. The SLAM community has made astonishing progress over the last 30 years, enabling large-scale real-world applications and witnessing a steady transition of this technology to industry. We survey the current state of SLAM. We start by presenting what is now the de-facto standard formulation for SLAM. We then review related work, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers. This paper simultaneously serves as a position paper and tutorial to those who are users of SLAM. By looking at the published research with a critical eye, we delineate open challenges and new research issues that still deserve careful scientific investigation. The paper also contains the authors' take on two questions that often animate discussions during robotics conferences: Do robots need SLAM? and Is SLAM solved?
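The abstract above does not spell out the "de-facto standard formulation". It is commonly written as maximum a posteriori (MAP) estimation over a factor graph, which can be summarized as below. The notation is assumed for this summary: \mathcal{X} is the set of robot and map variables, z_k is the k-th measurement, h_k is its measurement model acting on the subset \mathcal{X}_k, and \Omega_k is its information matrix. The last step assumes zero-mean Gaussian measurement noise, with the prior absorbed as the k = 0 term.

```latex
\begin{align}
\mathcal{X}^\star
  &= \arg\max_{\mathcal{X}} \; p(\mathcal{X} \mid \mathcal{Z})
   = \arg\max_{\mathcal{X}} \; p(\mathcal{X}) \prod_{k=1}^{m} p(z_k \mid \mathcal{X}_k) \\
  &= \arg\min_{\mathcal{X}} \; \sum_{k=0}^{m}
     \left\lVert h_k(\mathcal{X}_k) - z_k \right\rVert^{2}_{\Omega_k}
\end{align}
```

The resulting nonlinear least-squares problem is typically solved iteratively, for example with Gauss-Newton or Levenberg-Marquardt.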