160 research outputs found

    Perception of Unstructured Environments for Autonomous Off-Road Vehicles

    Get PDF
    Autonomous vehicles require perception as a necessary prerequisite for controllable and safe interaction, in order to sense and understand their environment. Perception for structured indoor and outdoor environments covers economically lucrative areas such as autonomous passenger transport and industrial robotics, while the perception of unstructured environments is strongly underrepresented in environment-perception research. The unstructured environments analyzed here pose a particular challenge, since their natural, grown geometries rarely exhibit a homogeneous structure and are dominated by similar textures and objects that are hard to separate. This complicates both sensing these environments and interpreting them, so perception methods must be designed and optimized specifically for this application domain. This dissertation proposes novel and optimized perception methods for unstructured environments and combines them in a holistic, three-level pipeline for autonomous off-road vehicles: low-level, mid-level, and high-level perception. The proposed classical and machine learning (ML) perception methods complement each other. Moreover, combining perception and validation methods at each level enables reliable perception of the potentially unknown environment, where loosely and tightly coupled validation methods are combined to ensure a sufficient yet flexible assessment of the proposed perception methods. All methods were developed as individual modules within the perception and validation pipeline proposed in this work, and their flexible combination enables different pipeline designs for a wide range of off-road vehicles and use cases as needed.
Low-level perception provides a tightly coupled confidence assessment for raw 2D and 3D sensor data in order to detect sensor failures and ensure sufficient sensor data accuracy. In addition, novel calibration and registration approaches for multi-sensor perception systems are presented, which use only the structure of the environment to register the captured sensor data: a semi-automatic approach for registering multiple 3D Light Detection and Ranging (LiDAR) sensors, and a confidence-based framework that combines different registration methods and enables the registration of sensors with different measurement principles. Combining multiple registration methods validates the registration results in a tightly coupled manner. Mid-level perception enables the 3D reconstruction of unstructured environments with two methods for estimating disparity from stereo images: a classical, correlation-based method for hyperspectral images, which requires only a limited amount of test and validation data, and a second method that estimates disparity from grayscale images using convolutional neural networks (CNNs). Novel disparity error metrics and an evaluation toolbox for 3D reconstruction from stereo images complement the proposed disparity estimation methods and enable their loosely coupled validation. High-level perception focuses on interpreting single 3D point clouds for traversability analysis, object detection, and obstacle avoidance. A domain-transfer analysis of state-of-the-art methods for semantic 3D segmentation yields recommendations for segmenting new target domains as accurately as possible without generating new training data.
The proposed training approach for CNN-based 3D segmentation methods can further reduce the amount of training data required. Explainable-AI methods applied before and after modeling enable a loosely coupled validation of the proposed high-level methods through dataset assessment and model-agnostic explanations of CNN predictions. Contaminated-site remediation and military logistics are the two main use cases in unstructured environments addressed in this work. These application scenarios also show how to close the gap between developing individual methods and integrating them into the processing chain of autonomous off-road vehicles, with localization, mapping, planning, and control. In summary, the proposed pipeline offers flexible perception solutions for autonomous off-road vehicles, and the accompanying validation ensures accurate and trustworthy perception of unstructured environments.
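The modular three-level design described above can be sketched, purely illustratively, as a chain of perception stages, each paired with a simple validation check. All names, data shapes, and thresholds here are hypothetical and not taken from the dissertation.

```python
# Illustrative sketch of a three-level perception pipeline with
# per-level validation; every identifier and threshold is invented.

def low_level(raw_points):
    # e.g., tightly coupled confidence scoring of raw sensor data
    data = {"points": raw_points, "confidence": 0.9}
    assert data["confidence"] > 0.5, "sensor failure suspected"
    return data

def mid_level(data):
    # e.g., disparity estimation / 3D reconstruction stage
    return {"cloud": data["points"]}

def high_level(scene):
    # e.g., traversability analysis on the reconstructed point cloud
    return {"traversable": len(scene["cloud"]) > 0}

def pipeline(raw_points):
    return high_level(mid_level(low_level(raw_points)))

result = pipeline([(0.0, 0.0, 0.0), (1.0, 2.0, 0.1)])
```

The point of the sketch is only that each level consumes the previous level's output and can reject it, so modules can be recombined per vehicle and use case.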

    Safe navigation and human-robot interaction in assistant robotic applications

    Get PDF
    The abstract is provided in the attachment.

    Towards Efficient 3D Reconstructions from High-Resolution Satellite Imagery

    Get PDF
    Recent years have witnessed the rapid growth of commercial satellite imagery. Compared with other imaging products, such as aerial or street-view imagery, modern satellite images are captured at high resolution and with multiple spectral bands, thus providing unique viewing angles, global coverage, and frequent updates of the Earth's surface. With automated processing and intelligent analysis algorithms, satellite images can enable global-scale 3D modeling applications. This dissertation explores computer vision algorithms to reconstruct 3D models from satellite images at different levels: geometric, semantic, and parametric reconstructions. However, reconstructing satellite imagery is particularly challenging for the following reasons: 1) Satellite images typically contain an enormous number of raw pixels, so efficient algorithms are needed to minimize the substantial computational burden. 2) The ground sampling distances of satellite images are comparatively large, so visual entities such as buildings appear small and cluttered, posing difficulties for 3D modeling. 3) Satellite images usually have complex camera models and inaccurate vendor-provided camera calibrations; rational polynomial coefficients (RPC) camera models, although widely used, need to be handled appropriately to ensure high-quality reconstructions. To obtain geometric reconstructions efficiently, we propose an edge-aware interpolation-based algorithm to obtain 3D point clouds from satellite image pairs. Initial 2D pixel matches are first established and triangulated to compensate for the RPC calibration errors. Noisy dense correspondences can then be estimated by interpolating the inlier matches in an edge-aware manner. After refining the correspondence map with a fast bilateral solver, we can obtain dense 3D point clouds via triangulation. Pixel-wise semantic classification results for satellite images are usually noisy because spatial neighborhood information is neglected.
Thus, we propose to aggregate multiple corresponding observations of the same 3D point to obtain high-quality semantic models. Instead of merely leveraging geometric reconstructions to provide such correspondences, we formulate geometric modeling and semantic reasoning in a joint Markov Random Field (MRF) model. Our experiments show that both tasks benefit from the joint inference. Finally, we propose a novel deep-learning-based approach to perform single-view parametric reconstructions from satellite imagery. By parametrizing buildings as 3D cuboids, our method simultaneously localizes building instances visible in the image and estimates their corresponding cuboid models. Aerial LiDAR and vectorized GIS maps are utilized as supervision. Our network upsamples CNN features to detect small but cluttered building instances. In addition, we estimate building contours through a separate fully convolutional network to avoid overlapping building cuboids.
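The simplest form of the aggregation idea above is a majority vote over the per-view labels observed for each 3D point. The thesis goes further and couples this with geometry in a joint MRF, which the following minimal sketch (with invented labels) does not model.

```python
from collections import Counter

def aggregate_labels(observations):
    """observations: dict point_id -> list of class labels seen across views.

    Returns the majority label per 3D point, suppressing per-view noise.
    """
    return {pid: Counter(labels).most_common(1)[0][0]
            for pid, labels in observations.items()}

obs = {
    0: ["building", "building", "road"],  # one noisy single-view label
    1: ["tree", "tree", "tree"],
}
fused = aggregate_labels(obs)  # -> {0: "building", 1: "tree"}
```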

    Toward deep monocular view generation and omnidirectional depth estimation

    Get PDF
    This thesis proposes new strategies for obtaining environmental depth representations from monocular perspective and omnidirectional vision. This research is inspired by the necessity for mobile autonomous systems to sense their surroundings, which frequently abound in data vital for planning, decision-making, and action. The methodologies presented here are primarily data-driven and based on machine learning, specifically deep learning. Our first contribution is the generation of top-down, “bird’s eye view” representations of detected vehicles in a scene, achieved using only monocular, perspective-view images. The novelty lies in an adversarial training scheme, which our experiments showed yields more robust models than a strictly supervised baseline. Our second contribution is a novel method for adapting view-synthesis-based depth estimation models to omnidirectional imagery. Our proposal comprises three important facets. Firstly, a "virtual" spherical camera model is integrated into the training pipeline to facilitate model training. Secondly, we explicitly encode the spherical nature of the image format by adopting spherical convolutional layers for the convolution operations, thereby compensating for the significant distortion. Thirdly, we propose an optical-flow-based masking strategy to reduce the impact of undesired pixels during training, such as those originating from large, challenging regions of the image like the sky. Our qualitative and quantitative findings indicate that these additions improve depth estimation over earlier methods. Our final contribution, broadly, is a method for incorporating LiDAR information into the training pipeline of an omnidirectional depth estimation model. We introduce a Bayesian-optimisation-based extrinsic calibration method to match LiDAR returns with equirectangular images.
We then weight the incorporation of this data via a frequency-based scheme dependent on the number of detected LiDAR projections. The results show a tangible quantitative benefit from these additions.
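Matching LiDAR returns with equirectangular pixels, as described above, requires projecting a 3D point into the panoramic image. A minimal sketch of that projection follows; the axis convention and image size are assumptions for illustration, and the thesis's Bayesian-optimisation calibration step is not shown.

```python
import math

def project_equirect(x, y, z, width, height):
    """Project a LiDAR return (x, y, z) onto an equirectangular image.

    Assumes z points up and azimuth is measured around the vertical axis.
    """
    r = math.sqrt(x * x + y * y + z * z)
    azimuth = math.atan2(y, x)        # in [-pi, pi]
    elevation = math.asin(z / r)      # in [-pi/2, pi/2]
    u = (azimuth / (2.0 * math.pi) + 0.5) * width
    v = (0.5 - elevation / math.pi) * height
    return u, v

# A point straight ahead lands at the image centre.
u, v = project_equirect(1.0, 0.0, 0.0, width=2048, height=1024)
```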

    Augmented Reality and Artificial Intelligence in Image-Guided and Robot-Assisted Interventions

    Get PDF
    In minimally invasive orthopedic procedures, the surgeon places wires, screws, and surgical implants through the muscles and bony structures under image guidance. These interventions require alignment of the pre- and intra-operative patient data, the intra-operative scanner, surgical instruments, and the patient. Suboptimal interaction with patient data and challenges in mastering 3D anatomy from ill-posed 2D interventional images are essential concerns in image-guided therapies. State-of-the-art approaches often support the surgeon with external navigation systems or ill-conditioned image-based registration methods, both of which have certain drawbacks. Augmented reality (AR) has been introduced into operating rooms in the last decade; however, in image-guided interventions it has often been considered merely a visualization device improving traditional workflows. Consequently, the technology has gained only a minimum of the maturity it requires to redefine procedures, user interfaces, and interactions. This dissertation investigates the applications of AR, artificial intelligence, and robotics in interventional medicine. Our solutions were applied to a broad spectrum of problems and tasks, namely improving imaging and acquisition, image computing and analytics for registration and image understanding, and enhancing interventional visualization. The benefits of these approaches were also demonstrated in robot-assisted interventions. We revealed how exemplary workflows can be redefined via AR by taking full advantage of head-mounted displays that are entirely co-registered with the imaging systems and the environment at all times. The proposed AR landscape is enabled by co-localizing the users and the imaging devices via the operating room environment and exploiting all involved frustums to move spatial information between different bodies.
The system's awareness of the geometric and physical characteristics of X-ray imaging allows the exploration of different human-machine interfaces. We also leveraged the principles governing image formation and combined them with deep learning and RGBD sensing to fuse images and reconstruct interventional data. We hope that our holistic approaches to improving the interface of surgery and enhancing the usability of interventional imaging not only augment the surgeon's capabilities but also improve the surgical team's experience in carrying out an effective intervention with reduced complications.

    Kimera: from SLAM to Spatial Perception with 3D Dynamic Scene Graphs

    Full text link
    Humans are able to form a complex mental model of the environment they move in. This mental model captures geometric and semantic aspects of the scene, describes the environment at multiple levels of abstraction (e.g., objects, rooms, buildings), and includes static and dynamic entities and their relations (e.g., a person is in a room at a given time). In contrast, current robots' internal representations still provide a partial and fragmented understanding of the environment, either in the form of a sparse or dense set of geometric primitives (e.g., points, lines, planes, voxels) or as a collection of objects. This paper attempts to reduce the gap between robot and human perception by introducing a novel representation, a 3D Dynamic Scene Graph (DSG), that seamlessly captures metric and semantic aspects of a dynamic environment. A DSG is a layered graph where nodes represent spatial concepts at different levels of abstraction, and edges represent spatio-temporal relations among nodes. Our second contribution is Kimera, the first fully automatic method to build a DSG from visual-inertial data. Kimera includes state-of-the-art techniques for visual-inertial SLAM, metric-semantic 3D reconstruction, object localization, human pose and shape estimation, and scene parsing. Our third contribution is a comprehensive evaluation of Kimera in real-life datasets and photo-realistic simulations, including a newly released dataset, uHumans2, which simulates a collection of crowded indoor and outdoor scenes. Our evaluation shows that Kimera achieves state-of-the-art performance in visual-inertial SLAM, estimates an accurate 3D metric-semantic mesh model in real time, and builds a DSG of a complex indoor environment with tens of objects and humans in minutes. Our final contribution shows how to use a DSG for real-time hierarchical semantic path planning. The core modules in Kimera are open-source.
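The layered-graph idea behind a DSG can be illustrated with a toy data structure: nodes carry a layer (e.g., object, room, agent) and attributes, and edges encode spatio-temporal relations. The layer names and fields below are assumptions for illustration, not Kimera's actual API.

```python
# Toy layered scene graph; purely illustrative, not the Kimera interface.
class SceneGraph:
    def __init__(self):
        self.nodes = {}   # node id -> (layer, attributes)
        self.edges = []   # (source id, target id, relation)

    def add_node(self, nid, layer, **attrs):
        self.nodes[nid] = (layer, attrs)

    def add_edge(self, src, dst, relation):
        self.edges.append((src, dst, relation))

    def layer(self, name):
        """All node ids on a given abstraction layer."""
        return [nid for nid, (l, _) in self.nodes.items() if l == name]

g = SceneGraph()
g.add_node("room_1", "room")
g.add_node("chair_1", "object", position=(1.0, 2.0, 0.0))
g.add_node("person_1", "agent", time=12.5)        # dynamic entity
g.add_edge("chair_1", "room_1", "contained_in")   # spatial relation
g.add_edge("person_1", "room_1", "in_room_at_t")  # spatio-temporal relation
```

Hierarchical queries (e.g., "which objects are in this room?") then reduce to walking edges between layers, which is what makes DSGs useful for hierarchical planning.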

    Advances in Multi-User Scheduling and Turbo Equalization for Wireless MIMO Systems

    Get PDF
    After an introduction, Part 2 of this thesis deals with downlink multi-user scheduling for wireless MIMO systems with one transmitting station performing channel-adaptive precoding: different user subsets can be served in each time or frequency resource by separating them in space with different antenna weight vectors. Users with correlated channel matrices should not be served jointly, since correlation impairs their spatial separability. The resulting sum rate for each user subset depends on the precoding weights, which in turn depend on the user subset. This thesis decouples the problem by proposing a scheduling metric based on the rate achieved with ZF precoding such as block diagonalization (BD), written with the help of orthogonal projection matrices. It allows estimating rates without computing any antenna weights by using a repeated projection approximation. This rate estimate can take user rate requirements and fairness criteria into account and works with either instantaneous or long-term averaged channel knowledge. Search algorithms are presented that efficiently solve user grouping or selection problems jointly for the entire system bandwidth, while being able to track the solution in time and frequency to reduce complexity. Part 3 shows how multiple transmitting stations can benefit from cooperative scheduling or joint signal processing. An orthogonal-projection-based estimate of the inter-site interference power, again computed without any antenna weights, and a virtual-user concept extend the scheduling approach to cooperative base stations and finally to SDMA half-duplex relays. The required signalling overhead is briefly discussed, along with a method to estimate the sum rate of a system without coordination. Part 4 develops optimizations for turbo equalizers, which exploit correlation between user signals as a source of redundancy. Nevertheless, a combination with transmit precoding, which aims at reducing correlation, can be beneficial when the channel knowledge at the transmitter contains a realistic error, leaving residual interference. With the help of EXIT charts, a novel method for the adaptive re-use of a-priori information between iterations is developed to improve convergence, and it is shown how semi-blind channel-estimation updates can be modeled in an EXIT chart. Computer simulations based on 4G system parameters verify all methods using realistic channel models. Available in bookstores: Advances in Multi-User Scheduling and Turbo Equalization for Wireless MIMO Systems / Fuchs-Lautensack, Martin. Ilmenau: ISLE, 2009, 116 pp. ISBN 978-3-938843-43-
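The projection idea above can be made concrete: under zero-forcing, the useful gain of user k is approximated by the component of its channel orthogonal to the span of the other scheduled users' channels, so no precoding weights need to be computed during scheduling. The sketch below is a simplified single-antenna-per-user illustration with invented power and noise values, not the thesis's exact metric.

```python
import numpy as np

def zf_rate_estimate(H, k, snr=10.0):
    """Projection-based rate estimate for user k under ZF precoding.

    H: (num_users, num_tx_antennas) array of channel row vectors.
    The gain of user k is the squared norm of its channel projected
    onto the orthogonal complement of the other users' row span.
    """
    others = np.delete(H, k, axis=0)
    if others.shape[0] == 0:
        gain = np.linalg.norm(H[k]) ** 2
    else:
        Q, _ = np.linalg.qr(others.T)              # columns span the others' channels
        h_perp = H[k] - Q @ (Q.conj().T @ H[k])    # orthogonal component
        gain = np.linalg.norm(h_perp) ** 2
    return np.log2(1.0 + snr * gain)

H = np.array([[1.0, 0.0],
              [0.0, 1.0]])        # two orthogonal users: no ZF penalty
rate = zf_rate_estimate(H, 0)
```

Correlated channel rows shrink `h_perp` and hence the estimated rate, which is exactly why correlated users should not be scheduled together.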

    Vision-Based Autonomous Robotic Floor Cleaning in Domestic Environments

    Get PDF
    Fleer DR. Vision-Based Autonomous Robotic Floor Cleaning in Domestic Environments. Bielefeld: UniversitÀt Bielefeld; 2018
    • 

    corecore