72 research outputs found
Fast 3D Perception for Collision Avoidance and SLAM in Domestic Environments
Electronics engineering
Indoor Mapping and Reconstruction with Mobile Augmented Reality Sensor Systems
Augmented Reality (AR) makes it possible to display virtual, three-dimensional content directly within the real environment. Rather than showing arbitrary virtual objects at an arbitrary location, AR technology can also be used to display geodata in situ, at the very place the data refer to. AR thus opens up the possibility of enriching the real world with virtual, location-based information. In this thesis, this flavour of AR is defined as "Fused Reality" and discussed in depth.
The practical value offered by this concept of Fused Reality is well demonstrated by its application to digital building models, where building-specific information, for example the routing of pipes and cables inside the walls, can be displayed in its correct position on the real object. To realize the outlined concept of an indoor Fused Reality application, some fundamental conditions must be met. A given building can only be augmented with location-based information if a digital model of that building is available. While larger construction projects today are often planned and executed with the aid of Building Information Modelling (BIM), so that a digital model emerges together with the real building, digital models are usually not available for older existing buildings. Creating a digital model of an existing building manually is possible, but involves considerable effort. If a suitable building model is available, an AR device must additionally be able to determine its own position and orientation inside the building relative to this model in order to display augmentations in the correct position.
This thesis investigates and discusses various aspects of these problems. First, different ways of capturing indoor building geometry with sensor systems are discussed. Subsequently, a study is presented on the extent to which modern AR devices, which typically also feature a large number of sensors, are themselves suitable for use as indoor mapping systems. The resulting indoor mapping datasets can then be used to reconstruct building models automatically. To this end, an automated, voxel-based indoor reconstruction method is presented and evaluated quantitatively on four datasets captured for this purpose, with corresponding reference data. Furthermore, different ways of localizing mobile AR devices within a building and the corresponding building model are discussed. In this context, the evaluation of a marker-based indoor localization method is also presented. Finally, a new approach for aligning indoor mapping datasets with the axes of the coordinate system is presented.
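The core of a voxel-based reconstruction step can be pictured with a minimal sketch: discretise the aggregated indoor-mapping point cloud into an occupancy grid, from which planar structures such as walls can later be extracted. This is an illustrative simplification, not the thesis's actual pipeline; the NumPy encoding and the `voxel_size` value are assumptions.

```python
import numpy as np

def voxelize(points: np.ndarray, voxel_size: float = 0.05) -> np.ndarray:
    """Quantise an (N, 3) point cloud into the set of occupied voxel indices.

    Each point is mapped to the integer index of the voxel containing it;
    the unique indices form a coarse occupancy model of the interior.
    """
    indices = np.floor(points / voxel_size).astype(np.int64)
    return np.unique(indices, axis=0)

# Occupied voxels lying on vertical planes can then be grouped into wall
# candidates, which is the kind of structure a voxel-based indoor
# reconstruction method recovers from indoor mapping datasets.
```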
Lidar-based scene understanding for autonomous driving using deep learning
With over 1.35 million fatalities related to traffic accidents worldwide, autonomous driving was foreseen at the beginning of this century as a feasible solution to improve safety on our roads. Moreover, it is set to disrupt our transportation paradigm, reducing congestion, pollution, and costs, while increasing the accessibility, efficiency, and reliability of transportation for both people and goods. Although some advances have gradually been transferred into commercial vehicles in the form of Advanced Driving Assistance Systems (ADAS), such as adaptive cruise control, blind spot detection or automatic parking, the technology is far from mature. A full understanding of the scene is needed so that vehicles can be aware of their surroundings, knowing the existing elements of the scene as well as their motion, intentions and interactions.
In this PhD dissertation, we explore new approaches for understanding driving scenes from 3D LiDAR point clouds using Deep Learning methods. To this end, in Part I we analyze the scene from a static perspective, using independent frames to detect neighboring vehicles. Next, in Part II we develop new ways of understanding the dynamics of the scene. Finally, in Part III we apply all the developed methods to tackle higher-level challenges, such as segmenting moving obstacles while obtaining their rigid motion vector over the ground.
More specifically, in Chapter 2 we develop a 3D vehicle detection pipeline based on a multi-branch deep-learning architecture and propose a Front (FR-V) and a Bird's Eye view (BE-V) as 2D representations of the 3D point cloud to serve as input for training our models. Later on, in Chapter 3 we apply and further test this method on two real use cases: pre-filtering moving obstacles while creating maps, to better localize ourselves on subsequent days, and vehicle tracking. From the dynamic perspective, in Chapter 4 we learn from the 3D point cloud a novel dynamic feature that resembles optical flow from RGB images. For that, we develop a new approach that leverages RGB optical flow as pseudo ground truth for training purposes while allowing the use of only 3D LiDAR data at inference time. Additionally, in Chapter 5 we explore the benefits of combining classification and regression learning problems to face the optical flow estimation task in a joint coarse-and-fine manner. Lastly, in Chapter 6 we gather the previous methods and demonstrate that these independent tasks can guide the learning of more challenging problems, such as segmentation and motion estimation of moving vehicles from our own moving perspective.
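As a rough illustration of the 2D representations mentioned for Chapter 2, a bird's-eye-view grid can be built by discretising the ground plane and aggregating per-cell statistics. This is a generic sketch, not the thesis's exact encoding; the ranges, cell size, and the three channels (max height, mean intensity, density) are assumptions.

```python
import numpy as np

def birds_eye_view(points, x_range=(0.0, 70.0), y_range=(-35.0, 35.0), cell=0.1):
    """Rasterise an (N, 4) LiDAR cloud (x, y, z, intensity) into a BEV grid."""
    m = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
         (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[m]
    cols = ((pts[:, 0] - x_range[0]) / cell).astype(np.int64)
    rows = ((pts[:, 1] - y_range[0]) / cell).astype(np.int64)
    h = int(round((y_range[1] - y_range[0]) / cell))
    w = int(round((x_range[1] - x_range[0]) / cell))
    bev = np.zeros((h, w, 3), dtype=np.float32)
    # Channel 0: max height per cell (zero init clips below-ground points,
    # a simplification of this sketch).
    np.maximum.at(bev[:, :, 0], (rows, cols), pts[:, 2])
    np.add.at(bev[:, :, 1], (rows, cols), pts[:, 3])  # summed intensity
    np.add.at(bev[:, :, 2], (rows, cols), 1.0)        # point count (density)
    occupied = bev[:, :, 2] > 0
    bev[:, :, 1][occupied] /= bev[:, :, 2][occupied]  # mean intensity
    return bev
```

The resulting HxWx3 tensor can be fed to a 2D convolutional backbone, which is the general appeal of BEV encodings over raw point clouds.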
Development and testing of docking functions in industrial settings for an autonomous mobile robot based on ROS2
This dissertation is the result of a six-month internship at G.D S.p.A. for the preparation of the thesis project. The final goal is to develop algorithms on the ROS2 framework that can be used to control an Autonomous Mobile Robot (AMR) during the detection and high-precision approach of a docking station, needed to recharge the AMR itself or to perform operations on the host machines. Automating these operations substantially increases safety and productivity within a warehouse or on host machine lines, since it permits the AMR to work without requiring an operator for longer periods, or even to replace the operator entirely.
The presented method uses both lidars and an onboard camera. The trajectory from the starting position to the approximate area of the docking station is computed using data obtained from the three lidars around the AMR body. The final approach is implemented by detecting an ArUco marker positioned on the dock assembly through a camera. A sequence of intermediate positions is defined according to the pose estimations, and then reached with a mix of standard navigation and proportional position control in the very last part of the trajectory.
The docking position turned out to be accurate to within one centimeter of the desired target, and the orientation error is a fraction of a degree. Docking times vary with the AMR's distance from the docking station, but the last phase of the procedure is always completed in around seventeen seconds. The solution is implementable and will be evaluated on the real platform in the coming months.
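The final-approach step described above can be sketched with standard tools: OpenCV's ArUco module for marker pose estimation plus a simple proportional law on the remaining pose error. The marker size, dictionary, and gains below are illustrative assumptions; this is not the thesis's actual ROS2 implementation.

```python
import cv2
import numpy as np

MARKER_SIZE = 0.10  # marker side length in metres (assumed)

# Marker corners in its own frame, in the order SOLVEPNP_IPPE_SQUARE expects:
# top-left, top-right, bottom-right, bottom-left, on the z = 0 plane.
OBJ_POINTS = np.array([
    [-MARKER_SIZE / 2,  MARKER_SIZE / 2, 0.0],
    [ MARKER_SIZE / 2,  MARKER_SIZE / 2, 0.0],
    [ MARKER_SIZE / 2, -MARKER_SIZE / 2, 0.0],
    [-MARKER_SIZE / 2, -MARKER_SIZE / 2, 0.0],
], dtype=np.float32)

DETECTOR = cv2.aruco.ArucoDetector(
    cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50))

def marker_pose(gray, camera_matrix, dist_coeffs):
    """Return (rvec, tvec) of the first detected marker in the camera frame."""
    corners, ids, _ = DETECTOR.detectMarkers(gray)
    if ids is None:
        return None
    ok, rvec, tvec = cv2.solvePnP(
        OBJ_POINTS, corners[0][0], camera_matrix, dist_coeffs,
        flags=cv2.SOLVEPNP_IPPE_SQUARE)
    return (rvec, tvec) if ok else None

def p_control(err_x, err_y, err_yaw, kp_lin=0.5, kp_ang=1.0):
    """Proportional velocity command driving the remaining pose error to zero."""
    return kp_lin * err_x, kp_lin * err_y, kp_ang * err_yaw
```

In a setup like this the pose error would be recomputed every camera frame and the proportional command published until the error falls below the docking tolerance.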
An Information Theoretic Framework for Camera and Lidar Sensor Data Fusion and its Applications in Autonomous Navigation of Vehicles.
This thesis develops an information theoretic framework for multi-modal sensor data fusion for robust autonomous navigation of vehicles. In particular, we focus on the registration of 3D lidar and camera data, which are commonly used perception sensors in mobile robotics. This thesis presents a framework that allows the fusion of the two modalities and uses this fused information to enhance state-of-the-art registration algorithms used in robotics applications. It is important to note that the time-aligned discrete signals (3D points and their reflectivity from lidar, and pixel location and color from camera) are generated by sampling the same physical scene, but in a different manner. Thus, although these signals look quite different at a high level (a 2D image from a camera looks entirely different from a 3D point cloud of the same scene from a lidar), since they are generated from the same physical scene, they are statistically dependent upon each other at the signal level. This thesis exploits this statistical dependence in an information theoretic framework to solve some of the common problems encountered in autonomous navigation tasks, such as sensor calibration, scan registration and place recognition. In a general sense, we consider these perception sensors as a source of information (i.e., sensor data), and the statistical dependence of this information (obtained from different modalities) is used to solve problems related to multi-modal sensor data registration.
PhD, Electrical Engineering: Systems, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/107286/1/pgaurav_1.pd
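The statistical-dependence idea can be made concrete with a minimal mutual-information estimator over paired samples, for instance lidar reflectivity versus the image intensity at each projected lidar point; a calibration search would then prefer the lidar-to-camera transform that maximizes this score. The histogram estimator below is a generic sketch, not the thesis's implementation; the bin count is an assumption.

```python
import numpy as np

def mutual_information(x: np.ndarray, y: np.ndarray, bins: int = 32) -> float:
    """Histogram estimate of I(X; Y) in nats for paired 1D samples.

    In an MI-based lidar-camera alignment, x could be the reflectivity of
    each projected lidar point and y the grayscale intensity of the pixel
    it lands on under a candidate extrinsic transform.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of X
    py = pxy.sum(axis=0, keepdims=True)   # marginal of Y
    nz = pxy > 0                          # avoid log(0) terms
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))
```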
Detection and Localisation Using Light
Visible light communication (VLC) systems have become promising candidates to complement conventional radio frequency (RF) systems due to the increasingly saturated RF spectrum and the potentially high data rates that can be achieved by VLC systems. Furthermore, people detection and counting in an indoor environment has become an emerging and attractive area in the past decade. Many techniques and systems have been developed for counting in public places such as subways, bus stations and supermarkets. The outcome of these techniques can be used for public security, resource allocation and marketing decisions.
This thesis presents the first indoor light-based detection and localisation system that builds on concepts from radio detection and ranging (radar), making use of the expected growth in the use and adoption of visible light communication (VLC), which can provide the infrastructure for our light detection and localisation (LiDAL) system. Our system enables active detection, counting and localisation of people, in addition to being fully compatible with existing VLC systems. In order to detect humans (targets), LiDAL uses the visible light spectrum. It sends pulses using a VLC transmitter and analyses the reflected signal collected by an optical receiver. Although we examine the use of the visible spectrum here, LiDAL can also be used in the infrared spectrum and other parts of the light spectrum.
We introduce LiDAL with different transmitter-receiver configurations and optimum detectors, considering the fluctuation of the received reflected signal from the target in the presence of Gaussian noise. We design an efficient multiple input multiple output (MIMO) LiDAL system with a wide field of view (FOV) single-photodetector receiver, and also design a multiple input single output (MISO) LiDAL system with an imaging receiver to eliminate ambiguity in target detection and localisation.
We develop models for the human body and its reflections and consider the impact of the colour and texture of clothing, as well as the impact of target mobility. A number of detection and localisation methods are developed for our LiDAL system, including cross correlation, a background subtraction method and a background estimation method. These methods are considered to distinguish a mobile target from the ambient reflections due to background obstacles (furniture) in a realistic indoor environment.
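The cross-correlation detector mentioned above behaves like a matched filter: correlate the received photodetector signal with the transmitted pulse template and read the range off the peak lag. A minimal sketch, assuming a sampled baseband signal; thresholding and the background subtraction/estimation steps are omitted.

```python
import numpy as np

C = 3.0e8  # speed of light in m/s

def range_from_echo(received: np.ndarray, pulse: np.ndarray, fs: float) -> float:
    """Matched-filter range estimate from one transmitted pulse.

    `received` is the sampled receiver signal, `pulse` the transmitted
    template, `fs` the sampling rate in Hz. The correlation peak gives the
    round-trip delay; halving it converts to one-way distance.
    """
    corr = np.correlate(received, pulse, mode="valid")
    lag = int(np.argmax(corr))   # sample offset of the strongest echo
    delay = lag / fs             # round-trip time in seconds
    return C * delay / 2.0
```

In practice, a reference capture of the empty room would first be subtracted (the background subtraction method above) so that static furniture reflections do not dominate the correlation peak.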
Mobile Robots Navigation
Mobile robots navigation includes different interrelated activities: (i) perception, as obtaining and interpreting sensory information; (ii) exploration, as the strategy that guides the robot to select the next direction to go; (iii) mapping, involving the construction of a spatial representation using the sensory information perceived; (iv) localization, as the strategy to estimate the robot's position within the spatial map; (v) path planning, as the strategy to find a path towards a goal location, optimal or not; and (vi) path execution, where motor actions are determined and adapted to environmental changes. The book addresses these activities by integrating results from the research work of several authors around the world. Research cases are documented in 32 chapters, organized into the 7 categories described next.
Learning cognitive maps: Finding useful structure in an uncertain world
In this chapter we describe the central mechanisms that influence how people learn about large-scale space. We focus particularly on how these mechanisms enable people to cope effectively both with the uncertainty inherent in a constantly changing world and with the high information content of natural environments. The major lessons are that humans get by with a "less is more" approach to building structure, and that they are able to adapt quickly to environmental changes thanks to a range of general-purpose mechanisms. By looking at abstract principles, instead of concrete implementation details, it is shown that the study of human learning can provide valuable lessons for robotics. Finally, these issues are discussed in the context of an implementation on a mobile robot. © 2007 Springer-Verlag Berlin Heidelberg
Smart clothing and furniture for supporting participation: co-creation concepts for daily living
Participation and social inclusion influence individuals' health and well-being. These factors can be easily disturbed, especially for those with disabilities. Designers and engineers have tried harnessing technology to assist people by producing prototypes of assistive devices, such as smart clothing and furniture. This study approaches that user interface and inspects users' needs for participation through clothing and furniture. We thus arranged two similar workshops with student participants (n = 37) from four different educational units, creating 10 innovative concepts to support participation and social inclusion. All aimed to support participation via improved self-regulation, increased safety, or environmental control. Most of the concepts were connectible to another device, such as a mobile phone. All devices were made adjustable to meet personal preferences. This study aligns with previous ones by concluding that assistive technology should be unobtrusive, give timely responses, and interact with other devices. These initial concepts are ready to be turned into tangible prototypes.
Article highlights: Participation and social inclusion have remarkable meaning for an individual's well-being and health. Commonly, assistive technology aims to solve challenges in daily living by promoting health and well-being. For this reason, we arranged two similar co-creation workshops and asked the participants to innovate smart clothing and furniture concepts that will promote greater participation and more social inclusion. This study also identified users' needs, such as increased safety and independence, supported communication, self-regulation and awareness, and an effective learning tool. The majority of the concepts were designed to be adjustable to meet personal preferences, let individuals interact with other devices (such as a mobile phone), and give timely responses.