101 research outputs found

A robust and fast method for 6DoF motion estimation from generalized 3D data

Author: Cazorla Miguel
Viejo Hernando Diego
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Nowadays, there is an increasing number of robotic applications that need to act in real three-dimensional (3D) scenarios. In this paper we present a new 3D registration method oriented to mobile robotics that improves on previous Iterative Closest Point (ICP) based solutions in both speed and accuracy. As an initial step, we apply a computationally inexpensive method to obtain descriptions of the planar surfaces in 3D scenes. Then, from these descriptions, we apply a force system to compute a six-degrees-of-freedom (6DoF) egomotion accurately and efficiently. We describe the basis of our approach and demonstrate its validity with several experiments using different kinds of 3D sensors and different real 3D environments. This work has been supported by project DPI2009-07144 from Ministerio de Educación y Ciencia (Spain) and GRE10-35 from Universidad de Alicante (Spain).
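For orientation, the baseline that this abstract compares against, Iterative Closest Point registration, can be sketched as follows. This is not the authors' plane-and-force-system method, which the abstract does not specify in detail. It is a minimal 2D simplification (the paper works in 3D with six degrees of freedom), and all function names are hypothetical.

```python
import math

def nearest(point, candidates):
    """Return the candidate closest to point (brute-force search)."""
    return min(candidates,
               key=lambda c: (c[0] - point[0]) ** 2 + (c[1] - point[1]) ** 2)

def align_2d(src, dst):
    """Closed-form least-squares rotation and translation for paired 2D points."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy      # centred source point
        bx, by = qx - dx, qy - dy      # centred target point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation maps the rotated source centroid onto the target centroid.
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)

def transform(theta, tx, ty, pts):
    """Rotate pts by theta, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in pts]

def icp_2d(src, dst, iterations=20):
    """Alternate nearest-neighbour matching and closed-form alignment."""
    current = list(src)
    for _ in range(iterations):
        matches = [nearest(p, dst) for p in current]
        theta, tx, ty = align_2d(current, matches)
        current = transform(theta, tx, ty, current)
    return current
```

The brute-force matching step costs time proportional to the product of the two cloud sizes on every iteration. That cost is one reason a method working on a small set of planar-patch descriptions, as the abstract proposes, could be faster than point-level ICP.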

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

FLAT2D: Fast localization from approximate transformation into 2D

Author: Goeddel Robert
Kershaw Carl
Olson Edwin
Serafin Jacopo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

Many autonomous vehicles require precise localization within a prior map in order to support planning and to leverage semantic information within those maps (e.g., that the right lane is a turn-only lane). A popular approach in automotive systems is to localize using infrared intensity maps of the ground surface, which makes these systems susceptible to failures when the surface is obscured by snow or when the road is repainted. An emerging alternative is to localize based on the 3D structure around the vehicle; these methods are robust to these types of changes, but the maps are costly both in terms of storage and the computational cost of matching. In this paper, we propose a fast method for localizing based on 3D structure around the vehicle using a 2D representation. This representation retains many of the advantages of "full" matching in 3D, but comes with dramatically lower space and computational requirements. We also introduce a variation of Graph-SLAM tailored to support localization, allowing us to make use of graph-based error-recovery techniques in our localization estimate. Finally, we present real-world localization results for both an indoor mobile robotic platform and an autonomous golf cart, demonstrating that autonomous vehicles do not need full 3D matching to accurately localize in the environment.
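The abstract does not say how the 2D representation is built. One plausible reading, sketched below purely as an assumption, is to bin 3D points into a ground-plane grid and keep only the cells whose points span a large height range, so that walls and poles survive while the ground surface does not. The function name, cell size, and height threshold are all hypothetical.

```python
def flatten_to_2d(points, cell_size=0.5, min_height_span=1.0):
    """Return the set of (i, j) grid cells that contain tall vertical structure.

    points: iterable of (x, y, z) tuples in metres.
    A cell is kept when the z-range of its points is at least min_height_span.
    """
    z_range = {}
    for x, y, z in points:
        cell = (int(x // cell_size), int(y // cell_size))
        lo, hi = z_range.get(cell, (z, z))
        z_range[cell] = (min(lo, z), max(hi, z))
    return {cell for cell, (lo, hi) in z_range.items()
            if hi - lo >= min_height_span}
```

A set of occupied cells needs far less storage than the full point cloud and can be matched with 2D scan-matching techniques. This is consistent with the space and computation savings the abstract claims, although the paper's actual representation may differ.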

Crossref

Archivio della ricerca- Università di Roma La Sapienza

Visual perception for the 3D recognition of geometric pieces in robotic manipulation

Author: Gil Pablo
Mateo Agulló Carlos
Torres Fernando
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

During grasping and intelligent robotic manipulation tasks, the camera position relative to the scene changes dramatically because the camera is mounted on the robot's end effector and the robot moves to adapt its path and grasp objects correctly. For this reason, in this type of environment, a visual recognition system must be implemented to recognize objects and obtain their positions in the scene “automatically and autonomously”. Furthermore, in industrial environments, all objects that are manipulated by robots are made of the same material and cannot be differentiated by features such as texture or color. In this work, first, 3D recognition descriptors are studied and analyzed for application in these environments. Second, a visual recognition system built on a specific distributed client-server architecture is proposed for recognizing industrial objects that lack these appearance features. Our system has been implemented to overcome the recognition problems that arise when objects can only be recognized by their geometric shape and the simplicity of the shapes could create ambiguity. Finally, real tests are performed and illustrated to verify the satisfactory performance of the proposed system. The research leading to these results has received funding from the Spanish Government and European FEDER funds (DPI2012-32390) and the Valencia Regional Government (PROMETEO/2013/085).

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Robotic Pick-and-Place of Novel Objects in Clutter with Multi-Affordance Grasping and Cross-Domain Image Matching

Author: Alet Ferran
Bauza Maria
Dafle Nikhil Chavan
Donlon Elliott
Fazeli Nima
Funkhouser Thomas
Green Druck
Hogan Francois R.
Holladay Rachel
Liu Melody
Liu Weber
Ma Daolin
Morona Isabella
Nair Prem Qu
Rodriguez Alberto
Romo Eudald
Song Shuran
Taylor Ian
Taylor Orion
Yu Kuan-Ting
Zeng Andy
Publication date: 01/01/2018

This paper presents a robotic pick-and-place system that is capable of grasping and recognizing both known and novel objects in cluttered environments. The key new feature of the system is that it handles a wide range of object categories without needing any task-specific training data for novel objects. To achieve this, it first uses a category-agnostic affordance prediction algorithm to select and execute among four different grasping primitive behaviors. It then recognizes picked objects with a cross-domain image classification framework that matches observed images to product images. Since product images are readily available for a wide range of objects (e.g., from the web), the system works out-of-the-box for novel objects without requiring any additional training data. Exhaustive experimental results demonstrate that our multi-affordance grasping achieves high success rates for a wide variety of objects in clutter, and our recognition algorithm achieves high accuracy for both known and novel grasped objects. The approach was part of the MIT-Princeton Team system that took 1st place in the stowing task at the 2017 Amazon Robotics Challenge. All code, datasets, and pre-trained models are available online at http://arc.cs.princeton.edu. Comment: Project webpage: http://arc.cs.princeton.edu Summary video: https://youtu.be/6fG7zwGfIk
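The selection step described in this abstract can be illustrated with a small sketch. It assumes, as a simplification not stated in the abstract, that each grasping primitive produces a dense grid of affordance scores over the image and that the system executes the primitive and location with the highest score. The primitive labels and the function name are hypothetical.

```python
def select_grasp(affordance_maps):
    """Pick the (primitive, row, col, score) with the highest predicted affordance.

    affordance_maps: dict mapping a primitive name to a 2D list of scores.
    Returns None when no scores are supplied.
    """
    best = None
    for primitive, grid in affordance_maps.items():
        for r, row in enumerate(grid):
            for c, score in enumerate(row):
                if best is None or score > best[3]:
                    best = (primitive, r, c, score)
    return best

# Example with two hypothetical primitives on a 2x2 score grid.
maps = {
    "primitive_a": [[0.1, 0.2], [0.3, 0.1]],
    "primitive_b": [[0.05, 0.9], [0.2, 0.4]],
}
choice = select_grasp(maps)
```

Because the scores depend only on local geometry and appearance rather than on object identity, a scheme of this kind needs no per-object training, which matches the category-agnostic claim in the abstract.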

arXiv.org e-Print Archive

Princeton University Open Access Repository

DSpace@MIT

Crossref

Vision-based Manipulation In-the-Wild

Author: Chi Cheng
Publication date: 01/01/2024

Deploying robots in real-world environments involves immense engineering complexity, potentially surpassing the resources required for autonomous vehicles due to the increased dimensionality and task variety. To maximize the chances of successful real-world deployment, finding a simple solution that minimizes engineering complexity at every level, from hardware to algorithm to operations, is crucial. In this dissertation, we consider a vision-based manipulation system that can be deployed in-the-wild when trained to imitate sufficient quantity and diversity of human demonstration data on the desired task. At deployment time, the robot is driven by a single diffusion-based visuomotor policy, with raw RGB images as input and robot end-effector pose as output. Compared to existing policy representations, Diffusion Policy handles multimodal action distributions gracefully, being scalable to high-dimensional action spaces and exhibiting impressive training stability. These properties allow a single software system to be used for multiple tasks, with data collected by multiple demonstrators, deployed to multiple robot embodiments, and without significant hyper-parameter tuning. We developed a Universal Manipulation Interface (UMI), a portable, low-cost, and information-rich data collection system to enable direct manipulation skill learning from in-the-wild human demonstrations. UMI provides an intuitive interface for non-expert users by using hand-held grippers with mounted GoPro cameras. Compared to existing robotic data collection systems, UMI enables robotic data collection without needing a robot, drastically reducing the engineering and operational complexity. Trained with UMI data, the resulting diffusion policies can be deployed across multiple robot platforms in unseen environments for novel objects and to complete dynamic, bimanual, precise, and long-horizon tasks. 
The Diffusion Policy and UMI combination provides a simple full-stack solution to many manipulation problems. The turn-around time of building a single-task manipulation system (such as object tossing and cloth folding) can be reduced from a few months to a few days.

Columbia University Academic Commons

Registration of surfaces minimizing error propagation for a one-shot multi-slit hand-held scanner

Author: Batlle
Bergevin
Besl
Bottino
C. Matabosch
Chen
Curless
D. Fofi
E. Batlle
Fassi
Fitzgibbon
Forest
Forest
Gagnon
Gelfand
Huber
Huber
J. Salvi
Jarvis
Johnson
Krishnan
Lu
Masuda
Matabosch
Neugebauer
Nüchter
Park
Pollefeys
Pulli
Rusinkiewicz
Salvi
Salvi
Sharp
Shiu
Silva
Stamos
Triggs
Turk
Vanden Wyngaerd
Publication venue: 'Elsevier BV'

Crossref

Visual SLAM for Autonomous Navigation of MAVs

Author: Yang Shaowu
Publication venue: Universität Tübingen
Publication date: 01/08/2014

This thesis focuses on developing onboard visual simultaneous localization and mapping (SLAM) systems to enable autonomous navigation of micro aerial vehicles (MAVs), which is still a challenging topic considering the limited payload and computational capability that an MAV normally has. In MAV applications, the visual SLAM systems are required to be very efficient, especially when other visual tasks have to be done in parallel. Furthermore, robustness in pose tracking is highly desired in order to enable safe autonomous navigation of an MAV in three-dimensional (3D) space. These challenges motivate the work in this thesis in the following aspects. Firstly, the problem of visual pose estimation for MAVs using an artificial landmark is addressed. An artificial neural network (ANN) is used to robustly recognize this visual marker in cluttered environments. Then a computational projective-geometry method is implemented for relative pose computation based on the retrieved geometry information of the visual marker. The presented vision system can be used not only for pose control of MAVs, but also for providing accurate pose estimates to a monocular visual SLAM system serving as an automatic initialization module for both indoor and outdoor environments. Secondly, autonomous landing on an arbitrarily textured landing site during autonomous navigation of an MAV is achieved. By integrating an efficient local-feature-based object detection algorithm within a monocular visual SLAM system, the MAV is able to search for the landing site autonomously along a predefined path, and land on it once it has been found. Thus, the proposed monocular visual solution enables autonomous navigation of an MAV in parallel with landing site detection. This solution relaxes the assumption made in conventional vision-guided landing systems, which is that the landing site should be located inside the field of view (FOV) of the vision system before initiating the landing task. 
The third problem that is addressed in this thesis is multi-camera visual SLAM for robust pose tracking of MAVs. Due to the limited FOV of a single camera, pose tracking using monocular visual SLAM may easily fail when the MAV navigates in unknown environments. Previous work addresses this problem mainly by fusing information from other sensors, like an inertial measurement unit (IMU), to achieve robustness of the whole system, which does not improve the robustness of visual SLAM itself. This thesis investigates solutions for improving the pose tracking robustness of a visual SLAM system by utilizing multiple cameras. A mathematical analysis of how measurements from multiple cameras should be integrated in the optimization of visual SLAM is provided. The resulting theory allows those measurements to be used for both robust pose tracking and map updating of the visual SLAM system. Furthermore, such a multi-camera visual SLAM system is modified to be a robust constant-time visual odometry. By integrating this visual odometry with an efficient back-end which consists of loop-closure detection and pose-graph optimization processes, a near-constant time multi-camera visual SLAM system is achieved for autonomous navigation of MAVs in large-scale environments.

Publikationsserver der Universität Tübingen

Visual Odometry and Traversability Analysis for Wheeled Robots in Complex Environments

Author: Jordan Julian
Publication venue: Universität Tübingen
Publication date: 01/01/2021

The application of wheeled mobile robots (WMRs) is currently expanding from rather controlled industrial or domestic scenarios into more complex urban or outdoor environments, allowing a variety of new use cases. One of these new use cases is described in this thesis: an intelligent personal mobility assistant based on an electrical rollator. Such a system comes with several requirements: it must be safe, robust, lightweight, and inexpensive, and it should be able to navigate in real time in order to allow direct physical interaction with the user. As these properties are desirable for most WMRs, all methods proposed in this thesis can also be used with other WMR platforms.
First, a visual odometry method is presented, which is tailored to work with a downward-facing RGB-D camera. It projects the environment onto a ground plane image and uses an efficient image alignment method to estimate the vehicle motion from consecutive images. As the method is designed for use on a WMR, further constraints can be employed to improve the accuracy of the visual odometry. For a non-holonomic WMR with a known vehicle model (differential drive, skid steering, or Ackermann), the motion parameters of the corresponding kinematic model, instead of the generic motion parameters, can be estimated directly from the image data. This significantly improves the accuracy and robustness of the method. Additionally, an outlier rejection scheme is presented that operates in model space, i.e. on the motion parameters of the kinematic model, instead of in data space, i.e. on image pixels.
Furthermore, the projection of the environment onto the ground plane can also be used to create an elevation map of the environment. The thesis investigates whether this map, in conjunction with a detailed vehicle model, can be used to estimate future vehicle poses. By using a common image-based representation of the environment and the vehicle, a very efficient and still highly accurate pose estimation method is proposed. Since the traversability of an area can be determined by the vehicle poses and potential collisions, the pose estimation method is employed to create a novel real-time path planning method. The detailed vehicle model is extended to also represent the vehicle's chassis for collision detection. Several safety criteria are derived from the vehicle pose and used as the heuristic for an A*-like planner. Guided by this planner, a search graph is constructed by propagating the vehicle using its kinematic model to possible future poses and calculating a traversability score for each of these poses. The final system performs safe and robust real-time navigation even in challenging indoor and outdoor environments.
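To make the kinematic-model constraint concrete, the sketch below propagates a planar pose with a standard differential-drive model. It is an illustration under stated assumptions rather than code from the thesis: the function name is hypothetical, and the velocities are assumed constant over the time step. A generic planar motion has three free parameters (two translations and a rotation), whereas this model has only two (forward speed `v` and turn rate `omega`). That reduction is the kind of constraint the abstract says improves accuracy and robustness.

```python
import math

def propagate_diff_drive(x, y, theta, v, omega, dt):
    """Advance a planar pose (x, y, theta) under constant v and omega for dt seconds."""
    if abs(omega) < 1e-9:
        # Straight-line motion: heading is unchanged.
        return (x + v * dt * math.cos(theta),
                y + v * dt * math.sin(theta),
                theta)
    radius = v / omega                  # signed turning radius of the arc
    new_theta = theta + omega * dt
    return (x + radius * (math.sin(new_theta) - math.sin(theta)),
            y - radius * (math.cos(new_theta) - math.cos(theta)),
            new_theta)
```

The same function can serve the planning step the abstract describes: applying it repeatedly with different candidate `(v, omega)` pairs yields the possible future poses that a planner would then score for traversability.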

Publikationsserver der Universität Tübingen