Search CORE

2,058 research outputs found

Fine-To-Coarse Global Registration of RGB-D Scans

Author: Funkhouser Thomas
Halber Maciej
Publication venue
Publication date: 22/11/2016
Field of study

RGB-D scanning of indoor environments is important for many applications, including real estate, interior design, and virtual reality. However, it is still challenging to register RGB-D images from a hand-held camera over a long video sequence into a globally consistent 3D model. Current methods often can lose tracking or drift and thus fail to reconstruct salient structures in large environments (e.g., parallel walls in different rooms). To address this problem, we propose a "fine-to-coarse" global registration algorithm that leverages robust registrations at finer scales to seed detection and enforcement of new correspondence and structural constraints at coarser scales. To test global registration algorithms, we provide a benchmark with 10,401 manually-clicked point correspondences in 25 scenes from the SUN3D dataset. During experiments with this benchmark, we find that our fine-to-coarse algorithm registers long RGB-D sequences better than previous methods

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

Exploring Convolutional Networks for End-to-End Visual Servoing

Author: Gaud Ayush
Krishna K. Madhava
Kumar Gourav
Pandya Harit
Saxena Aseem
Publication venue
Publication date: 10/06/2017
Field of study

Present image based visual servoing approaches rely on extracting hand crafted visual features from an image. Choosing the right set of features is important as it directly affects the performance of any approach. Motivated by recent breakthroughs in performance of data driven methods on recognition and localization tasks, we aim to learn visual feature representations suitable for servoing tasks in unstructured and unknown environments. In this paper, we present an end-to-end learning based approach for visual servoing in diverse scenes where the knowledge of camera parameters and scene geometry is not available a priori. This is achieved by training a convolutional neural network over color images with synchronised camera poses. Through experiments performed in simulation and on a quadrotor, we demonstrate the efficacy and robustness of our approach for a wide range of camera poses in both indoor as well as outdoor environments.Comment: IEEE ICRA 201

arXiv.org e-Print Archive

Crossref

RGBDTAM: A Cost-Effective and Accurate RGB-D Tracking and Mapping System

Author: Civera Javier
Concha Alejo
Publication venue
Publication date: 09/08/2017
Field of study

Simultaneous Localization and Mapping using RGB-D cameras has been a fertile research topic in the latest decade, due to the suitability of such sensors for indoor robotics. In this paper we propose a direct RGB-D SLAM algorithm with state-of-the-art accuracy and robustness at a los cost. Our experiments in the RGB-D TUM dataset [34] effectively show a better accuracy and robustness in CPU real time than direct RGB-D SLAM systems that make use of the GPU. The key ingredients of our approach are mainly two. Firstly, the combination of a semi-dense photometric and dense geometric error for the pose tracking (see Figure 1), which we demonstrate to be the most accurate alternative. And secondly, a model of the multi-view constraints and their errors in the mapping and tracking threads, which adds extra information over other approaches. We release the open-source implementation of our approach 1 . The reader is referred to a video with our results 2 for a more illustrative visualization of its performance

arXiv.org e-Print Archive

Crossref

Dynamic Body VSLAM with Semantic Constraints

Author: Chari Visesh
Krishna K. Madhava
Reddy N. Dinesh
Singhal Prateek
Publication venue
Publication date: 27/04/2015
Field of study

Image based reconstruction of urban environments is a challenging problem that deals with optimization of large number of variables, and has several sources of errors like the presence of dynamic objects. Since most large scale approaches make the assumption of observing static scenes, dynamic objects are relegated to the noise modeling section of such systems. This is an approach of convenience since the RANSAC based framework used to compute most multiview geometric quantities for static scenes naturally confine dynamic objects to the class of outlier measurements. However, reconstructing dynamic objects along with the static environment helps us get a complete picture of an urban environment. Such understanding can then be used for important robotic tasks like path planning for autonomous navigation, obstacle tracking and avoidance, and other areas. In this paper, we propose a system for robust SLAM that works in both static and dynamic environments. To overcome the challenge of dynamic objects in the scene, we propose a new model to incorporate semantic constraints into the reconstruction algorithm. While some of these constraints are based on multi-layered dense CRFs trained over appearance as well as motion cues, other proposed constraints can be expressed as additional terms in the bundle adjustment optimization process that does iterative refinement of 3D structure and camera / object motion trajectories. We show results on the challenging KITTI urban dataset for accuracy of motion segmentation and reconstruction of the trajectory and shape of moving objects relative to ground truth. We are able to show average relative error reduction by a significant amount for moving object trajectory reconstruction relative to state-of-the-art methods like VISO 2, as well as standard bundle adjustment algorithms

arXiv.org e-Print Archive

Crossref

Indoor Mapping and Reconstruction with Mobile Augmented Reality Sensor Systems

Author: Hübner Patrick
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 23/01/2022
Field of study

Augmented Reality (AR) ermöglicht es, virtuelle, dreidimensionale Inhalte direkt innerhalb der realen Umgebung darzustellen. Anstatt jedoch beliebige virtuelle Objekte an einem willkürlichen Ort anzuzeigen, kann AR Technologie auch genutzt werden, um Geodaten in situ an jenem Ort darzustellen, auf den sich die Daten beziehen. Damit eröffnet AR die Möglichkeit, die reale Welt durch virtuelle, ortbezogene Informationen anzureichern. Im Rahmen der vorliegenen Arbeit wird diese Spielart von AR als "Fused Reality" definiert und eingehend diskutiert. Der praktische Mehrwert, den dieses Konzept der Fused Reality bietet, lässt sich gut am Beispiel seiner Anwendung im Zusammenhang mit digitalen Gebäudemodellen demonstrieren, wo sich gebäudespezifische Informationen - beispielsweise der Verlauf von Leitungen und Kabeln innerhalb der Wände - lagegerecht am realen Objekt darstellen lassen. Um das skizzierte Konzept einer Indoor Fused Reality Anwendung realisieren zu können, müssen einige grundlegende Bedingungen erfüllt sein. So kann ein bestimmtes Gebäude nur dann mit ortsbezogenen Informationen augmentiert werden, wenn von diesem Gebäude ein digitales Modell verfügbar ist. Zwar werden größere Bauprojekt heutzutage oft unter Zuhilfename von Building Information Modelling (BIM) geplant und durchgeführt, sodass ein digitales Modell direkt zusammen mit dem realen Gebäude ensteht, jedoch sind im Falle älterer Bestandsgebäude digitale Modelle meist nicht verfügbar. Ein digitales Modell eines bestehenden Gebäudes manuell zu erstellen, ist zwar möglich, jedoch mit großem Aufwand verbunden. Ist ein passendes Gebäudemodell vorhanden, muss ein AR Gerät außerdem in der Lage sein, die eigene Position und Orientierung im Gebäude relativ zu diesem Modell bestimmen zu können, um Augmentierungen lagegerecht anzeigen zu können. Im Rahmen dieser Arbeit werden diverse Aspekte der angesprochenen Problematik untersucht und diskutiert. Dabei werden zunächst verschiedene Möglichkeiten diskutiert, Indoor-Gebäudegeometrie mittels Sensorsystemen zu erfassen. Anschließend wird eine Untersuchung präsentiert, inwiefern moderne AR Geräte, die in der Regel ebenfalls über eine Vielzahl an Sensoren verfügen, ebenfalls geeignet sind, als Indoor-Mapping-Systeme eingesetzt zu werden. Die resultierenden Indoor Mapping Datensätze können daraufhin genutzt werden, um automatisiert Gebäudemodelle zu rekonstruieren. Zu diesem Zweck wird ein automatisiertes, voxel-basiertes Indoor-Rekonstruktionsverfahren vorgestellt. Dieses wird außerdem auf der Grundlage vierer zu diesem Zweck erfasster Datensätze mit zugehörigen Referenzdaten quantitativ evaluiert. Desweiteren werden verschiedene Möglichkeiten diskutiert, mobile AR Geräte innerhalb eines Gebäudes und des zugehörigen Gebäudemodells zu lokalisieren. In diesem Kontext wird außerdem auch die Evaluierung einer Marker-basierten Indoor-Lokalisierungsmethode präsentiert. Abschließend wird zudem ein neuer Ansatz, Indoor-Mapping Datensätze an den Achsen des Koordinatensystems auszurichten, vorgestellt

KITopen