122 research outputs found
Towards Live 3D Reconstruction from Wearable Video: An Evaluation of V-SLAM, NeRF, and Videogrammetry Techniques
Mixed reality (MR) is a key technology which promises to change the future of
warfare. An MR hybrid of physical outdoor environments and virtual military
training will enable engagements with long distance enemies, both real and
simulated. To enable this technology, a large-scale 3D model of a physical
environment must be maintained based on live sensor observations. 3D
reconstruction algorithms should utilize the low cost and pervasiveness of
video camera sensors, from both overhead and soldier-level perspectives.
Mapping speed and 3D quality can be balanced to enable live MR training in
dynamic environments. Given these requirements, we survey several 3D
reconstruction algorithms for large-scale mapping for military applications
given only live video. We measure 3D reconstruction performance from common
structure from motion, visual-SLAM, and photogrammetry techniques. This
includes the open source algorithms COLMAP, ORB-SLAM3, and NeRF using
Instant-NGP. We utilize the autonomous driving academic benchmark KITTI, which
includes both dashboard camera video and lidar produced 3D ground truth. With
the KITTI data, our primary contribution is a quantitative evaluation of 3D
reconstruction computational speed when considering live video.
Comment: Accepted to the 2022 Interservice/Industry Training, Simulation, and Education Conference (I/ITSEC), 13 pages
Real-Time Multi-Fisheye Camera Self-Localization and Egomotion Estimation in Complex Indoor Environments
In this work, a real-time-capable multi-fisheye camera self-localization and egomotion estimation framework is developed. The thesis covers all aspects, ranging from omnidirectional camera calibration to the development of a complete multi-fisheye camera SLAM system based on a generic multi-camera bundle adjustment method.
MonoSLAM: Real-time single camera SLAM
Published version
Towards Fast and Automatic Map Initialization for Monocular SLAM Systems
Simultaneous localization and mapping (SLAM) is a widely adopted approach for estimating the pose of a sensor with six degrees of freedom. SLAM works by using sensor measurements to initialize and build a virtual map of the environment, while simultaneously matching succeeding sensor measurements to entries in the map to perform robust pose estimation of the sensor on each measurement cycle. Markerless, single-camera systems that utilize SLAM usually initialize the map by applying one of a few structure-from-motion approaches to two frames taken by the system at different points in time. However, knowing when the feature matches between two frames will yield enough disparity, parallax, and/or structure for a good initialization to take place remains an open problem. To make this determination, we train a number of logistic regression models on summarized correspondence data for 927 stereo image pairs. Our results show that these models classify with significantly higher precision than the current state-of-the-art approach while remaining computationally inexpensive.
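The two-frame initialization test described above could be sketched as a logistic classifier over summary statistics of the feature matches. Everything here (the choice of statistics, the weights, the threshold behavior) is an illustrative assumption, not the paper's trained model:

```python
import math

def summarize(matches):
    # matches: list of (u1, v1, u2, v2) pixel correspondences between two frames.
    # Hypothetical summary statistics: median match displacement and its spread.
    disps = sorted(math.hypot(u2 - u1, v2 - v1) for u1, v1, u2, v2 in matches)
    median = disps[len(disps) // 2]
    spread = disps[-1] - disps[0]
    return median, spread

def init_probability(matches, w=(0.15, 0.05), b=-3.0):
    # Logistic regression score sigmoid(w . x + b); the weights are
    # placeholder values, not coefficients fitted on the 927 stereo pairs.
    median, spread = summarize(matches)
    z = w[0] * median + w[1] * spread + b
    return 1.0 / (1.0 + math.exp(-z))

# A wide-baseline pair (large disparity) should score higher than two
# near-identical frames, which give too little parallax to initialize from.
wide = [(10, 10, 60, 40), (100, 50, 160, 90), (30, 80, 90, 120)]
narrow = [(10, 10, 11, 10), (100, 50, 101, 51), (30, 80, 30, 81)]
assert init_probability(wide) > init_probability(narrow)
```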
Monocular visual odometry for agricultural robots using fisheye-lens cameras
One of the main challenges in robotics is to develop accurate localization methods with acceptable runtime performance. One of the most common approaches is to use a Global Navigation Satellite System, such as GPS, to localize robots. However, satellite signals are not available at all times in some kinds of environments. The purpose of this dissertation is to develop a localization system for a ground robot. This robot is part of a project called RoMoVi and is intended to perform tasks such as crop monitoring and harvesting in steep-slope vineyards. These vineyards are located in the Douro region, which is characterized by high hills, so the RoMoVi setting is not favorable to GPS-based localization systems. The main goal of this work is therefore to create a reliable localization system based on vision techniques and low-cost sensors, using Visual Odometry. Visual Odometry is analogous to wheel odometry but has the advantage of not suffering from wheel slip, which is common in these environments due to the harsh terrain. Here, motion is tracked by incrementally computing the homogeneous transformation between camera frames. However, this approach also presents some open issues: most state-of-the-art methods, especially those based on a monocular camera, do not estimate motion well under pure rotation, and in some of them the estimate even degenerates in these situations. Recovering the scale of the motion is also a difficult task that is widely investigated in this field. This work aims to solve these issues by using fisheye-lens cameras to obtain a wide field of view.
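The incremental motion tracking described above, chaining homogeneous transformations between camera frames, can be sketched in 2-D for brevity. A real visual odometry pipeline would estimate each per-frame transform from image correspondences; here the steps are simply given:

```python
import math

def se2(dx, dy, dtheta):
    # Homogeneous 2-D rigid transform: rotation by dtheta, translation (dx, dy).
    c, s = math.cos(dtheta), math.sin(dtheta)
    return [[c, -s, dx], [s, c, dy], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# The global pose is the running product of per-frame transforms.
pose = se2(0, 0, 0)  # start at the origin
# Drive forward 1 m, turn 90 degrees in place, drive forward 1 m.
for step in [se2(1, 0, 0), se2(0, 0, math.pi / 2), se2(1, 0, 0)]:
    pose = matmul(pose, step)

x, y = pose[0][2], pose[1][2]  # final position: (1, 1)
```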
Simultaneous Localization and Mapping Technologies
The SLAM (Simultaneous Localization And Mapping) problem consists of mapping an unknown environment by means of a device moving within it, while simultaneously localizing that device.
This thesis analyzes the SLAM problem and the characteristics that distinguish it from the mapping and localization problems treated separately.
It then analyzes the main algorithms employed today for its solution, namely extended Kalman filters and particle filters.
The various existing implementation technologies are then examined, including SONAR, LASER, vision, and RADAR systems; at the state of the art, the latter employ millimeter-wave (mmW) and ultra-wideband (UWB) signals, but also well-established radio technologies, among them Wi-Fi.
Finally, vision-based and LASER-based technologies are simulated with the help of two open-source MATLAB packages. The package designed for LASER systems is then modified in order to simulate a SLAM technology based on Wi-Fi signals.
The use of low-cost, widely deployed technologies such as Wi-Fi opens the possibility, in the near future, of performing low-cost indoor localization with a simple smartphone by exploiting existing infrastructure. Looking further ahead, the advent of millimeter-wave (5G) technology will allow higher performance.
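The extended Kalman filter mentioned above generalizes a basic predict/update cycle to a joint robot-plus-landmark state vector. A minimal scalar sketch of that cycle, with purely illustrative noise values rather than a full EKF-SLAM formulation:

```python
def predict(x, P, u, Q):
    # Motion step: apply control u; process noise Q inflates the variance P.
    return x + u, P + Q

def update(x, P, z, R):
    # Measurement step: fuse observation z with measurement noise R.
    K = P / (P + R)                    # Kalman gain
    return x + K * (z - x), (1 - K) * P

x, P = 0.0, 1.0                        # initial estimate and variance
x, P = predict(x, P, u=1.0, Q=0.1)     # commanded motion of one unit
x, P = update(x, P, z=1.2, R=0.5)      # measurement pulls the estimate
```

Prediction grows the uncertainty, and each measurement shrinks it; EKF-SLAM applies the same two steps with matrix-valued state, covariance, and linearized motion/observation models.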
Cross-View Visual Geo-Localization for Outdoor Augmented Reality
Precise estimation of global orientation and location is critical to ensure a
compelling outdoor Augmented Reality (AR) experience. We address the problem of
geo-pose estimation by cross-view matching of query ground images to a
geo-referenced aerial satellite image database. Recently, neural network-based
methods have shown state-of-the-art performance in cross-view matching.
However, most of the prior works focus only on location estimation, ignoring
orientation, which cannot meet the requirements in outdoor AR applications. We
propose a new transformer neural network-based model and a modified triplet
ranking loss for joint location and orientation estimation. Experiments on
several benchmark cross-view geo-localization datasets show that our model
achieves state-of-the-art performance. Furthermore, we present an approach to
extend the single image query-based geo-localization approach by utilizing
temporal information from a navigation pipeline for robust continuous
geo-localization. Experimentation on several large-scale real-world video
sequences demonstrates that our approach enables high-precision and stable AR
insertion.Comment: IEEE VR 202
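A margin-based triplet ranking loss of the kind described above can be sketched with toy embeddings. The vectors and margin here are illustrative, not the paper's modified loss or trained model:

```python
def dist2(a, b):
    # Squared Euclidean distance between two embedding vectors.
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=0.5):
    # The anchor (query ground image) should be closer to the matching
    # aerial embedding than to a non-matching one, by at least the margin.
    return max(0.0, dist2(anchor, positive) - dist2(anchor, negative) + margin)

query = [1.0, 0.0]
aerial_match = [0.9, 0.1]   # embedding of the correct location/orientation
aerial_other = [0.0, 1.0]   # embedding of a wrong candidate
loss = triplet_loss(query, aerial_match, aerial_other)  # satisfied: loss is 0
```

Swapping the positive and negative yields a positive loss, which is the gradient signal that pushes the wrong candidate away during training.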
Automated Map Reading: Image Based Localisation in 2-D Maps Using Binary Semantic Descriptors
We describe a novel approach to image based localisation in urban
environments using semantic matching between images and a 2-D map. It contrasts
with the vast majority of existing approaches which use image to image database
matching. We use highly compact binary descriptors to represent semantic
features at locations, significantly increasing scalability compared with
existing methods and having the potential for greater invariance to variable
imaging conditions. The approach is also more akin to human map reading, making
it more suited to human-system interaction. The binary descriptors indicate the
presence or not of semantic features relating to buildings and road junctions
in discrete viewing directions. We use CNN classifiers to detect the features
in images and match descriptor estimates with a database of location tagged
descriptors derived from the 2-D map. In isolation, the descriptors are not
sufficiently discriminative, but when concatenated sequentially along a route,
their combination becomes highly distinctive and allows localisation even when
using non-perfect classifiers. Performance is further improved by taking into
account left or right turns over a route. Experimental results obtained using
Google StreetView and OpenStreetMap data show that the approach has
considerable potential, achieving localisation accuracy of around 85% using routes corresponding to approximately 200 meters.
Comment: 8 pages, submitted to IEEE/RSJ International Conference on Intelligent Robots and Systems 201
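The route-matching scheme described above can be sketched with toy data: hypothetical 8-bit binary descriptors per location (one bit per semantic feature and viewing direction, layout assumed here), concatenated along a route and scored against the map by Hamming distance:

```python
def hamming(a, b):
    # Number of differing bits between two binary descriptors.
    return bin(a ^ b).count("1")

# Hypothetical map: each route is a sequence of per-location 8-bit descriptors
# (e.g. 4 viewing directions x {building, road-junction} presence bits).
map_routes = {
    "route_a": [0b10110001, 0b00011100, 0b11100010],
    "route_b": [0b01000011, 0b10101010, 0b00110110],
}

def match_route(observed):
    # Score each map route by total Hamming distance over the whole sequence;
    # single descriptors are ambiguous, but the concatenation is distinctive.
    def score(route):
        return sum(hamming(o, m) for o, m in zip(observed, route))
    return min(map_routes, key=lambda name: score(map_routes[name]))

# Observations from an imperfect classifier: route_a with one bit flipped.
observed = [0b10110001, 0b00011101, 0b11100010]
best = match_route(observed)  # still resolves to "route_a"
```

This illustrates why non-perfect classifiers are tolerable: a flipped bit costs little against the correct route but the wrong route stays far away in aggregate Hamming distance.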