Active Image-based Modeling with a Toy Drone
Image-based modeling techniques can now generate photo-realistic 3D models
from images. But it is up to users to provide high quality images with good
coverage and view overlap, which makes the data capturing process tedious and
time consuming. We seek to automate data capturing for image-based modeling.
The core of our system is an iterative linear method to solve the multi-view
stereo (MVS) problem quickly and plan the Next-Best-View (NBV) effectively. Our
fast MVS algorithm enables online model reconstruction and quality assessment
to determine the NBVs on the fly. We test our system with a toy unmanned aerial
vehicle (UAV) in simulated, indoor and outdoor experiments. Results show that
our system improves the efficiency of data acquisition and ensures the
completeness of the final model.
Comment: To be published at the International Conference on Robotics and
Automation 2018, Brisbane, Australia. Project page:
https://huangrui815.github.io/active-image-based-modeling/ Author's
personal page: http://www.sfu.ca/~rha55
Development of a probabilistic perception system for camera-lidar sensor fusion
Multi-modal depth estimation is one of the key challenges for endowing autonomous
machines with robust robotic perception capabilities. There has been an outstanding
advance in the development of uni-modal depth estimation techniques based
on either monocular cameras, because of their high resolution, or LiDAR sensors, due
to the precise geometric data they provide. However, each of them suffers from some
inherent drawbacks like high sensitivity to changes in illumination conditions in
the case of cameras and limited resolution for the LiDARs. Sensor fusion can be
used to combine the merits and compensate for the downsides of these two kinds of
sensors. Nevertheless, current fusion methods work at a high level. They process
sensor data streams independently and combine the high-level estimates obtained
for each sensor. In this thesis, I tackle the problem at a low level, fusing the raw
sensor streams, thus obtaining depth estimates which are both dense and precise,
and can be used as a unified multi-modal data source for higher level estimation
problems.
This work proposes a Conditional Random Field (CRF) model with multiple geometry
and appearance potentials that seamlessly represents the problem of estimating
dense depth maps from camera and LiDAR data. The model can be optimized
efficiently using the Conjugate Gradient Squared (CGS) algorithm. The proposed
method was evaluated and compared with the state of the art using the commonly
used KITTI benchmark dataset. In addition, the model was qualitatively evaluated using
data acquired by the author of this work.
Degree: Master's (Maestría), Magíster en Ingeniería de Desarrollo de Producto
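The abstract above does not spell out the CRF potentials, so the following is only a minimal sketch under stated assumptions: a Gaussian (quadratic) CRF on a single image scanline, with a data potential that ties pixels to sparse LiDAR depths and an appearance potential that weakens smoothing across intensity edges. The function names, the weights `lam` and `sigma`, and the toy data are hypothetical. The resulting linear system is solved with a small hand-written Conjugate Gradient Squared loop in NumPy, not the thesis's implementation.

```python
import numpy as np


def build_system(lidar_depth, lidar_mask, intensity, lam=1.0, sigma=0.1):
    """Assemble A d = b for the quadratic energy
    E(d) = sum_i m_i (d_i - z_i)^2 + lam * sum_i w_i (d_i - d_{i+1})^2,
    whose minimiser satisfies (M + lam * L) d = M z."""
    # Assumed appearance potential: weak smoothing across strong intensity edges.
    w = np.exp(-np.diff(intensity) ** 2 / sigma ** 2)
    A = np.diag(lidar_mask.astype(float))
    for i, wi in enumerate(w):
        A[i, i] += lam * wi
        A[i + 1, i + 1] += lam * wi
        A[i, i + 1] -= lam * wi
        A[i + 1, i] -= lam * wi
    b = lidar_mask * lidar_depth
    return A, b


def cgs(A, b, tol=1e-10, max_iter=200):
    """Unpreconditioned Conjugate Gradient Squared iteration for A x = b."""
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x
    r_tilde = r.copy()  # fixed shadow residual
    p = np.zeros_like(x)
    q = np.zeros_like(x)
    rho_prev = 1.0
    b_norm = np.linalg.norm(b)
    for it in range(max_iter):
        rho = r_tilde @ r
        if it == 0:
            u = r.copy()
            p = u.copy()
        else:
            beta = rho / rho_prev
            u = r + beta * q
            p = u + beta * (q + beta * p)
        v = A @ p
        alpha = rho / (r_tilde @ v)
        q = u - alpha * v
        x = x + alpha * (u + q)
        r = r - alpha * (A @ (u + q))
        if np.linalg.norm(r) <= tol * b_norm:
            break
        rho_prev = rho
    return x


# Toy scanline: an intensity edge between pixels 3 and 4, two LiDAR returns.
intensity = np.array([0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9])
lidar_mask = np.array([0, 1, 0, 0, 0, 0, 1, 0], dtype=float)
lidar_depth = np.array([0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 10.0, 0.0])

A, b = build_system(lidar_depth, lidar_mask, intensity)
dense_depth = cgs(A, b)
```

In this toy case the two LiDAR returns are propagated to their neighbours, and the depth discontinuity lands on the intensity edge because the smoothness weight there is close to zero.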
Automatic Dense 3D Scene Mapping from Non-overlapping Passive Visual Sensors for Future Autonomous Systems
The ever increasing demand for higher levels of autonomy for robots and vehicles means there is an ever greater need for such systems to be aware of their surroundings. Whilst solutions already exist for creating 3D scene maps, many are based on active scanning devices such as laser scanners and depth cameras that are either expensive, unwieldy, or do not function well under certain environmental conditions. As a result, passive cameras are a favoured sensor due to their low cost, small size, and ability to work in a range of lighting conditions.
In this work we address some of the remaining research challenges within the problem of 3D mapping around a moving platform. We utilise prior work in dense stereo imaging and Stereo Visual Odometry (SVO), and extend Structure from Motion (SfM), to create a pipeline optimised for on-vehicle sensing.
Using forward-facing stereo cameras, we use state-of-the-art SVO and dense stereo techniques to map the scene in front of the vehicle. Given the significant amount of prior research in dense stereo, we addressed the issue of selecting an appropriate method by creating a novel evaluation technique. Visual 3D mapping of dynamic scenes from a moving platform results in duplicated scene objects. We extend the prior work on mapping by introducing a generalized dynamic object removal process. Unlike other approaches that rely on computationally expensive segmentation or detection, our method utilises existing data from the mapping stage and the findings from our dense stereo evaluation. We introduce a new SfM approach that exploits our platform motion to create a novel dense mapping process that exceeds the 3D data generation rate of state-of-the-art alternatives. Finally, we combine dense stereo, SVO, and our SfM approach to automatically align point clouds from non-overlapping views to create a rotationally consistent and scale-consistent global 3D model.
A multisensor SLAM for dense maps of large scale environments under poor lighting conditions
This thesis describes the development and implementation of a multisensor large scale autonomous mapping system for surveying tasks in underground mines. The hazardous nature of the underground mining industry has resulted in a push towards autonomous solutions to the most dangerous operations, including surveying tasks. Many existing autonomous mapping techniques rely on approaches to the Simultaneous Localization and Mapping (SLAM) problem which are not suited to the extreme characteristics of active underground mining environments. Our proposed multisensor system has been designed from the outset to address the unique challenges associated with underground SLAM. The robustness, self-containment and portability of the system maximize the potential applications.
The multisensor mapping solution proposed as a result of this work is based on a fusion of omnidirectional bearing-only vision-based localization and 3D laser point cloud registration. By combining these two SLAM techniques it is possible to achieve some of the advantages of both approaches: the real-time attributes of vision-based SLAM and the dense, high precision maps obtained through 3D lasers. The result is a viable autonomous mapping solution suitable for application in challenging underground mining environments.
A further improvement to the robustness of the proposed multisensor SLAM system is a consequence of incorporating colour information into vision-based localization. Underground mining environments are often dominated by dynamic sources of illumination which can cause inconsistent feature motion during localization. Colour information is utilized to identify and remove features resulting from illumination artefacts and to improve the monochrome based feature matching between frames.
Finally, the proposed multisensor mapping system is implemented and evaluated in both above ground and underground scenarios. The resulting large scale maps contained a maximum offset error of ±30 mm for mapping tasks with lengths over 100 m.
Building with Drones: Accurate 3D Facade Reconstruction using MAVs
Automatic reconstruction of 3D models from images using multi-view
Structure-from-Motion methods has been one of the most fruitful outcomes of
computer vision. These advances, combined with the growing popularity of Micro
Aerial Vehicles (MAVs) as an autonomous imaging platform, have made 3D vision tools
ubiquitous for a large number of Architecture, Engineering and Construction
applications among audiences mostly unskilled in computer vision. However, to
obtain high-resolution and accurate reconstructions from a large-scale object
using SfM, there are many critical constraints on the quality of image data,
which often become sources of inaccuracy because current 3D reconstruction
pipelines do not help users determine the fidelity of input data
during image acquisition. In this paper, we present and advocate a
closed-loop interactive approach that performs incremental reconstruction in
real time and gives users online feedback about quality parameters such as
Ground Sampling Distance (GSD) and image redundancy on a surface mesh. We
also propose a novel multi-scale camera network design to prevent scene drift
caused by incremental map building, and release the first multi-scale image
sequence dataset as a benchmark. Further, we evaluate our system on real
outdoor scenes, and show that our interactive pipeline combined with a
multi-scale camera network approach provides compelling accuracy in multi-view
reconstruction tasks when compared against the state-of-the-art methods.
Comment: 8 pages, 2015 IEEE International Conference on Robotics and
Automation (ICRA '15), Seattle, WA, US
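As a rough illustration of the Ground Sampling Distance feedback mentioned in the abstract above, the standard pinhole relation for a fronto-parallel surface is GSD = distance × pixel pitch / focal length. The sketch below assumes that simple model; the function names and the example camera parameters are hypothetical and are not taken from the paper.

```python
def ground_sampling_distance(distance_m, focal_length_mm, pixel_pitch_um):
    """Metres of surface covered by one pixel, for a fronto-parallel view."""
    focal_length_m = focal_length_mm * 1e-3
    pixel_pitch_m = pixel_pitch_um * 1e-6
    return distance_m * pixel_pitch_m / focal_length_m


def max_distance_for_gsd(target_gsd_m, focal_length_mm, pixel_pitch_um):
    """Largest camera-to-surface distance that still meets a target GSD."""
    focal_length_m = focal_length_mm * 1e-3
    pixel_pitch_m = pixel_pitch_um * 1e-6
    return target_gsd_m * focal_length_m / pixel_pitch_m


# Hypothetical camera: 4 mm lens, 2 micrometre pixels, facade 10 m away.
gsd = ground_sampling_distance(10.0, 4.0, 2.0)
limit = max_distance_for_gsd(0.005, 4.0, 2.0)
```

A planner could compare `gsd` against a user-defined threshold per mesh face to flag regions that need closer views.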
DESIGNING AND EVALUATING A PORTABLE LIDAR-BASED SLAM SYSTEM
Mobile Mapping Technology (MMT) has evolved rapidly over the past few decades, especially in the use of low-cost sensors. This progress is primarily attributed to the appearance of innovative simultaneous localization and mapping (SLAM) algorithms. This article focuses on evaluating the efficiency of a new LiDAR-based portable SLAM system for mapping in dynamic real-world environments. The work proposes a technical solution based on a Livox Avia LiDAR sensor enhanced by gimbal stabilization. The system, named Portable Livox-based Mapping system (PoLiMap), is compared to other similar solutions by acquiring data from various environments, including urban scenery, underground tunnels and forested areas, and processing them using a modified FAST-LIO-SLAM algorithm. The research presented in the article contributes to the understanding of the capabilities of the PoLiMap system under various conditions and offers significant insight into its potential applications. Accuracy evaluation results show that the proposed MMT system can successfully tackle various demanding environments and challenge the results of other, more costly state-of-the-art portable mobile laser scanning methods.
Building a dense surface map incrementally from semi-dense point cloud and RGB images
© 2015, Journal of Zhejiang University Science Editorial Office and Springer-Verlag Berlin Heidelberg. Building and using maps is a fundamental issue for bionic robots in field applications. A dense surface map, which offers rich visual and geometric information, is an ideal representation of the environment for indoor/outdoor localization, navigation, and recognition tasks of these robots. Since most bionic robots can use only small, light-weight laser scanners and cameras to acquire semi-dense point clouds and RGB images, we propose a method to generate a consistent and dense surface map from this kind of data. The method contains two main steps: (1) generate a dense surface for every single scan of the point cloud and its corresponding image(s) and (2) incrementally fuse the dense surface of a new scan into the whole map. In step (1), edge-aware resampling is realized by segmenting the scan of a point cloud in advance and resampling each sub-cloud separately. Noise within the scan is reduced and a dense surface is generated. In step (2), the average surface is estimated probabilistically and the non-coincidence of different scans is eliminated. Experiments demonstrate that our method works well in both indoor and outdoor semi-structured environments where there are regularly shaped objects.
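Step (2) above, probabilistic estimation of an average surface while fusing each new scan, can be illustrated with a very simple stand-in: per-cell inverse-variance weighted averaging of Gaussian height estimates. This is an assumption made for illustration only; the abstract does not state the paper's actual estimator, and the names `SurfaceEstimate`, `fuse` and `integrate_scan` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SurfaceEstimate:
    height: float    # estimated surface height in a map cell (metres)
    variance: float  # uncertainty of that estimate (metres squared)


def fuse(current, new):
    """Inverse-variance weighted average of two Gaussian estimates."""
    w_cur = 1.0 / current.variance
    w_new = 1.0 / new.variance
    variance = 1.0 / (w_cur + w_new)
    height = variance * (w_cur * current.height + w_new * new.height)
    return SurfaceEstimate(height, variance)


def integrate_scan(surface_map, scan):
    """Fuse one scan (cell -> estimate) into the running map in place."""
    for cell, estimate in scan.items():
        if cell in surface_map:
            surface_map[cell] = fuse(surface_map[cell], estimate)
        else:
            surface_map[cell] = estimate
    return surface_map


# Two overlapping scans that disagree on cell (0, 0).
surface_map = {}
integrate_scan(surface_map, {(0, 0): SurfaceEstimate(1.0, 1.0)})
integrate_scan(surface_map, {(0, 0): SurfaceEstimate(3.0, 1.0),
                             (0, 1): SurfaceEstimate(2.5, 0.5)})
```

With equal variances the fused height is the plain mean of the two scans, and the variance halves, which is how repeated observations reduce the non-coincidence between scans in this simplified model.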
Multi-environment Georeferencing of RGB-D Panoramic Images from Portable Mobile Mapping – a Perspective for Infrastructure Management
High-resolution, accurately georeferenced RGB-D images are the basis for 3D image spaces and 3D street-view web services, which are already used commercially for infrastructure management. Mobile mapping systems (MMS) enable fast and efficient data capture of infrastructure. Most MMS used outdoors rely on direct georeferencing, which achieves absolute accuracies at the centimetre level in open areas. Under GNSS shadowing, however, the accuracy of direct georeferencing quickly drops to the decimetre or even the metre level. MMS used indoors, by contrast, are mostly based on SLAM. Most SLAM algorithms, however, have been optimised for low latency and real-time performance and therefore accept compromises in accuracy, map quality and maximum extent.
The aim of this work is to capture high-resolution RGB-D images in different environments and to georeference them accurately and reliably.
For data capture, a high-performance, image-focused, backpack-mounted MMS was developed. It consists of a multi-head panoramic camera, two multi-beam LiDAR scanners and a tactical-grade combined GNSS and IMU navigation unit. All sensors are precisely synchronised and provide access to the raw data. The complete system was calibrated in test fields using bundle-adjustment-based and feature-based methods, which is a prerequisite for integrating kinematic sensor data.
For accurate and reliable georeferencing in different environments, a multi-stage georeferencing approach was developed that combines different sensor data and georeferencing methods. Direct and LiDAR SLAM-based georeferencing provide initial poses for the subsequent image-based georeferencing using an extended SfM pipeline. Image-based georeferencing yields a precise but sparse trajectory that is suitable for georeferencing images. To obtain a dense trajectory that is also suitable for georeferencing LiDAR data, direct georeferencing was aided with poses from the image-based georeferencing.
Comprehensive performance investigations in three large, challenging test areas show the possibilities and limits of our georeferencing approach. The three test areas, in a city centre, in a forest and inside a building, represent real-world conditions with restricted GNSS reception, poor lighting, moving objects and repetitive geometric patterns.
Image-based georeferencing achieved the best accuracies, with mean precision in the range of 5 mm to 7 mm. The absolute accuracy was 85 mm to 131 mm, an improvement by a factor of 2 to 7 over direct and LiDAR SLAM-based georeferencing. Direct georeferencing with CUPT aiding from image poses of the image-based georeferencing led to a slightly degraded mean precision in the range of 13 mm to 16 mm, while the mean absolute accuracy did not differ significantly from that of image-based georeferencing.
The accuracies achieved in challenging environments confirm earlier investigations under optimal conditions and are of the same order of magnitude as the results of other research groups. They can be used to create street-view services in challenging environments for infrastructure management. Accurately and reliably georeferenced RGB-D images have great potential for future visual localisation and AR applications.
- …