Building with Drones: Accurate 3D Facade Reconstruction using MAVs
Automatic reconstruction of 3D models from images using multi-view
Structure-from-Motion methods has been one of the most fruitful outcomes of
computer vision. These advances, combined with the growing popularity of Micro
Aerial Vehicles (MAVs) as an autonomous imaging platform, have made 3D vision
tools ubiquitous for a large number of Architecture, Engineering and Construction
applications, among audiences mostly unskilled in computer vision. However, to
obtain high-resolution and accurate reconstructions from a large-scale object
using SfM, there are many critical constraints on the quality of the image data,
which often become sources of inaccuracy because current 3D reconstruction
pipelines do not allow users to assess the fidelity of the input data
during image acquisition. In this paper, we present and advocate a
closed-loop interactive approach that performs incremental reconstruction in
real-time and gives users online feedback about quality parameters such as
Ground Sampling Distance (GSD) and image redundancy on a surface mesh. We
also propose a novel multi-scale camera network design to prevent scene drift
caused by incremental map building, and release the first multi-scale image
sequence dataset as a benchmark. Further, we evaluate our system on real
outdoor scenes, and show that our interactive pipeline combined with a
multi-scale camera network approach provides compelling accuracy in multi-view
reconstruction tasks when compared against state-of-the-art methods.
Comment: 8 pages, 2015 IEEE International Conference on Robotics and
Automation (ICRA '15), Seattle, WA, US
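The online GSD feedback the paper advocates rests on a standard pinhole-camera relation; below is a minimal sketch of that calculation, assuming hypothetical camera parameters and a function name (`gsd_cm_per_px`) of our own. It is not the authors' implementation.

```python
def gsd_cm_per_px(sensor_width_mm, focal_length_mm, distance_m, image_width_px):
    """Pinhole-camera approximation of Ground Sampling Distance (GSD):
    the size of one image pixel projected onto the surface, in cm/px.
    Smaller GSD means finer recoverable surface detail."""
    footprint_m = (sensor_width_mm / focal_length_mm) * distance_m  # mm/mm cancels
    return footprint_m / image_width_px * 100.0  # metres -> centimetres

# Hypothetical MAV camera: 6.17 mm sensor, 4.5 mm lens, 4000 px wide images,
# imaging a facade from a 10 m stand-off distance.
print(gsd_cm_per_px(6.17, 4.5, 10.0, 4000))  # ~0.343 cm/px
```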
Evaluating Point Cloud Quality via Transformational Complexity
Full-reference point cloud quality assessment (FR-PCQA) aims to infer the
quality of distorted point clouds with available references. Merging research
from cognitive science with intuitions about the human visual system (HVS),
the difference between the expected perceptual result and the actual
perceptual reproduction in the visual center of the cerebral cortex indicates
the subjective quality degradation. Therefore, in this paper, we derive
the point cloud quality by measuring the complexity of transforming the
distorted point cloud back to its reference, which in practice can be
approximated by the code length of one point cloud when the other is given. For
this purpose, we first segment the reference and the distorted point cloud into
a series of local patch pairs based on a 3D Voronoi diagram. Next, motivated
by predictive coding theory, we utilize a space-aware vector
autoregressive (SA-VAR) model to encode the geometry and color channels of each
reference patch, both with and without conditioning on the distorted patch.
Specifically, assuming that the residual errors follow multivariate
Gaussian distributions, we calculate the self-complexity of the reference and
the transformational complexity between the reference and the distorted sample
via their covariance matrices. Besides the complexity terms, the prediction terms
generated by SA-VAR are introduced as an auxiliary feature to promote the
final quality prediction. Extensive experiments on five public point cloud
quality databases demonstrate that the transformational complexity based
distortion metric (TCDM) produces state-of-the-art (SOTA) results, and ablation
studies examining its key modules and parameters further show that the metric
generalizes to various scenarios with consistent performance.
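The code-length computation the abstract alludes to has a closed form under the stated Gaussian assumption: the expected bits per sample equal the differential entropy, which depends only on the residual covariance. The sketch below shows that textbook step; the function name and toy residuals are our own illustration, not the TCDM code.

```python
import numpy as np

def gaussian_code_length_bits(residuals):
    """Expected bits per sample for zero-mean Gaussian-modeled residuals:
    h(X) = 0.5 * log2((2*pi*e)^d * det(Sigma))."""
    residuals = np.asarray(residuals, dtype=float)
    d = residuals.shape[1]
    # a small ridge keeps the covariance well-conditioned on tiny patches
    sigma = np.cov(residuals, rowvar=False) + 1e-9 * np.eye(d)
    _, logdet = np.linalg.slogdet(2.0 * np.pi * np.e * sigma)
    return 0.5 * logdet / np.log(2.0)

# Toy example: prediction residuals for 3 channels of a patch.
rng = np.random.default_rng(0)
self_res = 0.1 * rng.standard_normal((500, 3))   # reference predicted from itself
cross_res = 0.5 * rng.standard_normal((500, 3))  # predicted from the distorted patch
print(gaussian_code_length_bits(self_res))   # lower: self-complexity
print(gaussian_code_length_bits(cross_res))  # higher: transformational complexity
```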
Point cloud geometry compression using neural implicit representations
In recent years, the increasing prominence of 3D point clouds in various applications has led to an escalating need for efficient storage and transmission methods. The sheer size of these point cloud datasets presents challenges in rendering, transmission, and general usability. This thesis introduces a novel approach to point cloud geometry compression leveraging neural implicit representations, specifically through the use of a DiGS network model. By training this model on a single point cloud, we achieve a compact neural representation of its geometry. Notably, this representation allows for the reconstruction of the point cloud at an arbitrary resolution. After training the reconstruction network, dynamic quantization is applied to the trained weights, significantly reducing the overall bitrate without strongly compromising the quality of the reconstructed point cloud. Dequantization is then used to rebuild a high-fidelity representation of the original point cloud. Our experimental results demonstrate the efficacy of this approach in terms of compression ratios and reconstruction quality, assessed using PSNR relative to the bitrate. This research provides a promising direction for efficient point cloud geometry storage and transmission, addressing some of the growing demands of the 3D data era.
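The weight-quantization step described above can be sketched with PyTorch's built-in dynamic quantization; the small MLP below is only a stand-in for the actual DiGS network, whose architecture is not specified here, so the module and its sizes are placeholders.

```python
import torch
import torch.nn as nn

# Placeholder implicit-geometry network (the thesis uses DiGS; this
# small MLP mapping xyz -> signed distance is only illustrative).
model = nn.Sequential(
    nn.Linear(3, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 1),
)

# Dynamic quantization: Linear weights are stored as int8, shrinking
# the bitrate of the trained network; dequantization happens on the fly.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

query = torch.rand(1024, 3)   # xyz query points in the unit cube
sdf = quantized(query)        # decode geometry from the compact net
print(sdf.shape)              # torch.Size([1024, 1])
```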
Micro Fourier Transform Profilometry (μFTP): 3D shape measurement at 10,000 frames per second
Recent advances in imaging sensors and digital light projection technology
have facilitated a rapid progress in 3D optical sensing, enabling 3D surfaces
of complex-shaped objects to be captured with improved resolution and accuracy.
However, due to the large number of projection patterns required for phase
recovery and disambiguation, the maximum frame rates of current 3D shape
measurement techniques are still limited to the range of hundreds of frames per
second (fps). Here, we demonstrate a new 3D dynamic imaging technique, Micro
Fourier Transform Profilometry (μFTP), which can capture 3D surfaces of
transient events at up to 10,000 fps based on our newly developed high-speed
fringe projection system. Compared with existing techniques, μFTP has the
prominent advantage of recovering an accurate, unambiguous, and dense 3D point
cloud with only two projected patterns. Furthermore, the phase information is
encoded within a single high-frequency fringe image, thereby allowing
motion-artifact-free reconstruction of transient events with a temporal
resolution of 50 microseconds. To show μFTP's broad utility, we use it to
reconstruct 3D videos of four transient scenes: vibrating cantilevers, rotating
fan blades, a bullet fired from a toy gun, and a balloon's explosion triggered
by a flying dart, scenes that were previously difficult or even impossible to
capture with conventional approaches.
Comment: This manuscript was originally submitted on 30th January 1
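The single-shot phase recovery that μFTP builds on is classical Fourier Transform Profilometry: isolate the fringe carrier's +1 lobe in the Fourier domain, shift it to baseband, and take the angle of the inverse transform. Below is a minimal sketch of that textbook step on a synthetic fringe; the carrier frequency and filter width are illustrative choices, not the paper's parameters.

```python
import numpy as np

def ftp_wrapped_phase(fringe, carrier_fx, half_width):
    """Classical FTP demodulation of a single fringe image: band-pass
    the +1 carrier lobe and return the wrapped phase in (-pi, pi]."""
    spectrum = np.fft.fftshift(np.fft.fft2(fringe))
    _, w = fringe.shape
    cx = w // 2 + carrier_fx                      # column of the +1 lobe
    mask = np.zeros_like(spectrum)
    mask[:, cx - half_width : cx + half_width] = 1.0
    shifted = np.roll(spectrum * mask, -carrier_fx, axis=1)  # lobe -> baseband
    analytic = np.fft.ifft2(np.fft.ifftshift(shifted))
    return np.angle(analytic)

# Synthetic test: a 64-cycle carrier modulated by a smooth Gaussian bump.
y, x = np.mgrid[0:512, 0:512]
bump = 4.0 * np.exp(-((x - 256) ** 2 + (y - 256) ** 2) / (2 * 60.0 ** 2))
fringe = 0.5 + 0.5 * np.cos(2 * np.pi * 64 * x / 512 + bump)
wrapped = ftp_wrapped_phase(fringe, carrier_fx=64, half_width=32)
```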
Increasing the Efficiency of 6-DoF Visual Localization Using Multi-Modal Sensory Data
Localization is a key requirement for mobile robot autonomy and human-robot
interaction. Vision-based localization is accurate and flexible, however, it
incurs a high computational burden which limits its application on many
resource-constrained platforms. In this paper, we address the problem of
performing real-time localization in large-scale 3D point cloud maps of
ever-growing size. While most systems using multi-modal information reduce
localization time by employing side-channel information in a coarse manner (e.g.,
WiFi for a rough prior position estimate), we propose to interweave the map
with rich sensory data. This multi-modal approach achieves two key goals
simultaneously. First, it enables us to harness additional sensory data to
localise against a map covering a vast area in real-time; second, it
allows us to roughly localise devices that are not equipped with a camera. The
key to our approach is a localization policy based on a sequential Monte Carlo
estimator. The localiser uses this policy to attempt point-matching only in
nodes where it is likely to succeed, significantly increasing the efficiency of
the localization process. The proposed multi-modal localization system is
evaluated extensively in a large museum building. The results show that our
multi-modal approach not only increases localization accuracy but also
significantly reduces computation time.
Comment: Presented at IEEE-RAS International Conference on Humanoid Robots
(Humanoids) 201
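The "localization policy based on a sequential Monte Carlo estimator" can be pictured as a particle filter over map nodes that gates the expensive visual point-matching step. The sketch below shows that gating idea only; the motion model, side-channel likelihood, and threshold are invented for illustration and are not the authors' implementation.

```python
import random
from collections import Counter

def smc_localization_step(particles, move, side_channel_llh, match_threshold=0.3):
    """One sequential Monte Carlo update over discrete map nodes.

    particles: list of node ids; move(node) is a motion model;
    side_channel_llh(node) is a cheap likelihood (e.g. from WiFi).
    Returns (resampled particles, nodes worth running point-matching in)."""
    particles = [move(p) for p in particles]             # 1. predict
    weights = [side_channel_llh(p) for p in particles]   # 2. weight cheaply
    if sum(weights) <= 0:
        weights = None                                   # uniform fallback
    particles = random.choices(particles, weights=weights, k=len(particles))
    # 3. policy: only attempt expensive visual matching where enough
    #    probability mass has concentrated
    mass = Counter(particles)
    candidates = [n for n, c in mass.items()
                  if c / len(particles) >= match_threshold]
    return particles, candidates
```

Point-matching then runs only in the returned candidate nodes, which is what keeps the per-frame cost bounded as the map grows.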