Search CORE

72 research outputs found

Multi-modal Sensor Registration for Vehicle Perception via Deep Neural Networks

Author: Giering Michael
Reddy Kishore
Venugopalan Vivek
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/07/2015
Field of study

The ability to simultaneously leverage multiple modes of sensor information is critical for perception of an automated vehicle's physical surroundings. Spatio-temporal alignment of registration of the incoming information is often a prerequisite to analyzing the fused data. The persistence and reliability of multi-modal registration is therefore the key to the stability of decision support systems ingesting the fused information. LiDAR-video systems like on those many driverless cars are a common example of where keeping the LiDAR and video channels registered to common physical features is important. We develop a deep learning method that takes multiple channels of heterogeneous data, to detect the misalignment of the LiDAR-video inputs. A number of variations were tested on the Ford LiDAR-video driving test data set and will be discussed. To the best of our knowledge the use of multi-modal deep convolutional neural networks for dynamic real-time LiDAR-video registration has not been presented.Comment: 7 pages, double column, IEEE format, accepted at IEEE HPEC 201

arXiv.org e-Print Archive

Crossref

Accurate Single Image Multi-Modal Camera Pose Estimation

Author: B. Leibe
C.P. Lu
D.F. DeMenthon
D.G. Lowe
G. Penney
G. Vosselman
H. Bay
K. Mikolajczyk
P. David
P. Viola
R. Raguram
S. Benhimane
V. Lepetit
Publication venue
Publication date: 01/01/2012
Field of study

Abstract. A well known problem in photogrammetry and computer vision is the precise and robust determination of camera poses with respect to a given 3D model. In this work we propose a novel multi-modal method for single image camera pose estimation with respect to 3D models with intensity information (e.g., LiDAR data with reflectance information). We utilize a direct point based rendering approach to generate synthetic 2D views from 3D datasets in order to bridge the dimensionality gap. The proposed method then establishes 2D/2D point and local region correspondences based on a novel self-similarity distance measure. Correct correspondences are robustly identified by searching for small regions with a similar geometric relationship of local self-similarities using a Generalized Hough Transform. After backprojection of the generated features into 3D a standard Perspective-n-Points problem is solved to yield an initial camera pose. The pose is then accurately refined using an intensity based 2D/3D registration approach. An evaluation on Vis/IR 2D and airborne and terrestrial 3D datasets shows that the proposed method is applicable to a wide range of different sensor types. In addition, the approach outperforms standard global multi-modal 2D/3D registration approaches based on Mutual Information with respect to robustness and speed. Potential applications are widespread and include for instance multispectral texturing of 3D models, SLAM applications, sensor data fusion and multi-spectral camera calibration and super-resolution applications

CiteSeerX

Crossref

Automatic Registration of Optical Aerial Imagery to a LiDAR Point Cloud for Generation of City Models

Author: Abayowa Bernard Olushola
Hardie Russell C.
Yilmaz Alper
Publication venue: eCommons
Publication date: 01/08/2015
Field of study

This paper presents a framework for automatic registration of both the optical and 3D structural information extracted from oblique aerial imagery to a Light Detection and Ranging (LiDAR) point cloud without prior knowledge of an initial alignment. The framework employs a coarse to fine strategy in the estimation of the registration parameters. First, a dense 3D point cloud and the associated relative camera parameters are extracted from the optical aerial imagery using a state-of-the-art 3D reconstruction algorithm. Next, a digital surface model (DSM) is generated from both the LiDAR and the optical imagery-derived point clouds. Coarse registration parameters are then computed from salient features extracted from the LiDAR and optical imagery-derived DSMs. The registration parameters are further refined using the iterative closest point (ICP) algorithm to minimize global error between the registered point clouds. The novelty of the proposed approach is in the computation of salient features from the DSMs, and the selection of matching salient features using geometric invariants coupled with Normalized Cross Correlation (NCC) match validation. The feature extraction and matching process enables the automatic estimation of the coarse registration parameters required for initializing the fine registration process. The registration framework is tested on a simulated scene and aerial datasets acquired in real urban environments. Results demonstrates the robustness of the framework for registering optical and 3D structural information extracted from aerial imagery to a LiDAR point cloud, when co-existing initial registration parameters are unavailable

University of Dayton

Automatic co-registration of aerial imagery and untextured model data utilizing average shading gradients

Author: Ruf Boitumelo
Schmitz S.
Weinmann Martin
Publication venue: ISPRS
Publication date: 01/01/2019
Field of study

The comparison of current image data with existing 3D model data of a scene provides an efficient method to keep models up to date. In order to transfer information between 2D and 3D data, a preliminary co-registration is necessary. In this paper, we present a concept to automatically co-register aerial imagery and untextured 3D model data. To refine a given initial camera pose, our algorithm computes dense correspondence fields using SIFT flow between gradient representations of the model and camera image, from which 2D–3D correspondences are obtained. These correspondences are then used in an iterative optimization scheme to refine the initial camera pose by minimizing the reprojection error. Since it is assumed that the model does not contain texture information, our algorithm is built up on an existing method based on Average Shading Gradients (ASG) to generate gradient images based on raw geometry information only. We apply our algorithm for the co-registering of aerial photographs to an untextured, noisy mesh model. We have investigated different magnitudes of input error and show that the proposed approach can reduce the final reprojection error to a minimum of 1.27 ± 0.54 pixels, which is less than 10% of its initial value. Furthermore, our evaluation shows that our approach outperforms the accuracy of a standard Iterative Closest Point (ICP) implementation

arXiv.org e-Print Archive

KITopen

Automatic registration of Iphone images to LASER point clouds of the urban structures using shape features

Author
Publication venue: 'Copernicus GmbH'
Publication date
Field of study

Crossref

Continuous Modeling of 3D Building Rooftops From Airborne LIDAR and Imagery

Author: Jung Jaewook
Publication venue
Publication date: 20/09/2016
Field of study

In recent years, a number of mega-cities have provided 3D photorealistic virtual models to support the decisions making process for maintaining the cities' infrastructure and environment more effectively. 3D virtual city models are static snap-shots of the environment and represent the status quo at the time of their data acquisition. However, cities are dynamic system that continuously change over time. Accordingly, their virtual representation need to be regularly updated in a timely manner to allow for accurate analysis and simulated results that decisions are based upon. The concept of "continuous city modeling" is to progressively reconstruct city models by accommodating their changes recognized in spatio-temporal domain, while preserving unchanged structures. However, developing a universal intelligent machine enabling continuous modeling still remains a challenging task. Therefore, this thesis proposes a novel research framework for continuously reconstructing 3D building rooftops using multi-sensor data. For achieving this goal, we first proposes a 3D building rooftop modeling method using airborne LiDAR data. The main focus is on the implementation of an implicit regularization method which impose a data-driven building regularity to noisy boundaries of roof planes for reconstructing 3D building rooftop models. The implicit regularization process is implemented in the framework of Minimum Description Length (MDL) combined with Hypothesize and Test (HAT). Secondly, we propose a context-based geometric hashing method to align newly acquired image data with existing building models. The novelty is the use of context features to achieve robust and accurate matching results. Thirdly, the existing building models are refined by newly proposed sequential fusion method. The main advantage of the proposed method is its ability to progressively refine modeling errors frequently observed in LiDAR-driven building models. The refinement process is conducted in the framework of MDL combined with HAT. Markov Chain Monte Carlo (MDMC) coupled with Simulated Annealing (SA) is employed to perform a global optimization. The results demonstrates that the proposed continuous rooftop modeling methods show a promising aspects to support various critical decisions by not only reconstructing 3D rooftop models accurately, but also by updating the models using multi-sensor data

YorkSpace

Lifting GIS Maps into Strong Geometric Context for Scene Understanding

Author: Díaz Raúl
Fowlkes Charless C.
Lee Minhaeng
Schubert Jochen
Publication venue
Publication date: 08/01/2016
Field of study

Contextual information can have a substantial impact on the performance of visual tasks such as semantic segmentation, object detection, and geometric estimation. Data stored in Geographic Information Systems (GIS) offers a rich source of contextual information that has been largely untapped by computer vision. We propose to leverage such information for scene understanding by combining GIS resources with large sets of unorganized photographs using Structure from Motion (SfM) techniques. We present a pipeline to quickly generate strong 3D geometric priors from 2D GIS data using SfM models aligned with minimal user input. Given an image resectioned against this model, we generate robust predictions of depth, surface normals, and semantic labels. We show that the precision of the predicted geometry is substantially more accurate other single-image depth estimation methods. We then demonstrate the utility of these contextual constraints for re-scoring pedestrian detections, and use these GIS contextual features alongside object detection score maps to improve a CRF-based semantic segmentation framework, boosting accuracy over baseline models

arXiv.org e-Print Archive

Crossref

Enhancing Geomatics Techniques for Cultural Heritage through Multiwavelength Recording and Metric Data Fusion

Author: Adamopoulos Efstathios
Publication venue
Publication date: 01/01/2022
Field of study

Institutional Research Information System University of Turin