
    HDPV-SLAM: Hybrid Depth-augmented Panoramic Visual SLAM for Mobile Mapping System with Tilted LiDAR and Panoramic Visual Camera

    This paper proposes a novel visual simultaneous localization and mapping (SLAM) system called Hybrid Depth-augmented Panoramic Visual SLAM (HDPV-SLAM), which employs a panoramic camera and a tilted multi-beam LiDAR scanner to generate accurate, metrically scaled trajectories. HDPV-SLAM follows the design of RGB-D SLAM, augmenting visual features with depth information, and aims to solve the two major issues that hinder the performance of similar SLAM systems. The first is the sparseness of LiDAR depth, which makes it difficult to associate the depth with the visual features extracted from the RGB image. To address this, a deep learning-based depth estimation module is proposed that iteratively densifies the sparse LiDAR depth. The second is the difficulty of depth association caused by the lack of horizontal overlap between the panoramic camera and the tilted LiDAR sensor. To overcome this, we present a hybrid depth association module that optimally combines depth information estimated by two independent procedures: feature-based triangulation and depth estimation. During feature tracking, this module selects, for each feature, the more accurate of the two estimates: the depth triangulated from tracked visual features and the deep learning-based corrected depth. We evaluated the efficacy of HDPV-SLAM on the 18.95 km-long York University and Teledyne Optech (YUTO) MMS dataset. The experimental results demonstrate that the two proposed modules contribute substantially to the performance of HDPV-SLAM, which surpasses that of state-of-the-art (SOTA) SLAM systems.
    Comment: 8 pages, 3 figures, to be published in the IEEE International Conference on Automation Science and Engineering (CASE) 202
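    Below is a minimal NumPy sketch of the hybrid depth association idea described in this abstract: for each tracked feature, prefer the triangulated depth when its reprojection residual is small, and fall back to the learned depth otherwise. The selection criterion, threshold, and function names are illustrative assumptions, not the authors' implementation.

    ```python
    # Hedged sketch: per-feature choice between two depth sources.
    # Thresholds and inputs are illustrative, not the paper's method.
    import numpy as np

    def hybrid_depth_association(tri_depth, tri_residual, dl_depth, dl_valid,
                                 max_residual=1.0):
        """Pick, per feature, the more trustworthy of two depth sources.

        tri_depth    : (N,) depths from feature-based triangulation (NaN if none)
        tri_residual : (N,) reprojection residuals of those depths, in pixels
        dl_depth     : (N,) deep-learning-based corrected depths (NaN if none)
        dl_valid     : (N,) bool, True where the dense depth is trusted
        """
        fused = np.full_like(tri_depth, np.nan)

        # Prefer the triangulated depth when it exists and reprojects well.
        good_tri = np.isfinite(tri_depth) & (tri_residual < max_residual)
        fused[good_tri] = tri_depth[good_tri]

        # Otherwise fall back to the learned (corrected) depth.
        fallback = ~good_tri & np.isfinite(dl_depth) & dl_valid
        fused[fallback] = dl_depth[fallback]
        return fused

    # Toy usage: four features -- good triangulation, bad triangulation,
    # learned depth only, and no depth at all.
    tri = np.array([5.2, 3.1, np.nan, np.nan])
    res = np.array([0.3, 2.5, np.nan, np.nan])
    dl  = np.array([5.0, 3.0, 7.8, np.nan])
    ok  = np.array([True, True, True, False])
    print(hybrid_depth_association(tri, res, dl, ok))  # [5.2 3.  7.8 nan]
    ```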

    An Adaptive Refinement Scheme for Depth Estimation Networks

    Deep learning has proved to be a breakthrough in depth generation. However, the generalization ability of deep networks is still limited, and they cannot maintain satisfactory performance on some inputs. To address a similar problem in the segmentation field, a feature backpropagating refinement scheme (f-BRS) was proposed that refines predictions at inference time: f-BRS adapts an intermediate activation function to each input by using user clicks as sparse labels. Given the similarity between user clicks and sparse depth maps, this paper aims to extend the application of f-BRS to depth prediction. Our experiments show that f-BRS, fused with a depth estimation baseline, becomes trapped in local optima and fails to improve the network predictions. To resolve this, we propose a double-stage adaptive refinement scheme (DARS). In the first stage, a Delaunay-based correction module significantly improves the depth generated by a baseline network. In the second stage, a particle swarm optimizer (PSO) refines the estimate by fine-tuning the f-BRS parameters, namely scales and biases. DARS is evaluated on an outdoor benchmark, KITTI, and an indoor benchmark, NYUv2; in both cases the network is pre-trained on KITTI. The proposed scheme was effective on both datasets.
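    A toy sketch of the second DARS stage follows: a small particle swarm optimizer searches over a scale and bias applied to an intermediate activation so that the corrected prediction fits sparse labels. The stand-in "network" and all hyperparameters here are assumptions for illustration only; the actual method fine-tunes a pre-trained depth network.

    ```python
    # Illustrative PSO over f-BRS-style parameters (scale s, bias b).
    # The "network" is a toy stand-in, NOT the paper's depth model.
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy frozen head: depth = relu(s * f + b) decoded by a fixed layer.
    # Ground truth is generated with s=1.3, b=0.2, which PSO should recover.
    features = rng.normal(size=(100, 8))
    decoder = rng.normal(size=(8,))
    true_depth = np.maximum(features * 1.3 + 0.2, 0.0) @ decoder
    mask = rng.random(100) < 0.1            # sparse labels (~10% of pixels)

    def loss(params):
        s, b = params
        pred = np.maximum(features * s + b, 0.0) @ decoder
        return np.mean((pred[mask] - true_depth[mask]) ** 2)

    def pso(loss, dim=2, n=20, iters=100, w=0.7, c1=1.5, c2=1.5):
        x = rng.uniform(-2.0, 2.0, size=(n, dim))   # particle positions
        v = np.zeros_like(x)                        # particle velocities
        pbest = x.copy()                            # per-particle best
        pval = np.array([loss(p) for p in x])
        g = pbest[pval.argmin()].copy()             # global best
        for _ in range(iters):
            r1, r2 = rng.random((n, dim)), rng.random((n, dim))
            v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
            x = x + v
            val = np.array([loss(p) for p in x])
            improved = val < pval
            pbest[improved], pval[improved] = x[improved], val[improved]
            g = pbest[pval.argmin()].copy()
        return g

    s, b = pso(loss)
    print(f"recovered scale={s:.2f}, bias={b:.2f}")  # should land near 1.3, 0.2
    ```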

    DEM orientation based on local feature correspondence with global DEMs

    Thanks to rational polynomial coefficients (RPCs), which are provided by vendors to end users, digital elevation models (DEMs) can be derived directly from satellite stereo images. However, such DEMs are affected by systematic errors in the rational function model (RFM), known as RPC biases. Global DEMs (GDEMs), such as the Shuttle Radar Topography Mission (SRTM) model, are the most inexpensive solution for improving the accuracy of relative RFM-derived DEMs. In this article, an automatic and robust local feature-based DEM matching and orientation approach is proposed to improve the accuracy of relative RFM-derived DEMs without the use of ground control points (GCPs). The proposed approach consists of four main steps: (1) combined local feature extraction; (2) computation of the distinctive order-based self-similarity (DOBSS) descriptor; (3) feature correspondence and local consistency checking; and (4) orientation of the relative RFM-derived DEM using three-dimensional (3D) transformation models, including 3D rigid, 3D similarity, and 3D affine transformations. This technique avoids the sensitivity of conventional 3D DEM matching methods to initial values, monotonous areas, and local distortions. Experimental results on two CARTOSAT-1-derived DEMs demonstrate the superior performance of the proposed DEM matching method over state-of-the-art methods, including the SIFT, DAISY, LIOP, LBP, and BRISK descriptors, in terms of the number of correct matches (NCM) and DEM orientation accuracy. The results also show that the proposed method significantly improves the geometric accuracy of relative RFM-derived DEMs.
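    Step (4) relies on estimating a 3D transformation from matched DEM points; the sketch below shows the 3D similarity case solved in closed form with the standard Umeyama/Procrustes method. Steps (1)-(3) are omitted, and src/dst are assumed to be already-filtered correspondences.

    ```python
    # Closed-form 3D similarity fit (Umeyama). The matching pipeline that
    # produces src/dst correspondences is assumed, not shown.
    import numpy as np

    def fit_similarity_3d(src, dst):
        """Least squares for dst_i = s * R @ src_i + t; src, dst are (N, 3)."""
        mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
        xs, xd = src - mu_s, dst - mu_d
        cov = xd.T @ xs / len(src)                 # 3x3 cross-covariance
        U, D, Vt = np.linalg.svd(cov)
        S = np.eye(3)
        if np.linalg.det(U) * np.linalg.det(Vt) < 0:
            S[2, 2] = -1.0                         # guard against reflections
        R = U @ S @ Vt
        scale = (D * np.diag(S)).sum() / ((xs ** 2).sum() / len(src))
        t = mu_d - scale * R @ mu_s
        return scale, R, t

    # Toy check: recover a known scale/rotation/translation from noisy matches.
    rng = np.random.default_rng(1)
    src = rng.uniform(0.0, 100.0, size=(200, 3))
    th = 0.3
    R_true = np.array([[np.cos(th), -np.sin(th), 0.0],
                       [np.sin(th),  np.cos(th), 0.0],
                       [0.0,         0.0,        1.0]])
    dst = 1.05 * src @ R_true.T + np.array([10.0, -5.0, 2.0])
    dst += rng.normal(scale=0.1, size=dst.shape)
    s, R, t = fit_similarity_3d(src, dst)
    print(round(s, 3), np.round(t, 2))             # ~1.05 and [10. -5.  2.]
    ```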

    An Efficient Multi-Sensor Remote Sensing Image Clustering in Urban Areas via Boosted Convolutional Autoencoder (BCAE)

    High-resolution urban image clustering remains a challenging task, mainly because its performance strongly depends on the discriminative power of the features. Recently, several studies have focused on unsupervised learning methods that use autoencoders to learn and extract more efficient features for clustering. This paper proposes a Boosted Convolutional AutoEncoder (BCAE) method based on feature learning for efficient urban image clustering. The proposed method was applied to multi-sensor remote-sensing images through a multistep workflow. The optical data were first preprocessed by applying a Minimum Noise Fraction (MNF) transformation. These MNF features, together with the normalized Digital Surface Model (nDSM) and vegetation indices such as the Normalized Difference Vegetation Index (NDVI) and Excess Green (ExG(2)), were then used as the inputs of the BCAE model. Next, the proposed convolutional autoencoder was trained to automatically encode upgraded features, boosting the hand-crafted features into more clustering-friendly ones. The Mini-Batch K-Means algorithm was then employed to cluster the deep features. Finally, comparative feature sets were manually designed in three modes to demonstrate the efficiency of the proposed method in extracting compelling features. Experiments on three datasets show the efficiency of BCAE for feature learning. According to the experimental results, the proposed method makes the final features more suitable for clustering and also accounts for the spatial correlation among pixels during feature learning.
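    A compact sketch of this workflow: a small convolutional autoencoder (standing in for BCAE) learns per-patch codes from stacked hand-crafted channels, and Mini-Batch K-Means clusters the codes. The architecture, patch size, and channel count are illustrative guesses, not the paper's configuration.

    ```python
    # Hedged sketch: autoencoder feature learning + Mini-Batch K-Means.
    # Layer sizes and the 9x9 patch layout are assumptions.
    import torch
    import torch.nn as nn
    from sklearn.cluster import MiniBatchKMeans

    class ConvAE(nn.Module):
        def __init__(self, bands=6, code=16):
            super().__init__()
            self.enc = nn.Sequential(                  # 9x9 patch -> code vector
                nn.Conv2d(bands, 32, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(3),                       # 9x9 -> 3x3
                nn.Conv2d(32, code, 3), nn.ReLU(),     # 3x3 -> 1x1
                nn.Flatten())
            self.dec = nn.Sequential(                  # code -> reconstruction
                nn.Linear(code, bands * 9 * 9),
                nn.Unflatten(1, (bands, 9, 9)))

        def forward(self, x):
            z = self.enc(x)
            return self.dec(z), z

    # Stacked per-patch inputs: MNF bands + nDSM + NDVI + ExG channels.
    patches = torch.rand(2048, 6, 9, 9)                # dummy data for the sketch
    model = ConvAE()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for epoch in range(20):                            # reconstruction training
        recon, _ = model(patches)
        loss = nn.functional.mse_loss(recon, patches)
        opt.zero_grad()
        loss.backward()
        opt.step()

    with torch.no_grad():                              # cluster the deep features
        _, codes = model(patches)
    labels = MiniBatchKMeans(n_clusters=5, n_init=10).fit_predict(codes.numpy())
    print(labels[:10])
    ```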

    3-D Convolution-Recurrent Networks for Spectral-Spatial Classification of Hyperspectral Images

    Three-dimensional convolutional neural networks (3-D CNNs) have recently attracted considerable attention for the spectral-spatial classification of hyperspectral imagery (HSI). In this model, the feed-forward processing structure reduces the computational burden of 3-D structural processing. However, as a vector-based methodology, it cannot analyze the full content of the HSI information, and as a result its features are not very discriminative. On the other hand, convolutional long short-term memory (CLSTM) can recurrently analyze 3-D structural data to extract more discriminative and abstract features, but its computational burden as a sequence-based methodology is extremely high. Robust spectral-spatial feature extraction at a reasonable computational cost is therefore of great interest in HSI classification. For this purpose, a two-stage method based on the integration of CNN and CLSTM is proposed. In the first stage, a 3-D CNN extracts low-dimensional shallow spectral-spatial features from the HSI, in which the spatial information content is lower than the spectral; consequently, in the second stage, CLSTM is applied, for the first time, to recurrently analyze the spatial information while taking the spectral information into account. Experimental results on three widely used HSI datasets indicate that the recurrent analysis for spatial feature extraction makes the proposed model robust to different spatial sizes of the extracted patches. Moreover, applying the 3-D CNN prior to the CLSTM efficiently reduces the model's computational burden. The experimental results also indicate that the proposed model yields a 1% to 2% improvement over its counterpart models.
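    A rough PyTorch sketch of the two-stage design: a 3-D convolution first compresses the spectral axis, then a convolutional LSTM cell (written out by hand, since PyTorch has no built-in ConvLSTM) scans the reduced bands recurrently. All layer sizes below are illustrative, not the paper's architecture.

    ```python
    # Hedged sketch of a 3-D CNN followed by a recurrent ConvLSTM scan.
    import torch
    import torch.nn as nn

    class ConvLSTMCell(nn.Module):
        def __init__(self, in_ch, hid_ch, k=3):
            super().__init__()
            # One conv produces all four LSTM gates at once.
            self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

        def forward(self, x, h, c):
            i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
            i, f, o, g = i.sigmoid(), f.sigmoid(), o.sigmoid(), g.tanh()
            c = f * c + i * g
            h = o * c.tanh()
            return h, c

    class CNN_CLSTM(nn.Module):
        def __init__(self, n_classes=9):
            super().__init__()
            # Stage 1: 3-D CNN shrinks the spectral axis (stride 2 on bands).
            self.cnn3d = nn.Conv3d(1, 8, kernel_size=(7, 3, 3),
                                   stride=(2, 1, 1), padding=(0, 1, 1))
            self.cell = ConvLSTMCell(8, 16)        # Stage 2: recurrent analysis
            self.head = nn.Linear(16, n_classes)

        def forward(self, x):                      # x: (B, 1, bands, H, W)
            feats = self.cnn3d(x).relu()           # (B, 8, T, H, W)
            B, C, T, H, W = feats.shape
            h = feats.new_zeros(B, 16, H, W)
            c = feats.new_zeros(B, 16, H, W)
            for t in range(T):                     # scan reduced bands in order
                h, c = self.cell(feats[:, :, t], h, c)
            return self.head(h.mean(dim=(2, 3)))   # pool and classify

    logits = CNN_CLSTM()(torch.rand(4, 1, 103, 9, 9))  # e.g. a 103-band cube
    print(logits.shape)                                # torch.Size([4, 9])
    ```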

    Application of 30-meter global digital elevation models for compensating rational polynomial coefficients biases

    Generating precise digital elevation models (DEMs) from stereo satellite images using rational polynomial coefficients (RPCs) usually requires several ground control points (GCPs), mainly because of RPC biases. However, since GCP collection is a time-consuming and expensive process, global DEMs (GDEMs), as the most inexpensive geospatial information, can be used instead to improve stereo-imagery-based DEMs (IB-DEMs). In this study, a 2.5D mutual-information-based DEM matching between a GDEM and an IB-DEM was introduced for the bias correction of satellite stereo images. Three well-known 30-meter GDEMs, namely SRTM, ASTER, and AW3D30, were used and compared to assess the efficiency of this approach. The performance of the proposed method was evaluated by processing stereo images acquired by the CARTOSAT-1 satellite over two regions with flat, hilly, and mountainous topography. The evaluation results revealed that the proposed method could significantly improve the geometric accuracy of the IB-DEM with all three GDEMs.
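    A toy illustration of the mutual-information criterion behind such matching: slide one DEM over the other and keep the integer shift whose joint height histogram yields the highest mutual information. Real RPC bias compensation operates on the sensor model; this sketch only demonstrates the MI objective on synthetic terrain.

    ```python
    # Hedged sketch: mutual information between height grids as an
    # alignment score, searched over integer pixel shifts.
    import numpy as np

    def mutual_info(a, b, bins=32):
        """MI (in nats) between two equally sized grids, via a joint histogram."""
        hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
        pxy = hist / hist.sum()
        px, py = pxy.sum(axis=1), pxy.sum(axis=0)
        nz = pxy > 0
        return (pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz])).sum()

    def best_shift(gdem, ibdem, search=5):
        """Exhaustive search over integer shifts of ibdem against gdem."""
        h, w = ibdem.shape
        best, argbest = -np.inf, (0, 0)
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                a = gdem[search + dy: search + dy + h,
                         search + dx: search + dx + w]
                mi = mutual_info(a, ibdem)
                if mi > best:
                    best, argbest = mi, (dx, dy)
        return argbest

    # Synthetic terrain with a known pixel offset between the two DEMs.
    rng = np.random.default_rng(2)
    terrain = rng.normal(size=(70, 70)).cumsum(0).cumsum(1)  # smooth-ish surface
    dy0, dx0 = -1, 2                                         # offset to recover
    ibdem = (terrain[5 + dy0: 5 + dy0 + 60, 5 + dx0: 5 + dx0 + 60]
             + rng.normal(scale=0.1, size=(60, 60)))
    print(best_shift(terrain, ibdem))                        # expected: (2, -1)
    ```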