
    Recurrent Multiresolution Convolutional Networks for VHR Image Classification

    Classification of very high resolution (VHR) satellite images poses three major challenges: 1) inherent low intra-class and high inter-class spectral similarities, 2) the mismatching resolution of the available bands, and 3) the need to regularize noisy classification maps. Conventional methods address these challenges through separate stages of image fusion, feature extraction, and post-classification map regularization. These processing stages, however, are not jointly optimized for the classification task at hand. In this study, we propose a single-stage framework that embeds the processing stages in a recurrent multiresolution convolutional network trained end-to-end. The feedforward version of the network, called FuseNet, matches the resolution of the panchromatic and multispectral bands of a VHR image using convolutional layers with corresponding downsampling and upsampling operations. Contextual label information is incorporated into FuseNet by means of a recurrent version called ReuseNet. We compared FuseNet and ReuseNet against separate processing steps for image fusion, e.g. pansharpening and resampling through interpolation, and for map regularization, e.g. conditional random fields. We carried out our experiments on a land cover classification task using a WorldView-3 image of Quezon City, Philippines, and the ISPRS 2D semantic labeling benchmark dataset of Vaihingen, Germany. FuseNet and ReuseNet surpass the baseline approaches in both quantitative and qualitative results.
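The resolution-matching step that FuseNet learns with convolutional up- and downsampling can be contrasted with its simplest fixed counterpart. The sketch below is only an illustration of that fixed baseline (nearest-neighbour replication by an integer factor, as in resampling through interpolation); the function name and the 4x scale factor are assumptions, not part of the paper.

```python
def upsample_nearest(band, factor):
    """Upsample a 2-D band by an integer factor using nearest-neighbour
    replication -- the fixed resolution-matching step that FuseNet
    replaces with learned convolutional upsampling."""
    return [
        [band[i // factor][j // factor]
         for j in range(len(band[0]) * factor)]
        for i in range(len(band) * factor)
    ]

# A 2x2 multispectral band upsampled by 4 to the grid of an 8x8 panchromatic band
ms = [[10, 20],
      [30, 40]]
pan_grid = upsample_nearest(ms, 4)
print(len(pan_grid), len(pan_grid[0]))  # 8 8
print(pan_grid[0][0], pan_grid[7][7])   # 10 40
```

Once both inputs live on the same grid, per-pixel fusion and classification become straightforward, which is exactly what the end-to-end network exploits.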

    Vectorizing Planar Roof Structure From Very High Resolution Remote Sensing Images Using Transformers

    Grasping the roof structure of a building is a key part of building reconstruction. Directly predicting the geometric structure of the roof from a raster image as a vectorized representation, however, remains challenging. This paper introduces an efficient and accurate parsing method based on a vision Transformer, dubbed Roof-Former. Our method consists of three steps: 1) image encoding and edge node initialization, 2) image feature fusion with an enhanced segmentation refinement branch, and 3) edge filtering and structural reasoning. The vertex and edge heat map F1-scores increase by 2.0% and 1.9% on the VWB dataset when compared to HEAT. Additionally, qualitative evaluations suggest that our method is superior to the current state of the art, indicating its effectiveness at extracting global image information and maintaining the consistency and topological validity of the roof structure.

    GlaViTU: A Hybrid CNN-Transformer for Multi-Regional Glacier Mapping from Multi-Source Data

    Glacier mapping is essential for studying and monitoring the impacts of climate change. However, several challenges, such as debris-covered ice and highly variable landscapes across glacierized regions worldwide, complicate large-scale glacier mapping in a fully automated manner. This work presents a novel hybrid CNN-transformer model (GlaViTU) for multi-regional glacier mapping. Our model outperforms three baseline models - SETR-B/16, ResU-Net and TransU-Net - achieving a higher mean IoU of 0.875, and demonstrates better generalization ability. The proposed model is also parameter-efficient, with approximately 10 and 3 times fewer parameters than SETR-B/16 and ResU-Net, respectively. Our results provide a solid foundation for future studies on the application of deep learning methods for global glacier mapping. To facilitate reproducibility, we have shared our data set, codebase and pretrained models on GitHub at https://github.com/konstantin-a-maslov/GlaViTU-IGARSS2023.

    Building polygon extraction from aerial images and digital surface models with a frame field learning framework

    Deep learning-based models for building delineation from remotely sensed images face the challenge of producing precise and regular building outlines. This study investigates the combination of normalized digital surface models (nDSMs) with aerial images to optimize the extraction of building polygons using the frame field learning method. Results are evaluated at pixel, object, and polygon levels. In addition, an analysis is performed to assess the statistical deviations in the number of vertices of building polygons compared with the reference. The comparison of the number of vertices focuses on finding the output polygons that are the easiest for human analysts to edit in operational applications, and can serve as guidance for reducing the post-processing workload of obtaining high-accuracy building footprints. Experiments conducted in Enschede, the Netherlands, demonstrate that introducing the nDSM reduces the number of false positives and prevents missing real buildings on the ground. The positional accuracy and shape similarity were improved, resulting in better-aligned building polygons. The method achieved a mean intersection over union (IoU) of 0.80 with the fused data (RGB + nDSM) against an IoU of 0.57 with the baseline (using RGB only) in the same area. A qualitative analysis of the results shows that the investigated model predicts more precise and regular polygons for large and complex structures.
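The pixel-level metric the abstract reports (mean IoU of 0.80 vs. 0.57) follows the standard intersection-over-union definition. A minimal sketch, with masks flattened to 0/1 lists for brevity (the flat representation is an assumption for the example, not how the study stores its rasters):

```python
def iou(pred, ref):
    """Pixel-wise intersection over union between a predicted and a
    reference binary building mask, given as flat 0/1 lists."""
    inter = sum(1 for p, r in zip(pred, ref) if p and r)
    union = sum(1 for p, r in zip(pred, ref) if p or r)
    return inter / union if union else 1.0  # both masks empty: perfect match

pred = [1, 1, 1, 0, 0, 1]
ref  = [1, 1, 0, 0, 1, 1]
print(round(iou(pred, ref), 2))  # 0.6
```

Mean IoU then averages this score over classes or evaluation tiles, which is why a single number can summarize an entire test area.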

    Investigating Sar-Optical Deep Learning Data Fusion to Map the Brazilian Cerrado Vegetation with Sentinel Data

    Despite its environmental and societal importance, accurately mapping the Brazilian Cerrado's vegetation is still an open challenge. Its diverse but spectrally similar physiognomies are difficult to identify and map with state-of-the-art methods using only medium- to high-resolution optical images. This work investigates the fusion of Synthetic Aperture Radar (SAR) and optical data in convolutional neural network architectures to map the Cerrado according to a 2-level class hierarchy. Additionally, the proposed model is designed to deal with the uncertainties introduced by the difference in resolution between the input images (at 10 m) and the reference data (at 30 m). We tested four data fusion strategies and show that the position at which the data are combined is important for the network to learn better features.

    Batch Mode Active Learning Methods for the Interactive Classification of Remote Sensing Images

    This paper investigates different batch mode active learning techniques for the classification of remote sensing (RS) images with support vector machines (SVMs). This is done by generalizing techniques defined for binary classifiers to multiclass problems. The investigated techniques exploit different query functions, which are based on the evaluation of two criteria: uncertainty and diversity. The uncertainty criterion is associated with the confidence of the supervised algorithm in correctly classifying the considered sample, while the diversity criterion aims at selecting a set of unlabeled samples that are as diverse (i.e., as distant from one another) as possible, thus reducing the redundancy among the selected samples. The combination of the two criteria results in the selection of the potentially most informative set of samples at each iteration of the active learning process. Moreover, we propose a novel query function that is based on a kernel clustering technique for assessing the diversity of samples and a new strategy for selecting the most informative representative sample from each cluster. The investigated and proposed techniques are theoretically and experimentally compared with state-of-the-art methods adopted for RS applications. This is accomplished by considering VHR multispectral and hyperspectral images. From this comparison we observed that the proposed method yields better accuracy than the other investigated and state-of-the-art methods on the considered data sets. Furthermore, we derived some guidelines on the design of active learning systems for the classification of different types of RS images.
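The uncertainty-then-diversity combination described above can be sketched as a two-step query function: keep a pool of the most uncertain samples, then greedily pick from the pool the samples farthest from those already chosen. This is a simplified stand-in for the paper's actual query functions (the greedy max-min rule replaces the kernel clustering step, and all names and parameters are assumptions):

```python
def select_batch(uncertainty, features, batch_size, pool_size, dist):
    """Batch-mode active learning query, simplified:
    1) keep the pool_size most uncertain unlabeled samples;
    2) greedily grow the batch with the pooled sample whose minimum
       distance to the already-selected samples is largest (diversity).
    Assumes batch_size <= pool_size."""
    pool = sorted(range(len(uncertainty)),
                  key=lambda i: -uncertainty[i])[:pool_size]
    batch = [pool[0]]  # start from the single most uncertain sample
    while len(batch) < batch_size:
        best = max((i for i in pool if i not in batch),
                   key=lambda i: min(dist(features[i], features[j])
                                     for j in batch))
        batch.append(best)
    return batch

# Toy 1-D features: sample 1 is uncertain but redundant with sample 0,
# so the diversity criterion prefers the distant sample 2 instead.
unc   = [0.9, 0.8, 0.85, 0.1, 0.7]
feats = [0.0, 0.1, 5.0, 9.0, 2.5]
print(select_batch(unc, feats, 2, 4, lambda a, b: abs(a - b)))  # [0, 2]
```

Selecting by uncertainty alone would return samples 0 and 2 here only by luck of the scores; with near-duplicate samples the diversity term is what prevents querying redundant labels.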

    A Novel Context-Sensitive SVM for Classification of Remote Sensing Images

    In this paper, a novel context-sensitive classification technique based on Support Vector Machines (CS-SVM) is proposed. This technique aims at exploiting the promising SVM method for the classification of 2-D (or n-D) scenes by considering the spatial-context information of the pixel to be analyzed. In greater detail, the proposed architecture exploits the spatial-context information for: i) increasing the robustness of the SVM learning procedure to the noise present in the training set (mislabeled training samples); and ii) regularizing the classification maps. The first property is achieved by introducing a context-sensitive term in the objective function to be minimized for defining the decision hyperplane in the SVM kernel space. The second property is obtained by including the information of neighboring pixels in the classification procedure of a generic pattern. Experiments carried out on very high geometrical resolution images confirm the validity of the proposed technique.
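To make the map-regularization idea concrete, the sketch below applies a simple neighborhood majority vote to a label map. Note the hedge: CS-SVM builds the neighboring-pixel information directly into the SVM decision process rather than applying a post-hoc filter; this stand-alone filter only illustrates the kind of spatial smoothing that such context information provides.

```python
from collections import Counter

def majority_filter(label_map, radius=1):
    """Replace each pixel's label by the majority label in its
    (2*radius+1)^2 neighborhood -- a post-hoc illustration of the
    spatial-context regularization that CS-SVM embeds in its objective."""
    rows, cols = len(label_map), len(label_map[0])
    out = [row[:] for row in label_map]
    for r in range(rows):
        for c in range(cols):
            votes = Counter(
                label_map[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1)))
            out[r][c] = votes.most_common(1)[0][0]
    return out

# An isolated mislabeled pixel inside a homogeneous region is corrected.
noisy = [[1, 1, 1],
         [1, 0, 1],
         [1, 1, 1]]
print(majority_filter(noisy))  # [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

Embedding the context term inside the learning objective, as the paper does, goes further than this: it also protects the training stage itself against mislabeled samples, which no post-classification filter can do.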

    Advanced Techniques for the Classification of Very High Resolution and Hyperspectral Remote Sensing Images

    This thesis addresses the classification of images from the last generation of very high resolution (VHR) and hyperspectral remote sensing (RS) systems, which can acquire imagery at very high resolution from satellite and airborne platforms. In particular, these systems can acquire VHR multispectral images with a geometric resolution on the order of one meter or smaller, and hyperspectral images with hundreds of bands associated with narrow spectral channels. This type of data allows precise characterization of the different materials on the ground and/or the geometrical properties of the different objects (e.g., buildings, streets, agricultural fields) in the scene under investigation. These remotely sensed data provide very useful information for several applications related to the monitoring of the natural environment and of human structures. However, in order to develop real-world applications with VHR and hyperspectral data, it is necessary to define automatic techniques for an efficient and effective analysis of the data. Here, we focus our attention on RS image classification, which is at the basis of most applications related to environmental monitoring. Image classification translates the features that represent the information present in the data into thematic maps of land cover types, by solving a pattern recognition problem. However, the huge amount of data associated with VHR and hyperspectral RS images makes the classification problem very complex, and the available techniques are still inadequate for analyzing these kinds of data. For this reason, the general objective of this thesis is to develop novel techniques for the analysis and classification of VHR and hyperspectral images, in order to improve the capability to automatically extract useful information from these data and to exploit it in real applications.
Moreover, we address the classification of RS images in operational conditions where the available reference labeled samples are few and/or not completely reliable (which is quite common in many real problems). In particular, the following specific issues are considered in this work: 1) development of feature selection techniques for the classification of hyperspectral images, identifying a subset of the original features that exhibits both high capability to discriminate among the considered classes and high invariance in the spatial domain of the scene; 2) classification of RS images when the available training set is not fully reliable, i.e., some labeled samples may be associated with the wrong information class (mislabeled patterns); 3) active learning techniques for the interactive classification of RS images; and 4) definition of a protocol for accuracy assessment in the classification of VHR images, based on the analysis of both thematic and geometric accuracy. For each topic, an in-depth study of the literature is carried out and the limitations of currently published methodologies are highlighted. Starting from this analysis, novel solutions are theoretically developed, implemented, and applied to real RS data in order to verify their effectiveness. The obtained experimental results confirm the effectiveness of all the proposed techniques.
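The feature selection issue listed first is typically tackled with a greedy search over feature subsets. The sketch below shows plain forward selection driven by an abstract subset score; the toy per-feature weights stand in for the thesis's combined discriminability/spatial-invariance criterion and are an assumption, not the criterion the thesis actually defines.

```python
def forward_select(n_features, score, k):
    """Greedy forward selection: repeatedly add the feature that most
    improves the subset score, until k features are chosen."""
    selected = []
    for _ in range(k):
        best = max((f for f in range(n_features) if f not in selected),
                   key=lambda f: score(selected + [f]))
        selected.append(best)
    return selected

# Toy criterion: each hyperspectral band has a fixed usefulness weight.
weights = [0.2, 0.9, 0.4, 0.8]
score = lambda subset: sum(weights[f] for f in subset)
print(forward_select(4, score, 2))  # [1, 3]
```

In practice the score would be non-additive (e.g. penalizing redundant, spectrally adjacent bands), which is precisely why a search strategy is needed instead of ranking features independently.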

    Deep Fully Convolutional Networks for the Detection of Informal Settlements in VHR Images

    This letter investigates fully convolutional networks (FCNs) for the detection of informal settlements in very high resolution (VHR) satellite images. Informal settlements or slums are proliferating in developing countries, and their detection and classification provide vital information for decision making and for planning urban upgrading processes. Distinguishing different urban structures in VHR images is challenging because of the abstract semantic definition of the classes, as opposed to the separation of standard land-cover classes. This task requires the extraction of texture and spatial features. To this end, we introduce deep FCNs to perform pixel-wise image labeling by automatically learning a higher-level representation of the data. Deep FCNs can learn a hierarchy of features associated with increasing levels of abstraction, from raw pixel values to edges and corners up to complex spatial patterns. We present a deep FCN using dilated convolutions of increasing spatial support. It is capable of learning informative features capturing long-range pixel dependencies while keeping a limited number of network parameters. Experiments carried out on a QuickBird image acquired over the city of Dar es Salaam, Tanzania, show that the proposed FCN outperforms state-of-the-art convolutional networks. Moreover, the computational cost of the proposed technique is significantly lower than that of standard patch-based architectures.
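The trade-off the letter describes, long-range pixel dependencies at a limited parameter count, follows from how the receptive field of stacked dilated convolutions grows. A quick calculation (the exponentially increasing dilation schedule 1, 2, 4, 8 is an illustrative assumption, not necessarily the letter's exact configuration):

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of dilated convolutions (stride 1):
    each layer adds (kernel_size - 1) * dilation pixels."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

# Four 3x3 layers with dilations 1, 2, 4, 8 vs. four undilated 3x3 layers:
# same number of weights per layer, much wider spatial context.
print(receptive_field(3, [1, 2, 4, 8]))  # 31
print(receptive_field(3, [1, 1, 1, 1]))  # 9
```

The same four layers of 3x3 weights thus cover a 31-pixel context instead of 9, which is how the network captures texture and spatial patterns over large neighborhoods without growing its parameter count.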