
    Multitemporal Relearning with Convolutional LSTM Models for Land Use Classification

    In this article, we present a novel hybrid framework, which integrates spatial–temporal semantic segmentation with postclassification relearning, for multitemporal land use and land cover (LULC) classification based on very high resolution (VHR) satellite imagery. To efficiently obtain optimal multitemporal LULC classification maps, the hybrid framework utilizes a spatial–temporal semantic segmentation model to harness temporal dependency for extracting high-level spatial–temporal features. In addition, the principle of postclassification relearning is adopted to efficiently optimize model output. Thereby, the initial outcome of a semantic segmentation model is provided to a subsequent model via an extended input space to guide the learning of discriminative feature representations in an end-to-end fashion. Last, object-based voting is coupled with postclassification relearning to cope with high intraclass and low interclass variances. The framework was tested with two different postclassification relearning strategies (i.e., pixel-based relearning and object-based relearning) and three convolutional neural network models, i.e., UNet, a simple Convolutional LSTM, and a UNet Convolutional-LSTM. The experiments were conducted on two datasets with LULC labels that contain rich semantic information and varied building morphological features (e.g., informal settlements). Each dataset contains four time steps from WorldView-2 and QuickBird imagery. The experimental results clearly underline that the proposed framework is efficient at classifying complex LULC maps from multitemporal VHR images.
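
    A minimal sketch of the relearning step described above, in PyTorch: the softmax output of an initial segmentation model is concatenated to the image bands (the extended input space) before a second model refines the prediction. The tiny conv blocks, band count, and class count are illustrative placeholders, not the authors' UNet/ConvLSTM architectures.

```python
# Sketch of postclassification relearning via an extended input space.
# Channel counts and the small conv blocks are assumptions, not the
# paper's UNet / ConvLSTM models.
import torch
import torch.nn as nn

N_BANDS, N_CLASSES = 8, 6  # e.g., WorldView-2 bands and LULC classes (assumed)

def seg_model(in_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, N_CLASSES, 1),
    )

initial = seg_model(N_BANDS)                 # stage 1: plain segmentation
relearn = seg_model(N_BANDS + N_CLASSES)     # stage 2: sees bands + initial probabilities

x = torch.randn(2, N_BANDS, 128, 128)        # a batch of image patches
with torch.no_grad():
    probs = initial(x).softmax(dim=1)        # initial class probability maps
refined = relearn(torch.cat([x, probs], dim=1))  # relearned prediction
```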

    Application of Convolutional Neural Network in the Segmentation and Classification of High-Resolution Remote Sensing Images

    Numerous convolutional neural networks increase the classification accuracy for remote sensing scene images at the expense of the model's space and time complexity. This makes the models run slowly and prevents a trade-off between model accuracy and running time. Moreover, the loss of deep features as the network gets deeper makes it difficult to retrieve key aspects with a simple double-branch structure, which is detrimental to classifying remote sensing scene photos.

    Semi-supervised learning with constrained virtual support vector machines for classification of remote sensing image data

    We introduce two semi-supervised models for the classification of remote sensing image data. The models are built upon the framework of Virtual Support Vector Machines (VSVM). Generally, VSVM follow a two-step learning procedure: A Support Vector Machine (SVM) model is learned to determine and extract labeled samples that constitute the decision boundary with the maximum margin between thematic classes, i.e., the Support Vectors (SVs). The SVs govern the creation of so-called virtual samples. This is done by modifying, i.e., perturbing, the image features to which a decision boundary needs to be invariant. Subsequently, the classification model is learned for a second time by using the newly created virtual samples in addition to the SVs to eventually find a new optimal decision boundary. Here, we extend this concept by (i) integrating a constrained set of semi-labeled samples when establishing the final model. Thereby, the model constraint, i.e., the selection mechanism for including solely informative semi-labeled samples, is built upon a self-learning procedure composed of two active learning heuristics. Additionally, (ii) we consecutively deploy semi-labeled samples for the creation of semi-labeled virtual samples by modifying the image features of semi-labeled samples that have become semi-labeled SVs after an initial model run. We present experimental results from classifying two multispectral data sets with a sub-meter geometric resolution. The proposed semi-supervised VSVM models exhibit the most favorable performance compared to related SVM and VSVM-based approaches, as well as (semi-)supervised CNNs, in situations with a very limited amount of available prior knowledge, i.e., labeled samples.
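
    A minimal sketch of the basic two-step VSVM procedure, using scikit-learn and synthetic data: train an SVM, keep the support vectors, create virtual samples by perturbing them, and retrain. Gaussian feature noise stands in for the invariance-targeted perturbations, and the semi-labeled extensions of the paper are omitted.

```python
# Two-step VSVM sketch; the Gaussian perturbation is an assumed stand-in
# for the paper's invariance transformations.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                    # toy image features
y = (X[:, 0] + X[:, 1] > 0).astype(int)          # toy thematic labels

svm = SVC(kernel="rbf").fit(X, y)                # step 1: initial SVM
sv_X = svm.support_vectors_                      # samples that define the boundary
sv_y = y[svm.support_]

virtual_X = sv_X + rng.normal(scale=0.05, size=sv_X.shape)  # virtual samples
vsvm = SVC(kernel="rbf").fit(                    # step 2: retrain on SVs + virtual samples
    np.vstack([sv_X, virtual_X]),
    np.concatenate([sv_y, sv_y]),
)
```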

    X-ModalNet: A Semi-Supervised Deep Cross-Modal Network for Classification of Remote Sensing Data

    This paper addresses the problem of semi-supervised transfer learning with limited cross-modality data in remote sensing. A large amount of multi-modal earth observation images, such as multispectral imagery (MSI) or synthetic aperture radar (SAR) data, are openly available on a global scale, enabling parsing global urban scenes through remote sensing imagery. However, their ability to identify materials (pixel-wise classification) remains limited, due to noisy collection environments, poor discriminative information, and a limited number of well-annotated training images. To this end, we propose a novel cross-modal deep-learning framework, called X-ModalNet, with three well-designed modules: a self-adversarial module, an interactive learning module, and a label propagation module, learning to transfer more discriminative information from a small-scale hyperspectral image (HSI) into the classification task using large-scale MSI or SAR data. Significantly, X-ModalNet generalizes well, owing to propagating labels on an updatable graph constructed from high-level features on top of the network, yielding semi-supervised cross-modality learning. We evaluate X-ModalNet on two multi-modal remote sensing datasets (HSI-MSI and HSI-SAR) and achieve a significant improvement in comparison with several state-of-the-art methods.
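
    The label propagation module can be illustrated in isolation with scikit-learn's LabelSpreading over a k-NN graph of high-level features; X-ModalNet builds and updates such a graph inside the network, so this stand-alone version only conveys the idea. All data below are synthetic.

```python
# Graph-based label propagation over synthetic "high-level features";
# an approximation of the module's role, not the paper's implementation.
import numpy as np
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(0)
feats = rng.normal(size=(300, 16))           # high-level network features (toy)
labels = np.full(300, -1)                    # -1 marks unlabeled samples
labels[:30] = rng.integers(0, 4, size=30)    # a few labeled samples

prop = LabelSpreading(kernel="knn", n_neighbors=10).fit(feats, labels)
pseudo = prop.transduction_                  # labels propagated over the k-NN graph
```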

    Coarse-to-fine classification of road infrastructure elements from mobile point clouds using symmetric ensemble point network and Euclidean cluster extraction

    Classifying point clouds obtained from mobile laser scanning of road environments is a fundamental yet challenging problem for road asset management and unmanned vehicle navigation. Deep learning networks need no prior knowledge to classify multiple objects, but often generate a certain amount of false predictions. Traditional clustering methods, by contrast, leverage a priori knowledge but may lack the generalisability of deep learning networks. This paper presents a classification method that coarsely classifies multiple objects of road infrastructure with a symmetric ensemble point (SEP) network and then refines the results with a Euclidean cluster extraction (ECE) algorithm. The SEP network applies a symmetric function to capture relevant structural features at different scales and selects optimal sub-samples using an ensemble method. The ECE subsequently adjusts points that were predicted incorrectly in the first step. The experimental results indicate that this method effectively extracts six types of road infrastructure elements: road surfaces, buildings, walls, traffic signs, trees, and streetlights. The overall accuracy of the SEP-ECE method improves by 3.97% with respect to PointNet. The achieved average classification accuracy is approximately 99.74%, which is suitable for practical use in transportation network management.
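
    Two toy fragments corresponding to the pipeline above: a PointNet-style symmetric function (order-invariant max pooling over per-point features) and a Euclidean grouping pass, approximated here with DBSCAN (min_samples=1 makes it behave like fixed-radius Euclidean cluster extraction). Neither fragment reproduces the SEP ensemble or the exact ECE refinement rule; parameters and data are illustrative.

```python
# (i) Order-invariant aggregation of per-point features; (ii) fixed-radius
# Euclidean grouping approximated with DBSCAN. Random data, assumed parameters.
import numpy as np
import torch
from sklearn.cluster import DBSCAN

point_feats = torch.randn(1, 1024, 64)               # (batch, points, channels)
global_feat = point_feats.max(dim=1).values          # symmetric function: max pooling

xyz = np.random.default_rng(0).normal(size=(500, 3)) # raw point coordinates (toy)
cluster_ids = DBSCAN(eps=0.5, min_samples=1).fit_predict(xyz)
# Refinement idea (ECE-style): reassign points whose predicted class
# disagrees with the majority class of their Euclidean cluster.
```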

    Deep multitask learning with label interdependency distillation for multicriteria street-level image classification

    Multitask learning (MTL) aims at beneficial joint solving of multiple prediction problems by sharing information across different tasks. However, without adequate consideration of interdependencies, MTL models are prone to miss valuable information. In this paper, we introduce a novel deep MTL architecture that specifically encodes cross-task interdependencies within the setting of multiple image classification problems. Based on task-wise interim class label probability predictions by an intermediately supervised hard parameter sharing convolutional neural network, interdependencies are inferred in two ways: i) by directly stacking label probability sequences to the image feature vector (i.e., multitask stacking), and ii) by passing probability sequences to gated recurrent unit-based recurrent neural networks to explicitly learn cross-task interdependency representations and stacking those to the image feature vector (i.e., interdependency representation learning). The proposed MTL architecture is applied as a tool for generic multi-criteria building characterization using street-level imagery related to risk assessments toward multiple natural hazards. Experimental results for classifying buildings according to five vulnerability-related target variables (i.e., five learning tasks), namely height, lateral load-resisting system material, seismic building structural type, roof shape, and block position are obtained for the Chilean capital Santiago de Chile. Our MTL methods with cross-task label interdependency modeling consistently outperform single task learning (STL) and classical hard parameter sharing MTL alike. Even when starting already from high classification accuracy levels, estimated generalization capabilities can be further improved by considerable margins of accumulated task-specific residuals beyond +6% κ. Thereby, the combination of multitask stacking and interdependency representation learning attains the highest accuracy estimates for the addressed task and data setting (up to cross-task accuracy mean values of 88.43% overall accuracy and 84.49% κ). From an efficiency perspective, the proposed MTL methods turn out to be substantially favorable compared to STL in terms of training time consumption.
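
    A rough sketch of the two interdependency mechanisms: (i) multitask stacking, where interim per-task label probabilities are appended to the image feature vector, and (ii) a GRU over the sequence of task probability vectors that learns a cross-task interdependency representation. The linear backbone and all dimensions are placeholders for the paper's CNN, assumed here for brevity.

```python
# Multitask stacking + GRU-based interdependency representation; a linear
# layer stands in for the shared CNN, and dimensions are assumptions.
import torch
import torch.nn as nn

FEAT, TASKS, CLASSES = 128, 5, 4   # feature size, tasks, classes per task (assumed equal)

backbone = nn.Linear(512, FEAT)    # placeholder for the hard-parameter-sharing CNN
interim = nn.ModuleList(nn.Linear(FEAT, CLASSES) for _ in range(TASKS))
gru = nn.GRU(input_size=CLASSES, hidden_size=32, batch_first=True)
final = nn.ModuleList(nn.Linear(FEAT + TASKS * CLASSES + 32, CLASSES) for _ in range(TASKS))

x = backbone(torch.randn(8, 512))
probs = torch.stack([head(x).softmax(-1) for head in interim], dim=1)  # (B, TASKS, CLASSES)
stacked = probs.flatten(1)                     # (i) multitask stacking
_, h_n = gru(probs)                            # (ii) interdependency representation
joint = torch.cat([x, stacked, h_n.squeeze(0)], dim=1)
logits = [head(joint) for head in final]       # refined per-task predictions
```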

    An incremental learning framework to enhance teaching by demonstration based on multimodal sensor fusion

    Though a robot can reproduce the demonstration trajectory from a human demonstrator by teleoperation, there is a certain error between the reproduced trajectory and the desired trajectory. To minimize this error, we propose a multimodal incremental learning framework based on a teleoperation strategy that enables the robot to reproduce the demonstration task accurately. The multimodal demonstration data are collected from two different kinds of sensors in the demonstration phase. Then, the Kalman filter (KF) and dynamic time warping (DTW) algorithms are used to preprocess the data from the multiple sensor signals. The KF algorithm is mainly used to fuse sensor data of different modalities, and the DTW algorithm is used to align the data on the same timeline. The preprocessed demonstration data are then learned by the incremental learning network and sent to a Baxter robot for reproducing the task demonstrated by the human. Comparative experiments have been performed to verify the effectiveness of the proposed framework.
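
    A toy version of the preprocessing stage: a scalar Kalman filter fusing two noisy streams of the same quantity, followed by a classic dynamic-programming DTW cost for aligning two demonstrations. The noise parameters and the sequential two-sensor update are illustrative assumptions, not the paper's tuning.

```python
# Scalar Kalman-filter fusion of two sensor streams plus a basic DTW cost.
import numpy as np

def kalman_fuse(z1, z2, q=1e-3, r=0.1):
    """Fuse two noisy measurements of the same quantity with a 1-D KF."""
    x, p, fused = 0.0, 1.0, []
    for a, b in zip(z1, z2):
        p += q                          # predict step (random-walk model)
        for z in (a, b):                # sequential update with both sensors
            k = p / (p + r)             # Kalman gain
            x, p = x + k * (z - x), (1 - k) * p
        fused.append(x)
    return np.array(fused)

def dtw_cost(s, t):
    """Dynamic-programming DTW cost for aligning two 1-D sequences."""
    D = np.full((len(s) + 1, len(t) + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            D[i, j] = abs(s[i - 1] - t[j - 1]) + min(
                D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[len(s), len(t)]
```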

    Multi-target regressor chains with repetitive permutation scheme for characterization of built environments with remote sensing

    Multi-task learning techniques allow the beneficial joint estimation of multiple target variables. Here, we propose a novel multi-task regression (MTR) method called ensemble of regressor chains with repetitive permutation scheme. It belongs to the family of problem transformation based MTR methods, which foresee the creation of an individual model per target variable. Subsequently, the combination of the separate models allows obtaining an overall prediction. Our method builds upon the concept of so-called ensembles of regressor chains, which align single-target models along a flexible permutation, i.e., chain. However, in order to particularly address situations with a small number of target variables, we equip ensembles of regressor chains with a repetitive permutation scheme. Thereby, estimates of the target variables are cascaded to subsequent models as additional features when learning along a chain, whereby one target variable can occupy multiple elements of the chain. We provide an experimental evaluation of the method by jointly estimating built-up height and built-up density based on features derived from Sentinel-2 data for the four largest cities in Germany in a comparative setup. We also consider single-target stacking, multi-target stacking, and ensembles of regressor chains without repetitive permutation. Empirical results underline the beneficial performance properties of MTR methods. Our ensemble of regressor chains with repetitive permutation scheme most frequently achieved the highest accuracies compared to the other MTR methods, with mean improvements of 14.5% across the experiments compared to initial single-target models.
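
    A minimal sketch of one regressor chain with a repetitive permutation over two targets: each model's estimate is appended to the feature matrix, and a target may occupy several chain positions. The chain order and random-forest base learner are assumptions; the paper averages an ensemble of such chains.

```python
# One regressor chain with repetitive permutation over two targets;
# cascaded estimates are appended as extra features. Synthetic data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))                       # e.g., Sentinel-2 features (toy)
Y = np.column_stack([X[:, 0] + 0.1 * rng.normal(size=300),   # built-up height (toy)
                     X[:, 1] + 0.1 * rng.normal(size=300)])  # built-up density (toy)

chain = [0, 1, 0, 1]            # each target occupies two positions in the chain
X_aug = X.copy()
for target in chain:
    model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_aug, Y[:, target])
    estimate = model.predict(X_aug)                  # cascaded estimate of this target
    X_aug = np.column_stack([X_aug, estimate])       # becomes a feature for later links
```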

    A Review of Landcover Classification with Very-High Resolution Remotely Sensed Optical Images—Analysis Unit, Model Scalability and Transferability

    As an important application in remote sensing, landcover classification remains one of the most challenging tasks in very-high-resolution (VHR) image analysis. As a rapidly increasing number of Deep Learning (DL) based landcover methods and training strategies are claimed to be state-of-the-art, the already fragmented technical landscape of landcover mapping methods has been further complicated. Although there exists a plethora of literature review work attempting to guide researchers in making an informed choice of landcover mapping methods, these articles either focus on reviewing applications in a specific area or revolve around general deep learning models, lacking a systematic view of the ever-advancing landcover mapping methods. In addition, issues related to training samples and model transferability have become more critical than ever in an era dominated by data-driven approaches, but these issues were addressed to a lesser extent in previous review articles on remote sensing classification. Therefore, in this paper, we present a systematic overview of existing methods, starting from learning methods and the varying basic analysis units for landcover mapping tasks, and moving to challenges and solutions on three aspects of scalability and transferability with a remote sensing classification focus: (1) sparsity and imbalance of data; (2) domain gaps across different geographical regions; and (3) multi-source and multi-view fusion. We discuss each of these categorical methods in detail, draw concluding remarks on these developments, and recommend potential directions for continued endeavors.