124 research outputs found

    Multitemporal Relearning with Convolutional LSTM Models for Land Use Classification

    In this article, we present a novel hybrid framework that integrates spatial–temporal semantic segmentation with postclassification relearning for multitemporal land use and land cover (LULC) classification based on very high resolution (VHR) satellite imagery. To efficiently obtain optimal multitemporal LULC classification maps, the hybrid framework uses a spatial–temporal semantic segmentation model to exploit temporal dependency and extract high-level spatial–temporal features. In addition, the principle of postclassification relearning is adopted to efficiently refine the model output: the initial outcome of a semantic segmentation model is provided to a subsequent model through an extended input space, guiding the learning of discriminative feature representations in an end-to-end fashion. Finally, object-based voting is coupled with postclassification relearning to cope with high intraclass and low interclass variance. The framework was tested with two postclassification relearning strategies (pixel-based and object-based relearning) and three convolutional neural network models: UNet, a simple convolutional LSTM, and a UNet convolutional LSTM. The experiments were conducted on two datasets whose LULC labels contain rich semantic information and varied building morphologies (e.g., informal settlements); each dataset comprises four time steps of WorldView-2 and QuickBird imagery. The experimental results clearly show that the proposed framework is effective for classifying complex LULC maps from multitemporal VHR images.
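    The central mechanism of the relearning step described above is feeding the initial classification back into a second model through an extended input space. The sketch below illustrates only that idea: it uses PyTorch with a toy fully convolutional network standing in for the paper's UNet/ConvLSTM backbones, and the channel counts, class counts, and patch sizes are assumed values for illustration, not those used in the paper.

```python
# Minimal sketch of postclassification relearning via an extended input space.
# Assumptions: PyTorch; a toy ConvNet replaces the UNet/ConvLSTM backbones;
# 4 spectral bands, 6 LULC classes, and 64x64 patches are placeholder values.
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    """Toy fully convolutional segmentation model (placeholder backbone)."""
    def __init__(self, in_channels, n_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, n_classes, 1),
        )

    def forward(self, x):
        return self.net(x)  # (B, n_classes, H, W) logits

n_bands, n_classes = 4, 6
stage1 = TinySegNet(n_bands, n_classes)
# Stage 2 sees the original bands plus the stage-1 class probabilities.
stage2 = TinySegNet(n_bands + n_classes, n_classes)

x = torch.randn(2, n_bands, 64, 64)      # dummy image patch batch
probs1 = stage1(x).softmax(dim=1)        # initial classification
x_ext = torch.cat([x, probs1], dim=1)    # extended input space
logits2 = stage2(x_ext)                  # relearned, refined prediction
print(logits2.shape)                     # torch.Size([2, 6, 64, 64])
```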

    Deep learning for feature extraction in remote sensing: A case-study of aerial scene classification

    Scene classification based on images is essential in many remote sensing systems and applications. Scientific interest in scene classification from remotely sensed images is increasing, and many datasets and algorithms are being developed. The introduction of convolutional neural networks (CNNs) and other deep learning techniques has contributed to large improvements in the accuracy of image scene classification in such systems. To classify scenes from aerial images, we used a two-stream deep architecture. The first part of the classification, feature extraction, uses pre-trained CNNs that extract deep features of aerial images from different network layers: the average pooling layer or some of the preceding convolutional layers. Next, after dimensionality reduction of the resulting large feature vectors, we concatenated the features extracted from the different networks. We experimented extensively with different CNN architectures to obtain optimal results. Finally, we used a Support Vector Machine (SVM) to classify the concatenated features. The competitiveness of the examined technique was evaluated on two real-world datasets, UC Merced and WHU-RS. The obtained classification accuracies demonstrate that the considered method is competitive with other cutting-edge techniques.
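    As a rough illustration of the two-stream pipeline described above (deep features from pre-trained CNNs, per-stream dimensionality reduction, concatenation, then an SVM), the following sketch uses torchvision backbones and scikit-learn. The choice of resnet18 and vgg16, the PCA component count, and the dummy data are assumptions made for illustration; the original study's backbones and layer choices should be taken from the paper itself.

```python
# Hedged sketch of a two-stream deep-feature + SVM scene classifier.
# Assumptions: torchvision resnet18/vgg16 as the two streams; weights=None
# (torchvision >= 0.13) keeps the demo offline; PCA(4) and dummy data are
# placeholders, not the paper's settings.
import numpy as np
import torch
import torch.nn as nn
from torchvision import models
from sklearn.decomposition import PCA
from sklearn.svm import SVC

def feature_extractor(backbone):
    """Drop the final classifier; keep the flattened pooled features."""
    return nn.Sequential(*list(backbone.children())[:-1], nn.Flatten()).eval()

streams = [feature_extractor(models.resnet18(weights=None)),
           feature_extractor(models.vgg16(weights=None))]

images = torch.randn(8, 3, 224, 224)   # dummy aerial image batch
labels = np.arange(8) % 5              # dummy scene labels

with torch.no_grad():
    per_stream = [s(images).numpy() for s in streams]

# Dimensionality reduction per stream, then concatenation of the reduced features.
reduced = [PCA(n_components=4).fit_transform(f) for f in per_stream]
X = np.concatenate(reduced, axis=1)

clf = SVC(kernel="linear").fit(X, labels)   # SVM on the concatenated features
print("training accuracy:", clf.score(X, labels))
```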

    Multi-evidence and multi-modal fusion network for ground-based cloud recognition

    In recent years, deep neural networks have drawn much attention in ground-based cloud recognition. However, such approaches focus on learning global features from visual information, which leads to incomplete representations of ground-based clouds. In this paper, we propose a novel method named multi-evidence and multi-modal fusion network (MMFN) for ground-based cloud recognition, which learns extended cloud information by fusing heterogeneous features in a unified framework. Specifically, MMFN exploits multiple pieces of evidence, i.e., global and local visual features, from ground-based cloud images using a main network and an attentive network. In the attentive network, local visual features are extracted from attentive maps, which are obtained by refining salient patterns from convolutional activation maps. Meanwhile, the multi-modal network in MMFN learns multi-modal features for ground-based clouds. To fully fuse the multi-modal and multi-evidence visual features, we design two fusion layers in MMFN that incorporate multi-modal features with global and local visual features, respectively. Furthermore, we release the first multi-modal ground-based cloud dataset, named MGCD, which contains not only ground-based cloud images but also the multi-modal information corresponding to each cloud image. MMFN is evaluated on MGCD and achieves a classification accuracy of 88.63%, which is competitive with state-of-the-art methods and validates its effectiveness for ground-based cloud recognition.
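    To make the fusion idea above concrete, here is a minimal PyTorch sketch that combines global visual features, local features pooled from attentive maps, and non-image multi-modal measurements through two fusion layers followed by a classifier. The feature dimensions, the concatenation-plus-linear fusion operator, and the example modal inputs are assumptions for illustration, not the exact MMFN design.

```python
# Minimal sketch of multi-evidence / multi-modal fusion with two fusion layers.
# Assumptions: feature sizes, the fusion operator (concat + Linear + ReLU), and
# the example modal inputs are placeholders, not the published MMFN layout.
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    def __init__(self, d_global=128, d_local=128, d_modal=16, n_classes=7):
        super().__init__()
        self.fuse_global = nn.Linear(d_global + d_modal, 64)  # fusion layer 1
        self.fuse_local = nn.Linear(d_local + d_modal, 64)    # fusion layer 2
        self.classifier = nn.Linear(128, n_classes)

    def forward(self, f_global, f_local, f_modal):
        g = torch.relu(self.fuse_global(torch.cat([f_global, f_modal], dim=1)))
        l = torch.relu(self.fuse_local(torch.cat([f_local, f_modal], dim=1)))
        return self.classifier(torch.cat([g, l], dim=1))

net = FusionNet()
f_global = torch.randn(4, 128)  # e.g. from the main network's backbone
f_local = torch.randn(4, 128)   # e.g. pooled from the attentive maps
f_modal = torch.randn(4, 16)    # e.g. encoded weather-style measurements
print(net(f_global, f_local, f_modal).shape)  # torch.Size([4, 7])
```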