Learning to semantically segment high-resolution remote sensing images
Land cover classification is a task that requires methods capable of learning high-level features while dealing with high volumes of data. Overcoming these challenges, Convolutional Networks (ConvNets) can learn specific, adaptable features depending on the data while, at the same time, learning classifiers. In this work, we propose a novel technique to automatically perform pixel-wise land cover classification. To the best of our knowledge, no other work in the literature performs pixel-wise semantic segmentation based on data-driven feature descriptors for high-resolution remote sensing images. The main idea is to exploit the power of ConvNet feature representations to learn how to semantically segment remote sensing images. First, our method learns each label in a pixel-wise manner by taking into account the spatial context of each pixel. In the prediction phase, the probability of a pixel belonging to a class is likewise estimated according to its spatial context and the learned patterns. We conducted a systematic evaluation of the proposed algorithm using two remote sensing datasets with very distinct properties. Our results show that the proposed algorithm provides improvements over traditional and state-of-the-art methods ranging from 5% to 15% in terms of accuracy.
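The patch-based idea behind this kind of pixel-wise classification can be sketched as follows. This is a minimal illustration, not the paper's method: the window size and the patch-extraction details are assumptions for the example, and the classifier that would consume the patches is omitted.

```python
import numpy as np

def extract_context_patches(image, window=5):
    """For each pixel, extract a (window x window) spatial-context patch.

    Sketch of the core idea: each pixel's label is predicted from its
    surrounding spatial context. `window` is an assumed hyperparameter,
    not taken from the paper. Reflect-padding handles border pixels.
    """
    pad = window // 2
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    h, w, c = image.shape
    patches = np.empty((h * w, window, window, c))
    for i in range(h):
        for j in range(w):
            patches[i * w + j] = padded[i:i + window, j:j + window, :]
    return patches

# Toy 8x8 RGB image: one context patch per pixel, each centered on that pixel.
img = np.random.rand(8, 8, 3)
patches = extract_context_patches(img, window=5)
print(patches.shape)  # (64, 5, 5, 3)
```

In the full pipeline, each such patch would be fed to a ConvNet that outputs class probabilities for the center pixel.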
T-UNet: Triplet UNet for Change Detection in High-Resolution Remote Sensing Images
Remote sensing image change detection aims to identify the differences between images acquired at different times over the same area. It is widely used in land management, environmental monitoring, disaster assessment, and other fields. Currently, most change detection methods are based on a Siamese network structure or an early fusion structure. The Siamese structure focuses on extracting object features at different times but pays little attention to change information, which leads to false alarms and missed detections. The early fusion (EF) structure focuses on extracting features after fusing images from different phases but ignores the significance of object features at each time for detecting change details, making it difficult to accurately discern the edges of changed objects. To address these issues and obtain more accurate results, we propose a novel network, Triplet UNet (T-UNet), based on a three-branch encoder that simultaneously extracts the object features and the change features between the pre- and post-time-phase images. To effectively interact and fuse the features extracted from the three branches of the triplet encoder, we propose a multi-branch spatial-spectral cross-attention module (MBSSCA). In the decoder stage, we introduce a channel attention mechanism (CAM) and a spatial attention mechanism (SAM) to fully mine and integrate detailed texture information at the shallow layers and semantic localization information at the deep layers.
Comment: 21 pages, 11 figures, 6 tables
- …