10,593 research outputs found
Dynamic Convolution Self-Attention Network for Land-Cover Classification in VHR Remote-Sensing Images
Current deep convolutional neural networks for very-high-resolution (VHR) remote-sensing image land-cover classification often suffer from two problems. First, the feature maps extracted by network encoders based on vanilla convolution usually contain a lot of redundant information, which easily causes misclassification of land cover. Moreover, these encoders usually require a large number of parameters and high computational cost. Second, as remote-sensing images are complex and contain many objects with large variations in scale, it is difficult to use the popular feature fusion modules to improve the representation ability of networks. To address these issues, we propose a dynamic convolution self-attention network (DCSA-Net) for VHR remote-sensing image land-cover classification. The proposed network has two advantages. On the one hand, we designed a lightweight dynamic convolution module (LDCM) using dynamic convolution and a self-attention mechanism. This module can extract more useful image features than vanilla convolution, avoiding the negative effect of useless feature maps on land-cover classification. On the other hand, we designed a context information aggregation module (CIAM) with a ladder structure to enlarge the receptive field. This module can aggregate multi-scale contextual information from feature maps with different resolutions using a dense connection. Experimental results show that the proposed DCSA-Net is superior to state-of-the-art networks, with higher land-cover classification accuracy, fewer parameters, and lower computational cost. The source code is made publicly available.
Funding: National Natural Science Foundation of China (Program No. 61871259, 62271296, 61861024), in part by Natural Science Basic Research Program of Shaanxi (Program No. 2021JC-47), in part by Key Research and Development Program of Shaanxi (Program No. 2022GY-436, 2021ZDLGY08-07), in part by Natural Science Basic Research Program of Shaanxi (Program No. 2022JQ-634, 2022JQ-018), and in part by Shaanxi Joint Laboratory of Artificial Intelligence (No. 2020SS-03)
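The abstract does not spell out how the LDCM works, so the following is only a minimal sketch of the general idea behind dynamic convolution: several candidate kernels are mixed with attention weights computed from the input itself. The names `dynamic_kernel` and `gate_weights`, and the use of mean and standard deviation as the global descriptor, are assumptions made for illustration and are not taken from the DCSA-Net paper or its code.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def dynamic_kernel(image, kernels, gate_weights):
    """Mix K candidate kernels with input-dependent attention weights.

    image:        (H, W) array
    kernels:      (K, kh, kw) array of candidate kernels
    gate_weights: (K, 2) array, a hypothetical linear gate
    """
    # Global descriptor of the input (an assumption: mean and std only).
    descriptor = np.array([image.mean(), image.std()])
    attention = softmax(gate_weights @ descriptor)     # shape (K,)
    mixed = np.tensordot(attention, kernels, axes=1)   # shape (kh, kw)
    return mixed, attention

# Toy setup: four candidate 3x3 kernels and a fixed gate.
kernels = np.arange(4 * 3 * 3, dtype=float).reshape(4, 3, 3)
gate_weights = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])

# Two inputs with different statistics get different kernel mixtures.
mixed_a, att_a = dynamic_kernel(np.zeros((8, 8)), kernels, gate_weights)
mixed_b, att_b = dynamic_kernel(np.ones((8, 8)), kernels, gate_weights)
```

The mixed kernel would then be applied with an ordinary convolution; the point of the sketch is only that the effective kernel changes with the input while the candidate kernels stay fixed.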
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as they relate to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing
Land cover mapping at very high resolution with rotation equivariant CNNs: towards small yet accurate models
In remote sensing images, the absolute orientation of objects is arbitrary.
Depending on an object's orientation and on a sensor's flight path, objects of
the same semantic class can be observed in different orientations in the same
image. Equivariance to rotation, in this context understood as responding with
a rotated semantic label map when subject to a rotation of the input image, is
therefore a very desirable feature, in particular for high capacity models,
such as Convolutional Neural Networks (CNNs). If rotation equivariance is
encoded in the network, the model is confronted with a simpler task and does
not need to learn specific (and redundant) weights to address rotated versions
of the same object class. In this work we propose a CNN architecture called
Rotation Equivariant Vector Field Network (RotEqNet) to encode rotation
equivariance in the network itself. By using rotating convolutions as building
blocks and passing only the values corresponding to the maximally
activating orientation throughout the network in the form of orientation
encoding vector fields, RotEqNet treats rotated versions of the same object
with the same filter bank and therefore achieves state-of-the-art performances
even when using very small architectures trained from scratch. We test RotEqNet
in two challenging sub-decimeter resolution semantic labeling problems, and
show that we can perform better than a standard CNN while requiring one order
of magnitude fewer parameters.
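The equivariance property described in this abstract can be checked numerically on a toy case. The sketch below is a heavy simplification and not RotEqNet itself: it uses only the four 90-degree rotations of one filter and keeps only the maximum response, whereas the abstract describes orientation-encoding vector fields. The helper names `correlate2d_valid` and `max_over_rotations` are hypothetical.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation, written with NumPy only."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * image[i:i + out_h, j:j + out_w]
    return out

def max_over_rotations(image, kernel):
    """Apply the same filter at 0, 90, 180 and 270 degrees and keep,
    per pixel, the response of the maximally activating orientation."""
    responses = [correlate2d_valid(image, np.rot90(kernel, k)) for k in range(4)]
    return np.max(responses, axis=0)

rng = np.random.default_rng(0)
image = rng.normal(size=(16, 16))
kernel = rng.normal(size=(3, 3))

# Equivariance test: f(rot(x)) should equal rot(f(x)).
plain_equivariant = np.allclose(
    correlate2d_valid(np.rot90(image), kernel),
    np.rot90(correlate2d_valid(image, kernel)),
)
pooled_equivariant = np.allclose(
    max_over_rotations(np.rot90(image), kernel),
    np.rot90(max_over_rotations(image, kernel)),
)
```

A single generic filter fails the test, while the maximum over its rotated copies passes it for 90-degree rotations, which illustrates why one filter bank can serve all orientations of an object.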
Learning Spectral-Spatial-Temporal Features via a Recurrent Convolutional Neural Network for Change Detection in Multispectral Imagery
Change detection is one of the central problems in earth observation and has been
extensively investigated over recent decades. In this paper, we propose a novel
recurrent convolutional neural network (ReCNN) architecture, which is trained
to learn a joint spectral-spatial-temporal feature representation in a unified
framework for change detection in multispectral images. To this end, we bring
together a convolutional neural network (CNN) and a recurrent neural network
(RNN) into one end-to-end network. The former is able to generate rich
spectral-spatial feature representations, while the latter effectively analyzes
temporal dependency in bi-temporal images. In comparison with previous
approaches to change detection, the proposed network architecture possesses
three distinctive properties: 1) It is end-to-end trainable, in contrast to
most existing methods whose components are separately trained or computed; 2)
it naturally harnesses spatial information that has been proven to be
beneficial to the change detection task; 3) it is capable of adaptively learning
the temporal dependency between multitemporal images, unlike most algorithms,
which use fairly simple operations such as image differencing or stacking. As far as
we know, this is the first time that a recurrent convolutional network
architecture has been proposed for multitemporal remote sensing image analysis.
The proposed network is validated on real multispectral data sets. Both visual
and quantitative analyses of the experimental results demonstrate competitive
performance of the proposed model.
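For context, the "fairly simple" baselines that the abstract contrasts with ReCNN, image differencing and stacking, can be sketched in a few lines. This illustrates those baselines only, not the proposed network; the function names, the Euclidean change magnitude, and the threshold value are assumptions chosen for the example.

```python
import numpy as np

def difference_change_map(img_t1, img_t2, threshold):
    """Image differencing: flag pixels whose spectral change vector
    has a Euclidean magnitude above the threshold."""
    diff = img_t2.astype(float) - img_t1.astype(float)
    magnitude = np.linalg.norm(diff, axis=-1)   # shape (H, W)
    return magnitude > threshold

def stack_bitemporal(img_t1, img_t2):
    """Stacking: concatenate the two dates along the band axis so that
    a single classifier sees both acquisitions at once."""
    return np.concatenate([img_t1, img_t2], axis=-1)

# Toy bi-temporal pair: 4x4 pixels, 3 bands, one pixel changes.
img_t1 = np.zeros((4, 4, 3))
img_t2 = img_t1.copy()
img_t2[1, 2] = [3.0, 4.0, 0.0]   # change magnitude is 5

change_map = difference_change_map(img_t1, img_t2, threshold=1.0)
stacked = stack_bitemporal(img_t1, img_t2)
```

Both baselines treat the temporal relation with a fixed rule, which is the limitation the abstract says a learned temporal dependency is meant to address.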