Changes to Captions: An Attentive Network for Remote Sensing Change Captioning
In recent years, advanced research has focused on the direct learning and
analysis of remote sensing images using natural language processing (NLP)
techniques. The ability to accurately describe changes occurring in
multi-temporal remote sensing images is becoming increasingly important for
geospatial understanding and land planning. Unlike natural image change
captioning tasks, remote sensing change captioning aims to capture the most
significant changes, irrespective of various influential factors such as
illumination, seasonal effects, and complex land covers. In this study, we
highlight the significance of accurately describing changes in remote sensing
images and present a comparison of the change captioning task for natural and
synthetic images and remote sensing images. To address the challenge of
generating accurate captions, we propose an attentive changes-to-captions
network, called Chg2Cap for short, for bi-temporal remote sensing images. The
network comprises three main components: 1) a Siamese CNN-based feature
extractor to collect high-level representations for each image pair; 2) an
attentive decoder that includes a hierarchical self-attention block to locate
change-related features and a residual block to generate the image embedding;
and 3) a transformer-based caption generator to decode the relationship between
the image embedding and the word embedding into a description. The proposed
Chg2Cap network is evaluated on two representative remote sensing datasets, and
a comprehensive experimental analysis is provided. The code and pre-trained
models will be available online at https://github.com/ShizhenChang/Chg2Cap
Fusion of Heterogeneous Earth Observation Data for the Classification of Local Climate Zones
This paper proposes a novel framework for fusing multi-temporal,
multispectral satellite images and OpenStreetMap (OSM) data for the
classification of local climate zones (LCZs). Feature stacking is the most
commonly used method of data fusion, but its main drawback is that it does not
consider the heterogeneity of multimodal optical images and OSM data. The
proposed framework processes two data sources separately and then combines them
at the model level through two fusion models (the landuse fusion model and
building fusion model), which aim to fuse optical images with landuse and
buildings layers of OSM data, respectively. In addition, a new approach to
detecting building incompleteness of OSM data is proposed. The proposed
framework was trained and tested using data from the 2017 IEEE GRSS Data Fusion
Contest, and further validated on one additional test set containing test
samples manually labeled in Munich and New York. Experimental results
indicate that, compared to the feature stacking-based baseline framework, the
proposed framework is effective in fusing optical images with OSM data for
the classification of LCZs with high generalization capability on a large
scale. The classification accuracy of the proposed framework exceeds that of
the baseline framework by more than 6% and 2% on the test set of the
2017 IEEE GRSS Data Fusion Contest and the additional test set, respectively.
In addition, the proposed framework is less sensitive to spectral diversities
of optical satellite images and thus achieves more stable classification
performance than state-of-the-art frameworks.
Comment: accepted by TGR
Feature Selection Based on Hybridization of Genetic Algorithm and Particle Swarm Optimization
A new feature selection approach that is based on the integration of a genetic algorithm and particle swarm optimization is proposed. The overall accuracy of a support vector machine classifier on validation samples is used as a fitness value. The new approach is carried out on the well-known Indian Pines hyperspectral data set. Results confirm that the new approach is able to automatically select the most informative features in terms of classification accuracy within an acceptable CPU processing time without requiring the number of desired features to be set a priori by users. Furthermore, the usefulness of the proposed method is also tested for road detection. Results confirm that the proposed method is capable of discriminating between road and background pixels and performs better than the other approaches used for comparison in terms of performance metrics.
Rannís; Rannsóknarnámssjóður / The Icelandic Research Fund for Graduate Students.
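The abstract above does not say how the genetic algorithm and particle swarm optimization are combined, so the following is only a minimal sketch under stated assumptions: half of the population is pulled toward the best-known mask (a PSO-like move), and the other half is refilled by one-point crossover and mutation (a GA-like move). The function names, the split, the probabilities, and `toy_fitness` are all hypothetical; in the paper the fitness is the overall accuracy of an SVM classifier on validation samples, which the toy function merely stands in for.

```python
import random


def toy_fitness(mask):
    """Stand-in for SVM validation accuracy (assumption): rewards features
    0, 3 and 5 and lightly penalises every other selected feature."""
    informative = {0, 3, 5}
    hits = sum(mask[i] for i in informative)
    extras = sum(mask) - hits
    return hits - 0.1 * extras


def hybrid_ga_pso_select(fitness, n_features, pop_size=20, iterations=30, seed=0):
    """Toy hybrid search over binary feature masks (1 = feature kept)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best = max(pop, key=fitness)[:]
    for _ in range(iterations):
        pop.sort(key=fitness, reverse=True)
        half = pop_size // 2
        elite = pop[:half]
        # PSO-like step: each elite mask copies bits from the global best
        # with probability 0.5 and occasionally flips one bit to explore.
        moved = []
        for mask in elite:
            new = [b if rng.random() < 0.5 else g for b, g in zip(mask, best)]
            if rng.random() < 0.3:
                i = rng.randrange(n_features)
                new[i] = 1 - new[i]
            moved.append(new)
        # GA-like step: refill the rest by one-point crossover plus mutation.
        children = []
        while len(children) < pop_size - half:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)
            child = a[:cut] + b[cut:]
            if rng.random() < 0.1:
                i = rng.randrange(n_features)
                child[i] = 1 - child[i]
            children.append(child)
        pop = moved + children
        candidate = max(pop, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate[:]
    return best
```

Calling `hybrid_ga_pso_select(toy_fitness, n_features=8)` returns a binary mask of length 8. The number of selected features is never fixed in advance, which mirrors the property the abstract emphasises.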
Backdoor Attacks for Remote Sensing Data with Wavelet Transform
Recent years have witnessed the great success of deep learning algorithms in
the geoscience and remote sensing realm. Nevertheless, the security and
robustness of deep learning models deserve special attention when addressing
safety-critical remote sensing tasks. In this paper, we provide a systematic
analysis of backdoor attacks for remote sensing data, where both scene
classification and semantic segmentation tasks are considered. While most of
the existing backdoor attack algorithms rely on visible triggers like squared
patches with well-designed patterns, we propose a novel wavelet transform-based
attack (WABA) method, which can achieve invisible attacks by injecting the
trigger image into the poisoned image in the low-frequency domain. In this way,
the high-frequency information in the trigger image can be filtered out in the
attack, resulting in stealthy data poisoning. Despite its simplicity, the
proposed method can significantly cheat the current state-of-the-art deep
learning models with a high attack success rate. We further analyze how
different trigger images and the hyper-parameters in the wavelet transform
would influence the performance of the proposed method. Extensive experiments
on four benchmark remote sensing datasets demonstrate the effectiveness of the
proposed method for both scene classification and semantic segmentation tasks
and thus highlight the importance of designing advanced backdoor defense
algorithms to address this threat in remote sensing scenarios. The code will be
available online at https://github.com/ndraeger/waba
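The idea of injecting a trigger only in the low-frequency domain can be illustrated with a one-level 2D Haar transform. This is a sketch under assumptions, not the WABA implementation: the choice of the Haar wavelet, the single decomposition level, the blending weight `alpha`, and all function names are hypothetical. Only the low-frequency (LL) sub-band of the trigger is mixed in, so the trigger's high-frequency detail never reaches the poisoned image.

```python
def haar_dwt2(img):
    """One-level 2D Haar transform of an even-sized grayscale image
    given as a list of rows. Returns the LL, LH, HL, HH sub-bands."""
    h, w = len(img), len(img[0])
    ll, lh, hl, hh = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            ll[i // 2][j // 2] = (a + b + c + d) / 2
            lh[i // 2][j // 2] = (a - b + c - d) / 2
            hl[i // 2][j // 2] = (a + b - c - d) / 2
            hh[i // 2][j // 2] = (a - b - c + d) / 2
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2."""
    rows, cols = len(ll), len(ll[0])
    img = [[0.0] * (2 * cols) for _ in range(2 * rows)]
    for i in range(rows):
        for j in range(cols):
            s, x, y, z = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            img[2 * i][2 * j] = (s + x + y + z) / 2
            img[2 * i][2 * j + 1] = (s - x + y - z) / 2
            img[2 * i + 1][2 * j] = (s + x - y - z) / 2
            img[2 * i + 1][2 * j + 1] = (s - x - y + z) / 2
    return img


def inject_low_frequency(clean, trigger, alpha=0.2):
    """Blend only the LL sub-bands (alpha is an assumed hyper-parameter);
    the clean image keeps its own high-frequency sub-bands."""
    c_ll, c_lh, c_hl, c_hh = haar_dwt2(clean)
    t_ll, _, _, _ = haar_dwt2(trigger)
    mixed = [
        [(1 - alpha) * c + alpha * t for c, t in zip(c_row, t_row)]
        for c_row, t_row in zip(c_ll, t_ll)
    ]
    return haar_idwt2(mixed, c_lh, c_hl, c_hh)
```

With `alpha=0` the function returns the clean image unchanged, and larger values trade stealthiness for a stronger trigger, which is the kind of hyper-parameter sensitivity the abstract says the authors analyze.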
Sketched Multi-view Subspace Learning for Hyperspectral Anomalous Change Detection
In recent years, multi-view subspace learning has been garnering increasing
attention. It aims to capture the inner relationships of the data that are
collected from multiple sources by learning a unified representation. In this
way, comprehensive information from multiple views is shared and preserved for
the generalization processes. As a special branch of temporal series
hyperspectral image (HSI) processing, the anomalous change detection task
focuses on detecting very small changes among different temporal images.
However, when the volume of datasets is very large or the classes are
relatively comprehensive, existing methods may fail to find those changes
between the scenes and yield poor detection results. In this paper,
inspired by the sketched representation and multi-view subspace learning, a
sketched multi-view subspace learning (SMSL) model is proposed for HSI
anomalous change detection. The proposed model preserves major information from
the image pairs and reduces computational complexity by using a sketched
representation matrix. Furthermore, the differences between scenes are
extracted by utilizing the specific regularizer of the self-representation
matrices. To evaluate the detection effectiveness of the proposed SMSL model,
experiments are conducted on a benchmark hyperspectral remote sensing dataset
and a natural hyperspectral dataset, and the results are compared with other
state-of-the-art approaches.
Dsfer-Net: A Deep Supervision and Feature Retrieval Network for Bitemporal Change Detection Using Modern Hopfield Networks
Change detection, as an important application for high-resolution remote
sensing images, aims to monitor and analyze changes in the land surface over
time. With the rapid growth in the quantity of high-resolution remote sensing
data and the complexity of texture features, a number of quantitative deep
learning-based methods have been proposed. Although these methods outperform
traditional change detection methods by extracting deep features and combining
spatial-temporal information, reasonable explanations of how deep features
improve the detection performance are still lacking. In our
investigations, we find that modern Hopfield network layers achieve
considerable performance in semantic understanding. In this paper, we propose
a Deep Supervision and FEature Retrieval network (Dsfer-Net) for bitemporal
change detection. Specifically, the highly representative deep features of
bitemporal images are jointly extracted through a fully convolutional Siamese
network. Based on the sequential geo-information of the bitemporal images, we
then design a feature retrieval module to retrieve the difference feature and
leverage discriminative information in a deeply supervised manner. We also note
that the deeply supervised feature retrieval module provides explainable
evidence of the semantic understanding of the proposed network in its deep
layers. Finally, the end-to-end network aggregates the retrieved features and
feature pairs from different layers. Experiments
conducted on three public datasets (LEVIR-CD, WHU-CD, and CDD) confirm the
superiority of the proposed Dsfer-Net over other state-of-the-art methods. Code
will be available online (https://github.com/ShizhenChang/Dsfer-Net)
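For readers unfamiliar with modern Hopfield networks, the retrieval step they are built on can be sketched in a few lines: a query is replaced by a softmax-weighted average of the stored patterns, with weights given by scaled dot-product similarity. This is the generic update rule only, not the Dsfer-Net feature retrieval module; how the paper maps bitemporal features to queries and stored patterns is not described in the abstract, and the function name and the `beta` default are assumptions.

```python
import math


def hopfield_retrieve(query, patterns, beta=4.0):
    """One modern-Hopfield update: softmax(beta * <pattern, query>) over the
    stored patterns, followed by a weighted average of those patterns."""
    scores = [beta * sum(q * p for q, p in zip(query, pat)) for pat in patterns]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [
        sum(w * pat[k] for w, pat in zip(weights, patterns))
        for k in range(len(query))
    ]
```

A large `beta` makes the softmax sharp, so a noisy query snaps to the nearest stored pattern; a small `beta` returns a blend of several patterns.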
Hyperspectral Remote Sensing Benchmark Database for Oil Spill Detection with an Isolation Forest-Guided Unsupervised Detector
Oil spill detection has attracted increasing attention in recent years since
marine oil spill accidents severely affect environments, natural resources, and
the lives of coastal inhabitants. Hyperspectral remote sensing images provide
rich spectral information which is beneficial for the monitoring of oil spills
in complex ocean scenarios. However, most of the existing approaches are based
on supervised and semi-supervised frameworks to detect oil spills from
hyperspectral images (HSIs), which require a huge amount of effort to annotate
a certain number of high-quality training sets. In this study, we make the
first attempt to develop an unsupervised oil spill detection method based on
isolation forest for HSIs. First, considering that the noise level varies among
different bands, a noise variance estimation method is exploited to evaluate
the noise level of different bands, and the bands corrupted by severe noise are
removed. Second, kernel principal component analysis (KPCA) is employed to
reduce the high dimensionality of the HSIs. Then, the probability of each pixel
belonging to one of the classes of seawater and oil spills is estimated with
the isolation forest, and a set of pseudo-labeled training samples is
automatically produced using the clustering algorithm on the detected
probability. Finally, an initial detection map is obtained by applying
a support vector machine (SVM) to the dimension-reduced data, and then the
initial detection result is further optimized with the extended random walker
(ERW) model so as to improve the detection accuracy of oil spills. Experiments
on airborne hyperspectral oil spill data (HOSD) that we created
demonstrate that the proposed method obtains superior detection performance
with respect to other state-of-the-art detection approaches.
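The pseudo-labeling step, in which per-pixel scores from the isolation forest are clustered into two groups, can be sketched as follows. The abstract does not name the clustering algorithm, so the one-dimensional 2-means used here is an assumption, as are the function name and the convention that label 1 marks the higher-score cluster. Which cluster corresponds to oil and which to seawater is deliberately left open.

```python
def two_means_pseudo_labels(scores, iterations=50):
    """Split 1D scores into a low cluster (label 0) and a high cluster
    (label 1) with plain 2-means, starting from the min and max score."""
    lo, hi = min(scores), max(scores)
    for _ in range(iterations):
        # Assign each score to the nearer of the two current centres.
        labels = [1 if abs(s - hi) < abs(s - lo) else 0 for s in scores]
        group0 = [s for s, lab in zip(scores, labels) if lab == 0]
        group1 = [s for s, lab in zip(scores, labels) if lab == 1]
        new_lo = sum(group0) / len(group0) if group0 else lo
        new_hi = sum(group1) / len(group1) if group1 else hi
        if (new_lo, new_hi) == (lo, hi):
            break  # centres stopped moving
        lo, hi = new_lo, new_hi
    return labels, (lo, hi)
```

The resulting labels would then serve as training targets for the SVM stage described in the abstract.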
Universal Adversarial Defense in Remote Sensing Based on Pre-trained Denoising Diffusion Models
Deep neural networks (DNNs) have achieved tremendous success in many remote
sensing (RS) applications, yet they remain vulnerable to adversarial
perturbations. Unfortunately, current adversarial defense approaches in RS
studies usually suffer from performance fluctuation and unnecessary re-training
costs due to the need for prior knowledge of the adversarial perturbations
among RS data. To circumvent these challenges, we propose a universal
adversarial defense approach in RS imagery (UAD-RS) using pre-trained diffusion
models to defend the common DNNs against multiple unknown adversarial attacks.
Specifically, the generative diffusion models are first pre-trained on
different RS datasets to learn generalized representations in various data
domains. After that, a universal adversarial purification framework is
developed using the forward and reverse process of the pre-trained diffusion
models to purify the perturbations from adversarial samples. Furthermore, an
adaptive noise level selection (ANLS) mechanism is built to capture the optimal
noise level of the diffusion model that can achieve the best purification
results closest to the clean samples according to their Frechet Inception
Distance (FID) in deep feature space. As a result, only a single pre-trained
diffusion model is needed for the universal purification of adversarial samples
on each dataset, which significantly alleviates the re-training efforts and
maintains high performance without prior knowledge of the adversarial
perturbations. Experiments on four heterogeneous RS datasets regarding scene
classification and semantic segmentation verify that UAD-RS outperforms
state-of-the-art adversarial purification approaches with a universal defense
against seven commonly existing adversarial perturbations. Codes and the
pre-trained models are available online (https://github.com/EricYu97/UAD-RS).
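The adaptive noise level selection idea reduces to a search over candidate noise levels: purify the adversarial sample at each level and keep the level whose output lies closest to the clean data. The sketch below shows only that selection loop. `purify` and `distance` are hypothetical callables standing in for the pre-trained diffusion model's forward and reverse process and for the FID computation in deep feature space; neither is implemented here, and the function name is an assumption.

```python
def select_noise_level(adversarial, candidates, purify, distance):
    """Return (best_level, best_distance) over the candidate noise levels.

    purify(sample, level) -> purified sample   (hypothetical, e.g. diffusion)
    distance(purified)    -> float, lower is closer to clean data
                             (hypothetical, e.g. an FID-style score)
    """
    best_level, best_distance = None, float("inf")
    for level in candidates:
        d = distance(purify(adversarial, level))
        if d < best_distance:
            best_level, best_distance = level, d
    return best_level, best_distance
```

Because the loop only queries one fixed purifier, it is consistent with the abstract's claim that a single pre-trained diffusion model per dataset suffices, with no re-training for each attack.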