105 research outputs found
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as they relate to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensin
X-ModalNet: A Semi-Supervised Deep Cross-Modal Network for Classification of Remote Sensing Data
This paper addresses the problem of semi-supervised transfer learning with
limited cross-modality data in remote sensing. A large amount of multi-modal
earth observation images, such as multispectral imagery (MSI) or synthetic
aperture radar (SAR) data, are openly available on a global scale, enabling
the parsing of global urban scenes through remote sensing imagery. However, their
ability to identify materials (pixel-wise classification) remains limited,
due to the noisy collection environment and poor discriminative information as
well as the limited number of well-annotated training images. To this end, we
propose a novel cross-modal deep-learning framework, called X-ModalNet, with
three well-designed modules: a self-adversarial module, an interactive learning
module, and a label propagation module. The framework learns to transfer more
discriminative information from a small-scale hyperspectral image (HSI) into
the classification task using large-scale MSI or SAR data. Significantly,
X-ModalNet generalizes well, owing to propagating labels on an updatable graph
constructed by high-level features on the top of the network, yielding
semi-supervised cross-modality learning. We evaluate X-ModalNet on two
multi-modal remote sensing datasets (HSI-MSI and HSI-SAR) and achieve a
significant improvement in comparison with several state-of-the-art methods.
Pol-InSAR-Island - A benchmark dataset for multi-frequency Pol-InSAR data land cover classification
This paper presents Pol-InSAR-Island, the first publicly available multi-frequency Polarimetric Interferometric Synthetic Aperture Radar (Pol-InSAR) dataset labeled with detailed land cover classes, which serves as a challenging benchmark dataset for land cover classification. In recent years, machine learning has become a powerful tool for remote sensing image analysis. While there are numerous large-scale benchmark datasets for training and evaluating machine learning models for the analysis of optical data, the availability of labeled SAR or, more specifically, Pol-InSAR data is very limited. The lack of labeled data for training, as well as for testing and comparing different approaches, hinders the rapid development of machine learning algorithms for Pol-InSAR image analysis. The Pol-InSAR-Island benchmark dataset presented in this paper aims to fill this gap. The dataset consists of Pol-InSAR data acquired in S- and L-band by DLR's airborne F-SAR system over the East Frisian island Baltrum. The interferometric image pairs are the result of a repeat-pass measurement with a time offset of several minutes. The image data are given as 6 × 6 coherency matrices in ground range on a 1 m × 1 m grid. Pixel-accurate class labels, consisting of 12 different land cover classes, are generated in a semi-automatic process based on an existing biotope type map and visual interpretation of SAR and optical images. Fixed training and test subsets are defined to ensure the comparability of different approaches trained and tested prospectively on the Pol-InSAR-Island dataset. In addition to the dataset, results of supervised Wishart and Random Forest classifiers that achieve mean Intersection-over-Union scores between 24% and 67% are provided to serve as a baseline for future work. The dataset is provided via KITopenData: https://doi.org/10.35097/170
More Diverse Means Better: Multimodal Deep Learning Meets Remote Sensing Imagery Classification
Classification and identification of the materials lying over or beneath the
Earth's surface have long been a fundamental but challenging research topic in
geoscience and remote sensing (RS) and have garnered growing attention owing to
the recent advancements of deep learning techniques. Although deep networks
have been successfully applied in single-modality-dominated classification
tasks, their performance inevitably hits a bottleneck in complex scenes
that need to be finely classified, due to the limitation of information
diversity. In this work, we provide a baseline solution to the aforementioned
difficulty by developing a general multimodal deep learning (MDL) framework. In
particular, we also investigate a special case of multi-modality learning (MML)
-- cross-modality learning (CML) that exists widely in RS image classification
applications. By focusing on "what", "where", and "how" to fuse, we show
different fusion strategies as well as how to train deep networks and build the
network architecture. Specifically, five fusion architectures are introduced
and developed, and then unified in our MDL framework. More significantly,
our framework is not limited to pixel-wise classification tasks but is also
applicable to spatial information modeling with convolutional neural networks
(CNNs). To validate the effectiveness and superiority of the MDL framework,
extensive experiments related to the settings of MML and CML are conducted on
two different multimodal RS datasets. Furthermore, the codes and datasets will
be available at https://github.com/danfenghong/IEEE_TGRS_MDL-RS, contributing
to the RS community.