A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we address unsolved challenges and
opportunities as they relate to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing DL models.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensing.
Sparse Signal Models for Data Augmentation in Deep Learning ATR
Automatic Target Recognition (ATR) algorithms classify a given Synthetic
Aperture Radar (SAR) image into one of the known target classes using a set of
training images available for each class. Recently, learning methods have been
shown to achieve state-of-the-art classification accuracy when abundant
training data are available, sampled uniformly over the classes and their
poses. In this
paper, we consider the task of ATR with a limited set of training images. We
propose a data augmentation approach to incorporate domain knowledge and
improve the generalization power of a data-intensive learning algorithm, such
as a convolutional neural network (CNN). The proposed data augmentation method
employs a limited-persistence sparse modeling approach, capitalizing on
commonly observed characteristics of wide-angle SAR
imagery. Specifically, we exploit the sparsity of the scattering centers in the
spatial domain and the smoothly-varying structure of the scattering
coefficients in the azimuthal domain to solve the ill-posed problem of
over-parametrized model fitting. Using this estimated model, we synthesize new
images at poses and sub-pixel translations not available in the given data to
augment the CNN's training data. The experimental results show that in the
training-data-starved regime, the proposed method provides a significant gain
in the resulting ATR algorithm's generalization performance.
Comment: 12 pages, 5 figures, to be submitted to IEEE Transactions on
Geoscience and Remote Sensing.
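The augmentation idea above can be sketched in a few lines. The toy model below is illustrative only (the scatterer positions, amplitude law, and the trivial nearest-pixel "imaging" are assumptions, not the paper's estimator): a target is a handful of point scatterers whose amplitudes vary smoothly with azimuth, so an image at an unobserved azimuth can be synthesized by interpolating those amplitudes.

```python
import numpy as np

# Toy sketch (not the paper's code): model the target as a few point
# scatterers whose amplitudes vary smoothly with azimuth, then synthesize
# an image at an azimuth not present in the training set by interpolating
# those amplitudes.

K = 3                                                     # scattering centers
locs = np.array([[-0.5, -0.5], [0.0, 0.3], [0.6, -0.2]])  # fixed positions
train_az = np.deg2rad([0.0, 10.0, 20.0, 30.0])            # observed azimuths

# smoothly varying scattering amplitudes (here: a simple cosine law)
train_amps = 1.0 + 0.5 * np.cos(train_az)[:, None] * np.ones(K)  # (4, K)

def render(amps, grid=32):
    """Deposit each scatterer's amplitude at its nearest pixel (toy imaging)."""
    img = np.zeros((grid, grid))
    ij = ((locs + 1) / 2 * (grid - 1)).astype(int)
    for k, (i, j) in enumerate(ij):
        img[i, j] += amps[k]
    return img

# "augmentation": interpolate amplitudes to an unseen azimuth (15 degrees)
new_az = np.deg2rad(15.0)
interp_amps = np.array([np.interp(new_az, train_az, train_amps[:, k])
                        for k in range(K)])
augmented = render(interp_amps)
```

The actual method solves a much harder inverse problem, using sparsity in the spatial domain and azimuthal smoothness to regularize an over-parametrized model fit; the sketch only shows the synthesis step once such a model is in hand.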
Deep learning in remote sensing: a review
Standing at the paradigm shift towards data-intensive science, machine
learning techniques are becoming increasingly important. In particular, as a
major breakthrough in the field, deep learning has proven to be an extremely
powerful tool in many areas. Shall we embrace deep learning as the key to everything?
Or, should we resist a 'black-box' solution? There are controversial opinions
in the remote sensing community. In this article, we analyze the challenges of
using deep learning for remote sensing data analysis, review the recent
advances, and provide resources to make deep learning in remote sensing
ridiculously simple to start with. More importantly, we encourage remote
sensing scientists to bring their expertise into deep learning and use it as an
implicit general model to tackle unprecedented, large-scale, influential
challenges such as climate change and urbanization.
Comment: Accepted for publication in IEEE Geoscience and Remote Sensing
Magazine.
X-ModalNet: A Semi-Supervised Deep Cross-Modal Network for Classification of Remote Sensing Data
This paper addresses the problem of semi-supervised transfer learning with
limited cross-modality data in remote sensing. A large amount of multi-modal
earth observation images, such as multispectral imagery (MSI) or synthetic
aperture radar (SAR) data, are openly available on a global scale, enabling
parsing global urban scenes through remote sensing imagery. However, their
ability to identify materials (pixel-wise classification) remains limited,
owing to noisy collection environments, poor discriminative information, and
the limited number of well-annotated training images. To this end, we
propose a novel cross-modal deep-learning framework, called X-ModalNet, with
three well-designed modules (a self-adversarial module, an interactive
learning module, and a label propagation module) that learns to transfer
discriminative information from a small-scale hyperspectral image (HSI) into a
classification task on large-scale MSI or SAR data. Significantly,
X-ModalNet generalizes well, owing to propagating labels on an updatable graph
constructed by high-level features on the top of the network, yielding
semi-supervised cross-modality learning. We evaluate X-ModalNet on two
multi-modal remote sensing datasets (HSI-MSI and HSI-SAR) and achieve a
significant improvement in comparison with several state-of-the-art methods.
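The label-propagation component described above can be illustrated with classic graph-based semi-supervised learning. The sketch below is a minimal stand-in (a fixed Gaussian-affinity graph over toy features, not X-ModalNet's updatable graph built from network features): labels of a few annotated nodes are iteratively diffused to their unlabeled neighbors.

```python
import numpy as np

# Illustrative sketch (not X-ModalNet's code): propagate labels on a graph
# built from high-level features. Edges come from a Gaussian affinity;
# label distributions of unlabeled nodes are repeatedly averaged from
# their neighbors while labeled nodes stay clamped.

rng = np.random.default_rng(1)

# toy "high-level features": two well-separated clusters of 10 nodes each
feats = np.vstack([rng.normal(0, 0.3, (10, 4)),
                   rng.normal(3, 0.3, (10, 4))])
labels = -np.ones(20, dtype=int)   # -1 means unlabeled
labels[0], labels[10] = 0, 1       # one labeled sample per class

# Gaussian affinity matrix, row-normalized into transition probabilities
d2 = ((feats[:, None] - feats[None]) ** 2).sum(-1)
W = np.exp(-d2 / 0.5)
np.fill_diagonal(W, 0.0)
P = W / W.sum(1, keepdims=True)

# iterate: spread label distributions, then re-clamp the labeled nodes
Y = np.zeros((20, 2))
Y[labels >= 0, labels[labels >= 0]] = 1.0
for _ in range(50):
    Y = P @ Y
    Y[labels >= 0] = 0.0
    Y[labels >= 0, labels[labels >= 0]] = 1.0

pred = Y.argmax(1)
```

With well-separated feature clusters, the two seed labels diffuse across their respective clusters, which is the mechanism the abstract credits for X-ModalNet's semi-supervised generalization.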
More Diverse Means Better: Multimodal Deep Learning Meets Remote Sensing Imagery Classification
Classification and identification of the materials lying over or beneath the
Earth's surface have long been a fundamental but challenging research topic in
geoscience and remote sensing (RS), and have attracted growing interest owing
to recent advances in deep learning techniques. Although deep networks have
been successfully applied in single-modality-dominated classification tasks,
their performance inevitably hits a bottleneck in complex scenes that require
fine-grained classification, owing to limited information diversity. In this
work, we provide a baseline solution to the aforementioned
difficulty by developing a general multimodal deep learning (MDL) framework. In
particular, we also investigate a special case of multi-modality learning (MML)
-- cross-modality learning (CML) that exists widely in RS image classification
applications. By focusing on "what", "where", and "how" to fuse, we show
different fusion strategies as well as how to train deep networks and build the
network architecture. Specifically, five fusion architectures are introduced
and developed, further being unified in our MDL framework. More significantly,
our framework is not only limited to pixel-wise classification tasks but also
applicable to spatial information modeling with convolutional neural networks
(CNNs). To validate the effectiveness and superiority of the MDL framework,
extensive experiments related to the settings of MML and CML are conducted on
two different multimodal RS datasets. Furthermore, the codes and datasets will
be available at https://github.com/danfenghong/IEEE_TGRS_MDL-RS, contributing
to the RS community.
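The "what/where/how to fuse" question above can be made concrete with two of the simplest strategies. The sketch below uses toy linear "networks" with random weights (layer sizes, weights, and the sum-based late fusion are illustrative assumptions, not the paper's five architectures):

```python
import numpy as np

# Minimal sketch of two fusion strategies for multimodal RS features;
# shapes and weights are illustrative, not the MDL framework's design.

rng = np.random.default_rng(0)
hsi = rng.normal(size=(5, 144))   # 5 pixels, 144 spectral bands
sar = rng.normal(size=(5, 2))     # 5 pixels, 2 SAR channels (e.g., VV, VH)

relu = lambda x: np.maximum(x, 0)

# early fusion: concatenate raw modalities, then one shared network
W_early = rng.normal(size=(146, 8))
feat_early = relu(np.hstack([hsi, sar]) @ W_early)

# late fusion: modality-specific networks, then fuse the learned features
W_hsi = rng.normal(size=(144, 8))
W_sar = rng.normal(size=(2, 8))
feat_late = relu(hsi @ W_hsi) + relu(sar @ W_sar)   # sum-based fusion
```

Early fusion lets the network see cross-modal interactions from the first layer but mixes very different statistics; late fusion keeps modality-specific encoders, which is often preferred when the modalities are as heterogeneous as HSI and SAR.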
Learning transformer-based heterogeneously salient graph representation for multimodal fusion classification of hyperspectral image and LiDAR data
Data collected by different modalities can provide a wealth of complementary
information, such as hyperspectral image (HSI) to offer rich spectral-spatial
properties, synthetic aperture radar (SAR) to provide structural information
about the Earth's surface, and light detection and ranging (LiDAR) to cover
altitude information about ground elevation. Therefore, a natural idea is to
combine multimodal images for refined and accurate land-cover interpretation.
Although many attempts have been made at multi-source remote
sensing image classification, three issues remain: 1)
indiscriminate feature representation without sufficiently considering modal
heterogeneity, 2) abundant features and complex computations associated with
modeling long-range dependencies, and 3) overfitting phenomenon caused by
sparsely labeled samples. To overcome the above barriers, a transformer-based
heterogeneously salient graph representation (THSGR) approach is proposed in
this paper. First, a multimodal heterogeneous graph encoder is presented to
encode distinctively non-Euclidean structural features from heterogeneous data.
Then, a self-attention-free multi-convolutional modulator is designed for
effective and efficient long-term dependency modeling. Finally, a mean-forward
module is introduced to avoid overfitting. Based on the above structures,
the proposed model is able to break through modal gaps to obtain differentiated
graph representation with competitive time cost, even for a small fraction of
training samples. Experiments and analyses on three benchmark datasets,
compared against various state-of-the-art (SOTA) methods, demonstrate the
effectiveness of the proposed approach.
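The attention-free long-range modeling mentioned above can be approximated with a standard trick: stacking dilated convolutions so the receptive field grows geometrically without any pairwise attention. The sketch below is a generic illustration of that trick (kernel sizes and dilations are assumptions; it is not the paper's multi-convolutional modulator):

```python
import numpy as np

# Generic sketch: long-range dependency modeling without self-attention,
# via stacked dilated 1-D convolutions over a sequence of node features.

def dilated_conv1d(x, kernel, dilation):
    """'same'-padded, centered 1-D convolution with the given dilation."""
    n, k = len(x), len(kernel)
    pad = dilation * (k // 2)
    xp = np.pad(x, pad)
    return np.array([sum(kernel[j] * xp[i + j * dilation] for j in range(k))
                     for i in range(n)])

x = np.zeros(64)
x[32] = 1.0                     # unit impulse to probe the receptive field
kernel = np.ones(3) / 3.0       # simple averaging kernel, sums to 1

# stacking dilations 1, 2, 4 grows the receptive field to 15 samples,
# versus 7 for three undilated 3-tap convolutions
y = x
for d in (1, 2, 4):
    y = dilated_conv1d(y, kernel, d)

receptive = np.count_nonzero(y)
```

Each extra layer multiplies the span of context linearly in its dilation rather than quadratically in compute, which is the usual argument for convolutional substitutes for attention on large graphs or sequences.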
PRECONDITIONING AND THE APPLICATION OF CONVOLUTIONAL NEURAL NETWORKS TO CLASSIFY MOVING TARGETS IN SAR IMAGERY
Synthetic Aperture Radar (SAR) is a technique that transmits pulses and then stores and combines the scene echoes to build an image representing the scene reflectivity. SAR systems can be found on a wide variety of platforms, including satellites, aircraft, and, more recently, unmanned platforms like the Global Hawk unmanned aerial vehicle. The next step is to process, analyze, and classify the SAR data. The use of a convolutional neural network (CNN) to analyze SAR imagery is a viable method to achieve Automatic Target Recognition (ATR) in military applications. The CNN is an artificial neural network that uses convolutional layers to detect certain features in an image. These features correspond to a target of interest and are used to train the CNN to recognize and classify future images. Moving targets present a major challenge to current SAR ATR methods due to the "smearing" effect in the image. Past research has shown that the combination of autofocus techniques and proper training with moving targets improves the accuracy of the CNN at target recognition. The current research includes improvement of the CNN algorithm and preconditioning techniques, as well as a deeper analysis of moving targets with complex motion such as changes to roll, pitch, or yaw. The CNN algorithm was developed and verified using computer simulation.
Comment: Lieutenant, United States Navy. Approved for public release; distribution is unlimited.
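The "smearing" effect from moving targets can be illustrated with a standard SAR approximation: uncompensated target motion appears as a phase error across the slow-time (pulse) dimension, so the azimuth-compressed response of a point target spreads over many bins. The sketch below is a toy illustration under that assumption (the quadratic phase error models, e.g., constant along-track acceleration; it is not the thesis code):

```python
import numpy as np

# Toy illustration: a quadratic phase error across slow time defocuses
# ("smears") the azimuth response of an ideal point target.

n = 128
phase_history = np.ones(n, dtype=complex)   # ideal stationary point target

m = np.arange(n) - n / 2
quad_error = np.exp(1j * 2 * np.pi * 8 * (m / n) ** 2)  # motion phase error

# azimuth compression is (to first order) an FFT over slow time
focused = np.abs(np.fft.fftshift(np.fft.fft(phase_history)))
smeared = np.abs(np.fft.fftshift(np.fft.fft(phase_history * quad_error)))

def width(img, frac=0.1):
    """Number of azimuth bins above a fraction of the peak."""
    return int((img > frac * img.max()).sum())
```

The focused response occupies a single azimuth bin, while the smeared one spreads over roughly a dozen; autofocus estimates and removes such phase errors, which is why it helps CNN-based ATR on moving targets.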