396 research outputs found
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, of advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as it relates to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensin
Hierarchical Disentanglement-Alignment Network for Robust SAR Vehicle Recognition
Vehicle recognition is a fundamental problem in SAR image interpretation.
However, robustly recognizing vehicle targets is a challenging task in SAR due
to the large intraclass variations and small interclass variations.
Additionally, the lack of large datasets further complicates the task. Inspired
by the analysis of target signature variations and deep learning
explainability, this paper proposes a novel domain alignment framework named
the Hierarchical Disentanglement-Alignment Network (HDANet) to achieve
robustness under various operating conditions. Concisely, HDANet integrates
feature disentanglement and alignment into a unified framework with three
modules: domain data generation, multitask-assisted mask disentanglement, and
domain alignment of target features. The first module generates diverse data
for alignment, and three simple but effective data augmentation methods are
designed to simulate target signature variations. The second module
disentangles the target features from background clutter using the
multitask-assisted mask to prevent clutter from interfering with subsequent
alignment. The third module employs a contrastive loss for domain alignment to
extract robust target features from generated diverse data and disentangled
features. Lastly, the proposed method demonstrates impressive robustness across
nine operating conditions in the MSTAR dataset, and extensive qualitative and
quantitative analyses validate the effectiveness of our framework
Deep learning in remote sensing: a review
Standing at the paradigm shift towards data-intensive science, machine
learning techniques are becoming increasingly important. In particular, as a
major breakthrough in the field, deep learning has proven as an extremely
powerful tool in many fields. Shall we embrace deep learning as the key to all?
Or, should we resist a 'black-box' solution? There are controversial opinions
in the remote sensing community. In this article, we analyze the challenges of
using deep learning for remote sensing data analysis, review the recent
advances, and provide resources to make deep learning in remote sensing
ridiculously simple to start with. More importantly, we advocate remote sensing
scientists to bring their expertise into deep learning, and use it as an
implicit general model to tackle unprecedented large-scale influential
challenges, such as climate change and urbanization.Comment: Accepted for publication IEEE Geoscience and Remote Sensing Magazin
Near-field Perception for Low-Speed Vehicle Automation using Surround-view Fisheye Cameras
Cameras are the primary sensor in automated driving systems. They provide
high information density and are optimal for detecting road infrastructure cues
laid out for human vision. Surround-view camera systems typically comprise of
four fisheye cameras with 190{\deg}+ field of view covering the entire
360{\deg} around the vehicle focused on near-field sensing. They are the
principal sensors for low-speed, high accuracy, and close-range sensing
applications, such as automated parking, traffic jam assistance, and low-speed
emergency braking. In this work, we provide a detailed survey of such vision
systems, setting up the survey in the context of an architecture that can be
decomposed into four modular components namely Recognition, Reconstruction,
Relocalization, and Reorganization. We jointly call this the 4R Architecture.
We discuss how each component accomplishes a specific aspect and provide a
positional argument that they can be synergized to form a complete perception
system for low-speed automation. We support this argument by presenting results
from previous works and by presenting architecture proposals for such a system.
Qualitative results are presented in the video at https://youtu.be/ae8bCOF77uY.Comment: Accepted for publication at IEEE Transactions on Intelligent
Transportation System
Recent Advances of Local Mechanisms in Computer Vision: A Survey and Outlook of Recent Work
Inspired by the fact that human brains can emphasize discriminative parts of
the input and suppress irrelevant ones, substantial local mechanisms have been
designed to boost the development of computer vision. They can not only focus
on target parts to learn discriminative local representations, but also process
information selectively to improve the efficiency. In terms of application
scenarios and paradigms, local mechanisms have different characteristics. In
this survey, we provide a systematic review of local mechanisms for various
computer vision tasks and approaches, including fine-grained visual
recognition, person re-identification, few-/zero-shot learning, multi-modal
learning, self-supervised learning, Vision Transformers, and so on.
Categorization of local mechanisms in each field is summarized. Then,
advantages and disadvantages for every category are analyzed deeply, leaving
room for exploration. Finally, future research directions about local
mechanisms have also been discussed that may benefit future works. To the best
our knowledge, this is the first survey about local mechanisms on computer
vision. We hope that this survey can shed light on future research in the
computer vision field
A review of technical factors to consider when designing neural networks for semantic segmentation of Earth Observation imagery
Semantic segmentation (classification) of Earth Observation imagery is a
crucial task in remote sensing. This paper presents a comprehensive review of
technical factors to consider when designing neural networks for this purpose.
The review focuses on Convolutional Neural Networks (CNNs), Recurrent Neural
Networks (RNNs), Generative Adversarial Networks (GANs), and transformer
models, discussing prominent design patterns for these ANN families and their
implications for semantic segmentation. Common pre-processing techniques for
ensuring optimal data preparation are also covered. These include methods for
image normalization and chipping, as well as strategies for addressing data
imbalance in training samples, and techniques for overcoming limited data,
including augmentation techniques, transfer learning, and domain adaptation. By
encompassing both the technical aspects of neural network design and the
data-related considerations, this review provides researchers and practitioners
with a comprehensive and up-to-date understanding of the factors involved in
designing effective neural networks for semantic segmentation of Earth
Observation imagery.Comment: 145 pages with 32 figure
Deep Learning for Time Series Classification and Extrinsic Regression: A Current Survey
Time Series Classification and Extrinsic Regression are important and
challenging machine learning tasks. Deep learning has revolutionized natural
language processing and computer vision and holds great promise in other fields
such as time series analysis where the relevant features must often be
abstracted from the raw data but are not known a priori. This paper surveys the
current state of the art in the fast-moving field of deep learning for time
series classification and extrinsic regression. We review different network
architectures and training methods used for these tasks and discuss the
challenges and opportunities when applying deep learning to time series data.
We also summarize two critical applications of time series classification and
extrinsic regression, human activity recognition and satellite earth
observation
- …