A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as they relate to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensin
UniHead: Unifying Multi-Perception for Detection Heads
The detection head constitutes a pivotal component within object detectors,
tasked with executing both classification and localization functions.
Regrettably, the commonly used parallel head often lacks omni perceptual
capabilities, such as deformation perception, global perception and cross-task
perception. Although numerous methods attempt to enhance these abilities from a
single aspect, achieving a comprehensive and unified solution remains a
significant challenge. In response to this challenge, we have developed an
innovative detection head, termed UniHead, to unify three perceptual abilities
simultaneously. More precisely, our approach (1) introduces deformation
perception, enabling the model to adaptively sample object features; (2)
proposes a Dual-axial Aggregation Transformer (DAT) to adeptly model long-range
dependencies, thereby achieving global perception; and (3) devises a Cross-task
Interaction Transformer (CIT) that facilitates interaction between the
classification and localization branches, thus aligning the two tasks. As a
plug-and-play method, the proposed UniHead can be conveniently integrated with
existing detectors. Extensive experiments on the COCO dataset demonstrate that
our UniHead can bring significant improvements to many detectors. For instance,
the UniHead can obtain +2.7 AP gains in RetinaNet, +2.9 AP gains in FreeAnchor,
and +2.1 AP gains in GFL. The code will be publicly available. Code Url:
https://github.com/zht8506/UniHead.
Comment: 10 pages, 5 figure