Synthetic Aperture Radar (SAR) Meets Deep Learning
This reprint focuses on the combination of synthetic aperture radar (SAR) and deep learning technology, aiming to further promote the development of intelligent SAR image interpretation. Synthetic aperture radar is an important active microwave imaging sensor whose all-day, all-weather operating capability gives it an important place in the remote sensing community. Since the United States launched the first SAR satellite, SAR has received much attention in remote sensing, e.g., in geological exploration, topographic mapping, disaster forecasting, and traffic monitoring. It is therefore valuable and meaningful to study SAR-based remote sensing applications. In recent years, deep learning, represented by convolutional neural networks, has driven significant progress in the computer vision community, e.g., in face recognition, driverless vehicles, and the Internet of Things (IoT). Deep learning enables computational models with multiple processing layers to learn data representations at multiple levels of abstraction, which can greatly improve the performance of various applications. This reprint provides a platform for researchers to address these significant challenges and present innovative, cutting-edge research on applying deep learning to SAR in various manuscript types, e.g., articles, letters, reviews, and technical reports.
EMC2A-Net: An Efficient Multibranch Cross-channel Attention Network for SAR Target Classification
In recent years, convolutional neural networks (CNNs) have shown great potential in synthetic aperture radar (SAR) target recognition. SAR images have a strong sense of granularity and contain texture features at different scales, such as speckle noise, target dominant scatterers, and target contours, which are rarely considered in traditional CNN models. This paper proposes two residual blocks, namely EMC2A blocks with multiscale receptive fields (RFs), based on a multibranch structure, and then designs an efficient isotopic-architecture deep CNN (DCNN), EMC2A-Net. EMC2A blocks utilize parallel dilated convolutions with different dilation rates, which can effectively capture multiscale context features without significantly increasing the computational burden. To further improve the efficiency of multiscale feature fusion, this paper proposes a multiscale feature cross-channel attention module, namely the EMC2A module, which adopts a local multiscale feature interaction strategy without dimensionality reduction. This strategy adaptively adjusts the weight of each channel through an efficient one-dimensional (1D) circular convolution and a sigmoid function to guide attention at the global channel-wise level. Comparative results on the MSTAR dataset show that EMC2A-Net outperforms existing models of the same type and has a relatively lightweight network structure. Ablation experiment results show that the EMC2A module significantly improves model performance using only a few parameters and appropriate cross-channel interactions.
Comment: 15 pages, 9 figures. Submitted to IEEE Transactions on Geoscience and Remote Sensing, 202
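The abstract's two central ideas can be sketched in PyTorch: parallel dilated convolutions for multiscale context, and channel attention via a 1D circular convolution and a sigmoid, without dimensionality reduction. This is an illustrative approximation, not the paper's exact block; the class names `MultiDilatedBlock` and `CircularChannelAttention`, the dilation rates, and the fusion-by-sum are assumptions.

```python
# Hypothetical sketch: multiscale dilated branches + circular-conv channel attention.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CircularChannelAttention(nn.Module):
    """Channel attention via 1D circular convolution, no dimensionality reduction."""
    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=0, bias=False)
        self.pad = kernel_size // 2

    def forward(self, x):                      # x: (B, C, H, W)
        w = x.mean(dim=(2, 3)).unsqueeze(1)    # global average pool -> (B, 1, C)
        w = F.pad(w, (self.pad, self.pad), mode="circular")  # wrap around channels
        w = torch.sigmoid(self.conv(w))        # per-channel weights in (0, 1)
        return x * w.transpose(1, 2).unsqueeze(-1)  # broadcast over H, W

class MultiDilatedBlock(nn.Module):
    """Residual block whose parallel dilated branches capture multiscale context."""
    def __init__(self, channels: int, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=r, dilation=r, bias=False)
            for r in rates
        )
        self.attn = CircularChannelAttention()

    def forward(self, x):
        y = sum(b(x) for b in self.branches)   # fuse multiscale features by summation
        return F.relu(x + self.attn(y))        # residual connection

x = torch.randn(2, 16, 32, 32)
out = MultiDilatedBlock(16)(x)
print(out.shape)  # torch.Size([2, 16, 32, 32])
```

Because each branch pads by its dilation rate, all branches preserve spatial size, so they can be summed directly; the attention step then reweights channels before the residual addition.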
TAI-SARNET: Deep Transferred Atrous-Inception CNN for Small Samples SAR ATR
Since Synthetic Aperture Radar (SAR) targets are full of coherent speckle noise, traditional deep learning models struggle to effectively extract key features of the targets and suffer from high computational complexity. To solve this problem, an effective lightweight Convolutional Neural Network (CNN) model incorporating transfer learning is proposed to better handle SAR target recognition tasks. In this work, we first propose the Atrous-Inception module, which combines atrous convolution and the inception module to obtain rich global receptive fields while strictly controlling the parameter count and realizing a lightweight network architecture. Second, a transfer learning strategy is used to effectively transfer prior knowledge from optical, non-optical, and hybrid optical/non-optical domains to SAR target recognition tasks, thereby improving the model's recognition performance on small-sample SAR target datasets. Finally, the model constructed in this paper achieves a 97.97% recognition rate on ten classes of the MSTAR dataset under standard operating conditions, reaching a mainstream target recognition rate. Meanwhile, the presented method shows strong robustness and generalization performance on small, randomly sampled SAR target datasets.
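An "Atrous-Inception" style module can be sketched as an inception-like set of parallel branches in which the larger receptive fields come from atrous (dilated) 3x3 convolutions rather than larger kernels, keeping the parameter count low. The branch choices and dilation rates below are assumptions for illustration, not the paper's exact design.

```python
# Hedged sketch: inception-style parallel branches with atrous convolutions.
import torch
import torch.nn as nn

class AtrousInception(nn.Module):
    def __init__(self, in_ch: int, branch_ch: int):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, branch_ch, 1)                         # 1x1
        self.b2 = nn.Conv2d(in_ch, branch_ch, 3, padding=1)              # plain 3x3
        self.b3 = nn.Conv2d(in_ch, branch_ch, 3, padding=2, dilation=2)  # atrous, rate 2
        self.b4 = nn.Conv2d(in_ch, branch_ch, 3, padding=4, dilation=4)  # atrous, rate 4

    def forward(self, x):
        # Concatenate branches along the channel axis, as in Inception.
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)

m = AtrousInception(8, 4)
y = m(torch.randn(1, 8, 28, 28))
print(y.shape)  # torch.Size([1, 16, 28, 28])
```

Note that a rate-4 atrous 3x3 branch sees a 9x9 neighborhood with only 3x3-kernel parameters, which is how the module widens the receptive field without growing the parameter count.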
High-order Spatial Interactions Enhanced Lightweight Model for Optical Remote Sensing Image-based Small Ship Detection
Accurate and reliable optical remote sensing image-based small-ship detection
is crucial for maritime surveillance systems, but existing methods often
struggle with balancing detection performance and computational complexity. In
this paper, we propose a novel lightweight framework called
\textit{HSI-ShipDetectionNet} that is based on high-order spatial interactions
and is suitable for deployment on resource-limited platforms, such as
satellites and unmanned aerial vehicles. HSI-ShipDetectionNet includes a
prediction branch specifically for tiny ships and a lightweight hybrid
attention block for reduced complexity. Additionally, the use of a high-order
spatial interactions module improves advanced feature understanding and
modeling ability. Our model is evaluated using the public Kaggle marine ship
detection dataset and compared with multiple state-of-the-art models including
small object detection models, lightweight detection models, and ship detection
models. The results show that HSI-ShipDetectionNet outperforms the other models in terms of recall and mean average precision (mAP), while being lightweight and suitable for deployment on resource-limited platforms.
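The "high-order spatial interactions" the abstract refers to can be sketched in the spirit of recursive gated convolution (gnConv): projected feature groups gate one another through repeated element-wise products after depthwise spatial mixing, so each product raises the interaction order. This is an illustrative approximation, not the paper's exact module; the class name and the fixed order of 3 are assumptions.

```python
# Minimal gnConv-style sketch of high-order spatial interactions.
import torch
import torch.nn as nn

class HighOrderInteraction(nn.Module):
    def __init__(self, channels: int, order: int = 3):
        super().__init__()
        self.order = order
        self.proj_in = nn.Conv2d(channels, channels * order, 1)
        self.dw = nn.Conv2d(channels * order, channels * order, 3,
                            padding=1, groups=channels * order)  # depthwise spatial mixing
        self.proj_out = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        parts = self.dw(self.proj_in(x)).chunk(self.order, dim=1)
        y = parts[0]
        for p in parts[1:]:        # each element-wise product raises the order
            y = y * p
        return self.proj_out(y)

blk = HighOrderInteraction(16)
out = blk(torch.randn(2, 16, 32, 32))
print(out.shape)  # torch.Size([2, 16, 32, 32])
```

The depthwise convolution keeps the spatial mixing cheap, which is what makes this kind of block attractive for the resource-limited platforms the paper targets.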
ColabNAS: Obtaining lightweight task-specific convolutional neural networks following Occam's razor
The current trend of applying transfer learning from convolutional neural
networks (CNNs) trained on large datasets can be an overkill when the target
application is a custom and delimited problem, with enough data to train a
network from scratch. On the other hand, training custom and lighter CNNs requires expertise in the from-scratch case and/or high-end resources in the case of hardware-aware neural architecture search (HW NAS), limiting access to the technology for non-habitual NN developers.
For this reason, we present ColabNAS, an affordable HW NAS technique for producing lightweight task-specific CNNs. Its novel derivative-free search strategy, inspired by Occam's razor, obtains state-of-the-art results on the Visual Wake Words dataset, a standard TinyML benchmark, in just 3.1 GPU hours using free online GPU services such as Google Colaboratory and Kaggle Kernels.
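The abstract does not detail ColabNAS's search procedure, so the following is only a generic Occam's-razor-style loop, not the paper's algorithm: start from the smallest candidate network and grow capacity until validation accuracy stops meaningfully improving. Everything here (the `train_and_eval` callback, the width schedule, the tolerance) is hypothetical.

```python
# Generic "smallest model that works" search loop (illustrative only).
def occam_search(widths, train_and_eval, tolerance=0.002):
    """Return the smallest capacity whose accuracy gain over the previous
    candidate still exceeds `tolerance`, stopping at diminishing returns."""
    best_acc, best_w = -1.0, None
    for w in widths:                    # candidates ordered small -> large
        acc = train_and_eval(w)
        if acc > best_acc + tolerance:  # meaningful gain: keep growing
            best_acc, best_w = acc, w
        else:                           # diminishing returns: stop (Occam's razor)
            break
    return best_w, best_acc

# Toy stand-in for training: accuracy saturates as width grows.
accs = {8: 0.80, 16: 0.88, 32: 0.90, 64: 0.901}
w, a = occam_search([8, 16, 32, 64], lambda w: accs[w])
print(w, a)  # 32 0.9
```

Because the loop is derivative-free and each step is a single training run, a search like this fits within the free GPU quotas the abstract mentions.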
RingMo-lite: A Remote Sensing Multi-task Lightweight Network with CNN-Transformer Hybrid Framework
In recent years, remote sensing (RS) vision foundation models such as RingMo
have emerged and achieved excellent performance in various downstream tasks.
However, the high demand for computing resources limits the application of
these models on edge devices. It is necessary to design a more lightweight
foundation model to support on-orbit RS image interpretation. Existing methods
face challenges in achieving lightweight solutions while retaining
generalization in RS image interpretation. This is due to the complex high and
low-frequency spectral components in RS images, which make traditional single
CNN or Vision Transformer methods unsuitable for the task. Therefore, this
paper proposes RingMo-lite, an RS multi-task lightweight network with a
CNN-Transformer hybrid framework, which effectively exploits the
frequency-domain properties of RS to optimize the interpretation process. It combines a Transformer module, acting as a low-pass filter that extracts global features of RS images through a dual-branch structure, with a CNN module acting as a stacked high-pass filter that effectively extracts fine-grained details.
Furthermore, in the pretraining stage, the designed frequency-domain masked
image modeling (FD-MIM) combines each image patch's high-frequency and
low-frequency characteristics, effectively capturing the latent feature
representation in RS data. As shown in Fig. 1, compared with RingMo, the proposed RingMo-lite reduces the number of parameters by over 60% in various RS image interpretation tasks, while the average accuracy drops by less than 2% in most scenes, and it achieves SOTA performance compared with models of similar size. In addition, our work will be integrated into the MindSpore computing platform in the near future.
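The low-pass/high-pass split that motivates RingMo-lite's dual branches can be illustrated with a simple FFT mask: keep frequencies inside a radius as the "low" component for the Transformer-like branch and take the residual as the "high" component for the CNN-like branch. The radius threshold and the hard circular mask are assumptions for illustration; the paper's FD-MIM operates per image patch during pretraining.

```python
# Illustrative frequency split of an image into low- and high-frequency parts.
import torch

def frequency_split(img: torch.Tensor, radius: float = 0.25):
    """Split (B, C, H, W) into (low_freq, high_freq) spatial images."""
    B, C, H, W = img.shape
    spec = torch.fft.fftshift(torch.fft.fft2(img), dim=(-2, -1))
    yy = torch.linspace(-0.5, 0.5, H).view(-1, 1).expand(H, W)
    xx = torch.linspace(-0.5, 0.5, W).view(1, -1).expand(H, W)
    low_mask = ((yy ** 2 + xx ** 2).sqrt() <= radius).to(spec.dtype)
    low = torch.fft.ifft2(torch.fft.ifftshift(spec * low_mask, dim=(-2, -1))).real
    high = img - low                  # residual carries the high frequencies
    return low, high

img = torch.randn(1, 3, 64, 64)
low, high = frequency_split(img)
print(torch.allclose(low + high, img, atol=1e-5))  # True: exact decomposition
```

Defining the high-frequency part as the residual guarantees the two components sum back to the original image, so nothing is lost when the two branches process them separately.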
Remote Sensing Object Detection Meets Deep Learning: A Meta-review of Challenges and Advances
Remote sensing object detection (RSOD), one of the most fundamental and
challenging tasks in the remote sensing field, has received longstanding
attention. In recent years, deep learning techniques have demonstrated robust
feature representation capabilities and led to a big leap in the development of
RSOD techniques. In this era of rapid technical evolution, this review aims to
present a comprehensive review of the recent achievements in deep learning
based RSOD methods. More than 300 papers are covered in this review. We
identify five main challenges in RSOD, including multi-scale object detection,
rotated object detection, weak object detection, tiny object detection, and
object detection with limited supervision, and systematically review the
corresponding methods developed in a hierarchical division manner. We also
review the widely used benchmark datasets and evaluation metrics within the
field of RSOD, as well as the application scenarios for RSOD. Future research directions are provided for further promoting research in RSOD.
Comment: Accepted by IEEE Geoscience and Remote Sensing Magazine. More than 300 papers relevant to the RSOD field were reviewed in this survey.