Exploiting Deep Features for Remote Sensing Image Retrieval: A Systematic Investigation
Remote sensing (RS) image retrieval is of great significance for geological
information mining. Over the past two decades, a large amount of research on
this task has been carried out, which mainly focuses on the following three
core issues: feature extraction, similarity metric and relevance feedback. Due
to the complexity and multiformity of ground objects in high-resolution remote
sensing (HRRS) images, there is still room for improvement in the current
retrieval approaches. In this paper, we analyze the three core issues of RS
image retrieval and provide a comprehensive review on existing methods.
Furthermore, to advance the state of the art in HRRS image
retrieval, we focus on the feature extraction issue and delve into how to use
powerful deep representations to address this task. We conduct a systematic
investigation of the correlative factors that may affect the performance
of deep features. By optimizing each factor, we acquire remarkable retrieval
results on publicly available HRRS datasets. Finally, we explain the
experimental phenomenon in detail and draw conclusions according to our
analysis. Our work can serve as a guide for research on
content-based RS image retrieval.
Solar Power Plant Detection on Multi-Spectral Satellite Imagery using Weakly-Supervised CNN with Feedback Features and m-PCNN Fusion
Most traditional convolutional neural networks (CNNs) implement a
bottom-up (feed-forward) approach for image classification. However, many
scientific studies demonstrate that visual perception in primates relies on both
bottom-up and top-down connections. Therefore, in this work, we propose a CNN
with a feedback structure for solar power plant detection on
middle-resolution satellite images. To express the strength of the top-down
connections, we introduce a feedback CNN (FB-Net) to a baseline CNN model
used for solar power plant classification on multi-spectral satellite data.
Moreover, we introduce a method to improve class activation mapping (CAM) for
our FB-Net, which takes advantage of a multi-channel pulse coupled neural network
(m-PCNN) for weakly-supervised localization of the solar power plants from the
features of the proposed FB-Net. For the proposed FB-Net CAM with m-PCNN,
experiments show promising results on both solar power plant
image classification and detection tasks.
Comment: 9 pages, 9 figures, 4 tables
Adaptive Self-Training for Object Detection
Deep learning has emerged as an effective solution for solving the task of
object detection in images but at the cost of requiring large labeled datasets.
To mitigate this cost, semi-supervised object detection methods, which
leverage abundant unlabeled data, have been proposed and have already
shown impressive results. However, most of these methods require linking a
pseudo-label to a ground-truth object by thresholding. In previous works, this
threshold value is usually determined empirically, which is time-consuming, and
only done for a single data distribution. When the domain, and thus the data
distribution, changes, a new and costly parameter search is necessary. In this
work, we introduce our method Adaptive Self-Training for Object Detection
(ASTOD), which is a simple yet effective teacher-student method. ASTOD
determines without cost a threshold value based directly on the ground value of
the score histogram. To improve the quality of the teacher predictions, we also
propose a novel pseudo-labeling procedure. We use different views of the
unlabeled images during the pseudo-labeling step to reduce the number of missed
predictions and thus obtain better candidate labels. Our teacher and our
student are trained separately, and our method can be used in an iterative
fashion by replacing the teacher with the student. On the MS-COCO dataset, our
method consistently performs favorably against state-of-the-art methods that do
not require a threshold parameter, and shows competitive results with methods
that require a parameter sweep search. Additional experiments with respect to a
supervised baseline on the DIOR dataset containing satellite images lead to
similar conclusions, and prove that it is possible to adapt the score threshold
automatically in self-training, regardless of the data distribution.
Comment: 10 pages, 4 figures, 5 tables
Few-shot Object Detection in Remote Sensing: Lifting the Curse of Incompletely Annotated Novel Objects
Object detection is an essential and fundamental task in computer vision and
satellite image processing. Existing deep learning methods have achieved
impressive performance thanks to the availability of large-scale annotated
datasets. Yet, in real-world applications the availability of labels is
limited. In this context, few-shot object detection (FSOD) has emerged as a
promising direction, which aims at enabling the model to detect novel objects
with only a few of them annotated. However, many existing FSOD algorithms
overlook a critical issue: when an input image contains multiple novel objects
and only a subset of them are annotated, the unlabeled objects will be
considered as background during training. This can cause confusion and
severely impact the model's ability to recall novel objects. To address this
issue, we propose a self-training-based FSOD (ST-FSOD) approach, which
incorporates the self-training mechanism into the few-shot fine-tuning process.
ST-FSOD aims to enable the discovery of novel objects that are not annotated,
and take them into account during training. On the one hand, we devise a
two-branch region proposal network (RPN) to separate the proposal extraction
of base and novel objects. On the other hand, we incorporate the student-teacher
mechanism into RPN and the region of interest (RoI) head to include those
highly confident yet unlabeled targets as pseudo labels. Experimental results
demonstrate that our proposed method outperforms the state-of-the-art in
various FSOD settings by a large margin. The code will be publicly available
at https://github.com/zhu-xlab/ST-FSOD
Advancing Land Cover Mapping in Remote Sensing with Deep Learning
Automatic mapping of land cover in remote sensing data plays an increasingly significant role in several earth observation (EO) applications, such as sustainable development, autonomous agriculture, and urban planning. Due to the complexity of the real ground surface and environment, accurate classification of land cover types is facing many challenges. This thesis provides novel deep learning-based solutions to land cover mapping challenges such as how to deal with intricate objects and imbalanced classes in multi-spectral and high-spatial resolution remote sensing data.
The first work presents a novel model to learn richer multi-scale and global contextual representations in very high-resolution remote sensing images, namely the dense dilated convolutions' merging (DDCM) network. The proposed method is lightweight, flexible and extendable, so that it can be used as a simple yet effective encoder and decoder module to address different classification and semantic mapping challenges. Intensive experiments on different benchmark remote sensing datasets demonstrate that the proposed method achieves better performance while consuming far fewer computational resources than other published methods.
Next, a novel graph model is developed for capturing long-range pixel dependencies in remote sensing images to improve land cover mapping. One key component in the method is the self-constructing graph (SCG) module that can effectively construct global context relations (latent graph structure) without requiring prior knowledge graphs. The proposed SCG-based models achieved competitive performance on different representative remote sensing datasets with faster training and lower computational cost compared to strong baseline models.
The third work introduces a new framework, namely the multi-view self-constructing graph (MSCG) network, which extends the vanilla SCG model to capture multi-view context representations with rotation invariance and thereby improve segmentation performance. Meanwhile, a novel adaptive class weighting loss function is developed to alleviate the issue of class imbalance commonly found in EO datasets for semantic segmentation. Experiments on benchmark data demonstrate that the proposed framework is computationally efficient and robust, producing improved segmentation results for imbalanced classes.
To address the key challenges in multi-modal land cover mapping of remote sensing data, namely, 'what', 'how' and 'where' to effectively fuse multi-source features and to efficiently learn optimal joint representations of different modalities, the last work presents a compact and scalable multi-modal deep learning framework (MultiModNet) based on two novel modules: the pyramid attention fusion module and the gated fusion unit. The proposed MultiModNet outperforms the strong baselines on two representative remote sensing datasets with fewer parameters and at a lower computational cost. Extensive ablation studies also validate the effectiveness and flexibility of the framework.
A review of technical factors to consider when designing neural networks for semantic segmentation of Earth Observation imagery
Semantic segmentation (classification) of Earth Observation imagery is a
crucial task in remote sensing. This paper presents a comprehensive review of
technical factors to consider when designing neural networks for this purpose.
The review focuses on Convolutional Neural Networks (CNNs), Recurrent Neural
Networks (RNNs), Generative Adversarial Networks (GANs), and transformer
models, discussing prominent design patterns for these ANN families and their
implications for semantic segmentation. Common pre-processing techniques for
ensuring optimal data preparation are also covered. These include methods for
image normalization and chipping, as well as strategies for addressing data
imbalance in training samples, and techniques for overcoming limited data,
including augmentation techniques, transfer learning, and domain adaptation. By
encompassing both the technical aspects of neural network design and the
data-related considerations, this review provides researchers and practitioners
with a comprehensive and up-to-date understanding of the factors involved in
designing effective neural networks for semantic segmentation of Earth
Observation imagery.
Comment: 145 pages with 32 figures