Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers on object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and the recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publication.
A review of technical factors to consider when designing neural networks for semantic segmentation of Earth Observation imagery
Semantic segmentation (classification) of Earth Observation imagery is a
crucial task in remote sensing. This paper presents a comprehensive review of
technical factors to consider when designing neural networks for this purpose.
The review focuses on Convolutional Neural Networks (CNNs), Recurrent Neural
Networks (RNNs), Generative Adversarial Networks (GANs), and transformer
models, discussing prominent design patterns for these ANN families and their
implications for semantic segmentation. Common pre-processing techniques for
ensuring optimal data preparation are also covered. These include methods for
image normalization and chipping, as well as strategies for addressing data
imbalance in training samples, and techniques for overcoming limited data,
including augmentation techniques, transfer learning, and domain adaptation. By
encompassing both the technical aspects of neural network design and the
data-related considerations, this review provides researchers and practitioners
with a comprehensive and up-to-date understanding of the factors involved in
designing effective neural networks for semantic segmentation of Earth
Observation imagery.
Comment: 145 pages with 32 figures
Solar Power Plant Detection on Multi-Spectral Satellite Imagery using Weakly-Supervised CNN with Feedback Features and m-PCNN Fusion
Most traditional convolutional neural networks (CNNs) implement a bottom-up
(feed-forward) approach for image classification. However, many scientific
studies demonstrate that visual perception in primates relies on both
bottom-up and top-down connections. Therefore, in this work, we propose a CNN
with a feedback structure for solar power plant detection on
middle-resolution satellite images. To express the strength of the top-down
connections, we add a feedback CNN network (FB-Net) to a baseline CNN model
used for solar power plant classification on multi-spectral satellite data.
Moreover, we introduce a method to improve class activation mapping (CAM) for
our FB-Net, which takes advantage of a multi-channel pulse coupled neural network
(m-PCNN) for weakly-supervised localization of the solar power plants from the
features of the proposed FB-Net. For the proposed FB-Net CAM with m-PCNN,
experiments show promising results on both the solar power plant
image classification and detection tasks.
Comment: 9 pages, 9 figures, 4 tables
Dynamic Convolution Self-Attention Network for Land-Cover Classification in VHR Remote-Sensing Images
Current deep convolutional neural networks for very-high-resolution (VHR) remote-sensing image land-cover classification often suffer from two challenges. First, the feature maps extracted by network encoders based on vanilla convolution usually contain a lot of redundant information, which easily causes misclassification of land cover. Moreover, these encoders usually require a large number of parameters and high computational costs. Second, as remote-sensing images are complex and contain many objects with large-scale variances, it is difficult to use the popular feature fusion modules to improve the representation ability of networks. To address the above issues, we propose a dynamic convolution self-attention network (DCSA-Net) for VHR remote-sensing image land-cover classification. The proposed network has two advantages. On the one hand, we designed a lightweight dynamic convolution module (LDCM) by using dynamic convolution and a self-attention mechanism. This module can extract more useful image features than vanilla convolution, avoiding the negative effect of useless feature maps on land-cover classification. On the other hand, we designed a context information aggregation module (CIAM) with a ladder structure to enlarge the receptive field. This module can aggregate multi-scale contextual information from feature maps with different resolutions using a dense connection. Experimental results show that the proposed DCSA-Net is superior to state-of-the-art networks, with higher land-cover classification accuracy, fewer parameters, and lower computational cost. The source code is publicly available.
Funding: National Natural Science Foundation of China (Program No. 61871259, 62271296, 61861024), in part by Natural Science Basic Research Program of Shaanxi (Program No. 2021JC-47), in part by Key Research and Development Program of Shaanxi (Program No. 2022GY-436, 2021ZDLGY08-07), in part by Natural Science Basic Research Program of Shaanxi (Program No. 2022JQ-634, 2022JQ-018), and in part by Shaanxi Joint Laboratory of Artificial Intelligence (No. 2020SS-03)