Small Object Detection Based on Two-Stage Calculation Transformer
Although small object detection has achieved significant improvements, it still suffers from several problems. For example, extracting small-object features is challenging because small objects carry little information in a scene, so the original feature information of a small object may be lost, resulting in poor detection results. To address this problem, this paper proposes a small object detection network based on a two-stage calculation Transformer (TCT). First, a two-stage calculation Transformer is embedded in the backbone feature-extraction network for feature enhancement. Building on the traditional Transformer's value computation, multiple 1D dilated convolutional branches with different feature fusions are used to implement global self-attention, improving feature representation and information interaction. Second, this paper proposes an effective residual connection module that improves on the inefficient convolution and activation of the current CSPLayer, which helps advance the information flow and learn richer contextual details. Finally, this paper proposes a feature fusion and refinement module for fusing multi-scale features and improving the target feature representation capability. Quantitative and qualitative experiments on the PASCAL VOC2007+2012, COCO2017, and TinyPerson datasets show that, compared with YOLOX, the proposed algorithm extracts target features better and achieves higher detection accuracy for small targets.
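To make the value-computation idea concrete, below is a minimal PyTorch sketch of global self-attention whose value projection is built from several 1D dilated-convolution branches fused before attention, as the abstract describes. The module name, dilation rates, and fusion-by-summation choice are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class DilatedValueAttention(nn.Module):
    """Self-attention whose value projection is replaced by fused 1D dilated-conv branches (assumed design)."""
    def __init__(self, dim, num_heads=8, dilations=(1, 2, 4)):
        super().__init__()
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.qk = nn.Linear(dim, dim * 2)
        # One 1D conv branch per dilation rate; padding keeps the token length fixed.
        self.value_branches = nn.ModuleList(
            [nn.Conv1d(dim, dim, kernel_size=3, padding=d, dilation=d) for d in dilations]
        )
        self.fuse = nn.Linear(dim, dim)   # fuses the summed branch outputs
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                 # x: (batch, tokens, dim)
        B, N, C = x.shape
        q, k = self.qk(x).chunk(2, dim=-1)
        # Dilated-conv branches operate over the token axis, then are fused by summation.
        v = sum(branch(x.transpose(1, 2)) for branch in self.value_branches)
        v = self.fuse(v.transpose(1, 2))
        # Standard multi-head global self-attention over the fused values.
        q, k, v = (t.view(B, N, self.num_heads, -1).transpose(1, 2) for t in (q, k, v))
        attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)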
Feature-Fused SSD: Fast Detection for Small Objects
Small object detection is a challenging task in computer vision due to the limited resolution and information of small objects. Most existing methods address this problem by sacrificing speed for improvements in accuracy. In this paper, we aim to detect small objects at high speed, using the Single Shot Multibox Detector (SSD), the best object detector with respect to the accuracy-vs-speed trade-off, as the base architecture. We propose a multi-level feature fusion method that introduces contextual information into SSD in order to improve accuracy on small objects. For the fusion operation, we design two feature fusion modules, a concatenation module and an element-sum module, which differ in how they add contextual information. Experimental results show that these two fusion modules raise mAP on PASCAL VOC2007 over the baseline SSD by 1.6 and 1.7 points respectively, with 2-3 points of improvement on some small-object categories. Their testing speeds are 43 and 40 FPS respectively, exceeding the state-of-the-art Deconvolutional Single Shot Detector (DSSD) by 29.4 and 26.4 FPS. Code is available at https://github.com/wnzhyee/Feature-Fused-SSD. Keywords: small object detection, feature fusion, real-time, single shot multi-box detector. Comment: Artificial Intelligence; 8 pages, 8 figures.
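As a rough illustration of the two fusion styles, the PyTorch sketch below upsamples a deeper, contextual feature map and fuses it with a shallower one either by channel concatenation or by element-wise summation. The channel counts, the deconvolution upsampler, and the typical SSD300 map sizes are assumptions, not the repository's exact code.

import torch
import torch.nn as nn

class ConcatFusion(nn.Module):
    """Upsample the deeper (contextual) map, then fuse by channel concatenation."""
    def __init__(self, shallow_ch, deep_ch, out_ch):
        super().__init__()
        self.up = nn.ConvTranspose2d(deep_ch, shallow_ch, kernel_size=2, stride=2)
        self.reduce = nn.Conv2d(shallow_ch * 2, out_ch, kernel_size=1)

    def forward(self, shallow, deep):
        return self.reduce(torch.cat([shallow, self.up(deep)], dim=1))

class ElementSumFusion(nn.Module):
    """Upsample the deeper map, project both maps to a common width, and add element-wise."""
    def __init__(self, shallow_ch, deep_ch, out_ch):
        super().__init__()
        self.up = nn.ConvTranspose2d(deep_ch, out_ch, kernel_size=2, stride=2)
        self.proj = nn.Conv2d(shallow_ch, out_ch, kernel_size=1)

    def forward(self, shallow, deep):
        return torch.relu(self.proj(shallow) + self.up(deep))

# Usage: fuse a 38x38 shallow map with a 19x19 deeper map (typical SSD300 sizes).
shallow, deep = torch.randn(1, 512, 38, 38), torch.randn(1, 1024, 19, 19)
fused = ConcatFusion(512, 1024, 512)(shallow, deep)   # -> (1, 512, 38, 38)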
Deep learning in remote sensing: a review
Standing at the paradigm shift towards data-intensive science, machine learning techniques are becoming increasingly important. In particular, as a major breakthrough in the field, deep learning has proven to be an extremely powerful tool in many areas. Shall we embrace deep learning as the key to everything? Or should we resist a 'black-box' solution? There are controversial opinions in the remote sensing community. In this article, we analyze the challenges of using deep learning for remote sensing data analysis, review the recent advances, and provide resources that make deep learning in remote sensing ridiculously simple to start with. More importantly, we advocate that remote sensing scientists bring their expertise into deep learning and use it as an implicit general model to tackle unprecedented, large-scale, influential challenges such as climate change and urbanization. Comment: Accepted for publication in IEEE Geoscience and Remote Sensing Magazine.
Convolutional Networks for Object Category and 3D Pose Estimation from 2D Images
Current CNN-based algorithms for recovering the 3D pose of an object in an
image assume knowledge about both the object category and its 2D localization
in the image. In this paper, we relax one of these constraints and propose to
solve the task of joint object category and 3D pose estimation from an image
assuming known 2D localization. We design a new architecture for this task
composed of a feature network that is shared between subtasks, an object
categorization network built on top of the feature network, and a collection of
category-dependent pose regression networks. We also introduce suitable loss
functions and a training method for the new architecture. Experiments on the
challenging PASCAL3D+ dataset show state-of-the-art performance in the joint
categorization and pose estimation task. Moreover, our performance on the joint
task is comparable to the performance of state-of-the-art methods on the
simpler task of 3D pose estimation with known object category.
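A hedged PyTorch sketch of the described architecture follows: a shared feature network feeding a categorization head and a collection of category-dependent pose regressors, with the predicted category selecting the reported pose at test time. The backbone, feature width, and 3-parameter pose output are assumptions; the paper's actual networks and pose parameterization may differ.

import torch
import torch.nn as nn

class JointCategoryPoseNet(nn.Module):
    def __init__(self, num_categories, feat_dim=512, pose_dim=3):
        super().__init__()
        # Shared feature network (a stand-in for the paper's backbone).
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.classifier = nn.Linear(feat_dim, num_categories)
        # One category-dependent pose regressor per object class.
        self.pose_heads = nn.ModuleList(
            [nn.Linear(feat_dim, pose_dim) for _ in range(num_categories)]
        )

    def forward(self, x):
        f = self.features(x)                        # (batch, feat_dim)
        logits = self.classifier(f)                 # category scores
        poses = torch.stack([h(f) for h in self.pose_heads], dim=1)  # (batch, K, pose_dim)
        # At test time, report the pose from the head of the predicted category.
        pred = logits.argmax(dim=1)
        pose = poses[torch.arange(x.size(0)), pred]
        return logits, pose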