DC-SPP-YOLO: Dense Connection and Spatial Pyramid Pooling Based YOLO for Object Detection
Although the YOLOv2 approach is extremely fast at object detection, its backbone
network has a weak feature-extraction ability and fails to make full use of
multi-scale local region features, which restricts the improvement of object
detection accuracy. Therefore, this paper proposes a DC-SPP-YOLO (Dense
Connection and Spatial Pyramid Pooling Based YOLO) approach for improving
the object detection accuracy of YOLOv2. Specifically, the dense connection of
convolution layers is employed in the backbone network of YOLOv2 to strengthen
the feature extraction and alleviate the vanishing-gradient problem. Moreover,
an improved spatial pyramid pooling is introduced to pool and concatenate the
multi-scale local region features, so that the network can learn the object
features more comprehensively. The DC-SPP-YOLO model is established and trained
based on a new loss function composed of mean square error and cross entropy,
and the object detection is realized. Experiments demonstrate that the mAP
(mean Average Precision) of the proposed DC-SPP-YOLO on the PASCAL VOC and
UA-DETRAC datasets is higher than that of YOLOv2; by strengthening feature
extraction and using the multi-scale local region features, DC-SPP-YOLO
achieves better object detection accuracy than YOLOv2.
Comment: 23 pages, 9 figures, 9 tables
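The "pool and concatenate" idea in the abstract above can be illustrated with a
minimal sketch. It assumes stride-1 max pooling with windows clipped at the
borders (so each pooled map keeps the input size) and a single 2D feature map;
the function names and kernel sizes are hypothetical and are not taken from the
paper, whose improved spatial pyramid pooling may differ in detail.

```python
def max_pool_same(fmap, k):
    """Max-pool a 2D map with a k x k window, stride 1, keeping the input size.

    Windows are clipped at the borders, which is equivalent to padding
    the map with negative infinity.
    """
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    pooled = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                fmap[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


def spp_concat(fmap, kernel_sizes=(3, 5)):
    """Return the input map followed by one pooled map per kernel size.

    The returned list plays the role of a channel-wise concatenation:
    every entry has the same height and width as the input.
    """
    return [fmap] + [max_pool_same(fmap, k) for k in kernel_sizes]


feature_map = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
channels = spp_concat(feature_map, kernel_sizes=(3,))
```

Because every pooled map has the same spatial size as the input, features
gathered from local regions of different extents line up position by position
and can be stacked as extra channels.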
DeRPN: Taking a further step toward more general object detection
Most current detection methods have adopted anchor boxes as regression
references. However, the detection performance is sensitive to the setting of
the anchor boxes. A proper setting of anchor boxes may vary significantly
across different datasets, which severely limits the universality of the
detectors. To improve the adaptivity of the detectors, in this paper, we
present a novel dimension-decomposition region proposal network (DeRPN) that
can perfectly displace the traditional Region Proposal Network (RPN). DeRPN
utilizes an anchor string mechanism to independently match object widths and
heights, which is conducive to treating variant object shapes. In addition, a
novel scale-sensitive loss is designed to address the imbalanced loss
computations of differently scaled objects, which prevents small objects from
being overwhelmed by larger ones. Comprehensive experiments conducted on both
general object detection datasets (Pascal VOC 2007, 2012 and MS COCO) and scene
text detection datasets (ICDAR 2013 and COCO-Text) all prove that our DeRPN can
significantly outperform RPN. It is worth mentioning that the proposed DeRPN
can be employed directly on different models, tasks, and datasets without any
modifications of hyperparameters or specialized optimization, which further
demonstrates its adaptivity. The code will be released at
https://github.com/HCIILAB/DeRPN.
Comment: 8 pages, 4 figures, 6 tables, accepted to appear in AAAI 201
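As a rough illustration of matching widths and heights independently, the
sketch below assigns each edge of a box to the closest entry of a shared list
of reference lengths. The anchor values and the nearest-in-log-scale rule are
assumptions made for this example only; they are not the matching criterion
defined in the DeRPN paper or its released code.

```python
import math

# Hypothetical reference lengths in pixels (illustrative values only).
ANCHOR_STRING = (16, 32, 64, 128, 256, 512)


def match_edge(length, anchors=ANCHOR_STRING):
    """Return the index of the anchor closest to `length` in log scale."""
    return min(
        range(len(anchors)),
        key=lambda i: abs(math.log(length / anchors[i])),
    )


def match_box(width, height, anchors=ANCHOR_STRING):
    """Match width and height separately, so aspect ratio is never fixed."""
    return anchors[match_edge(width, anchors)], anchors[match_edge(height, anchors)]


# A long, thin box: each edge picks its own reference length.
matched = match_box(30, 500)
```

With a conventional anchor box, a 30 x 500 object would need a box with that
specific aspect ratio to exist in advance; here the two edges are handled
separately, so one list of lengths covers many shapes.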
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an
in-depth analysis of their challenges as well as technical improvements in
recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publication