VSSA-NET: Vertical Spatial Sequence Attention Network for Traffic Sign Detection
Although traffic sign detection has been studied for years and great progress
has been made with the rise of deep learning techniques, there are still many
problems remaining to be addressed. For complicated real-world traffic scenes,
there are two main challenges. Firstly, traffic signs are usually small
objects, which makes them more difficult to detect than large ones. Secondly, it
is hard to distinguish false targets which resemble real traffic signs in
complex street scenes without context information. To handle these problems, we
propose a novel end-to-end deep learning method for traffic sign detection in
complex environments. Our contributions are as follows: 1) We propose a
multi-resolution feature fusion network architecture which exploits densely
connected deconvolution layers with skip connections, and can learn more
effective features for small objects; 2) We frame traffic sign
detection as a spatial sequence classification and regression task, and propose
a vertical spatial sequence attention (VSSA) module to gain more context
information for better detection performance. To comprehensively evaluate the
proposed method, we conduct experiments on several traffic sign datasets as well
as a general object detection dataset, and the results show the
effectiveness of our proposed method.
S4ND: Single-Shot Single-Scale Lung Nodule Detection
State-of-the-art lung nodule detection studies rely on computationally
expensive multi-stage frameworks to detect nodules from CT scans. To address
this computational challenge and provide better performance, in this paper we
propose S4ND, a new deep learning based method for lung nodule detection. Our
approach uses a single feed forward pass of a single network for detection and
provides better performance when compared to the current literature. The whole
detection pipeline is designed as a single Convolutional Neural Network
(CNN) with dense connections, trained in an end-to-end manner. S4ND does not
require any further post-processing or user guidance to refine detection
results. Experimentally, we compared our network with the current
state-of-the-art object detection network (SSD) in computer vision as well as
the state-of-the-art published method for lung nodule detection (3D DCNN). We
used publicly available CT scans from the LUNA challenge dataset and showed
that the proposed method outperforms the current literature both in terms of
efficiency and accuracy by achieving an average FROC-score of . We also
provide an in-depth analysis of our proposed network to shed light on the
unclear paradigms of tiny object detection.
Comment: Accepted for publication at MICCAI 2018 (21st International
Conference on Medical Image Computing and Computer Assisted Intervention)
DC-SPP-YOLO: Dense Connection and Spatial Pyramid Pooling Based YOLO for Object Detection
Although the YOLOv2 approach is extremely fast at object detection, its backbone
network has a weak feature extraction ability and fails to make full use of
multi-scale local region features, which restricts the improvement of object
detection accuracy. Therefore, this paper proposes DC-SPP-YOLO (Dense
Connection and Spatial Pyramid Pooling Based YOLO), an approach for improving
the object detection accuracy of YOLOv2. Specifically, the dense connection of
convolution layers is employed in the backbone network of YOLOv2 to strengthen
the feature extraction and alleviate the vanishing-gradient problem. Moreover,
an improved spatial pyramid pooling is introduced to pool and concatenate the
multi-scale local region features, so that the network can learn the object
features more comprehensively. The DC-SPP-YOLO model is established and trained
based on a new loss function composed of mean square error and cross entropy,
and the object detection is realized. Experiments demonstrate that the mAP
(mean Average Precision) of the proposed DC-SPP-YOLO on the PASCAL VOC and
UA-DETRAC datasets is higher than that of YOLOv2; the object detection accuracy
of DC-SPP-YOLO is superior to that of YOLOv2 owing to strengthened feature
extraction and the use of multi-scale local region features.
Comment: 23 pages, 9 figures, 9 tables
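As a rough illustration of the "pool and concatenate" idea described in this abstract, the sketch below applies stride-1 max pooling at several window sizes and stacks the results with the input along the channel axis. It is not the paper's implementation: the function names, the kernel sizes (3 and 5), the border handling (windows clipped at the edges), and the use of plain nested lists instead of tensors are all assumptions made for this example.

```python
def max_pool_same(fmap, k):
    """Stride-1 max pooling over a k x k window, keeping the H x W size.

    Windows are clipped at the borders (an assumption for this sketch).
    """
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                fmap[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def spp_concat(channels, kernel_sizes=(3, 5)):
    """Stack the input channels with their pooled versions (channel axis).

    `channels` is a list of 2D maps; `kernel_sizes` is hypothetical.
    """
    out = list(channels)
    for k in kernel_sizes:
        out.extend(max_pool_same(c, k) for c in channels)
    return out


fmap = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
features = spp_concat([fmap])
print(len(features))  # 1 input channel + 2 pooled channels
```

Because every pooled map keeps the input's height and width, the outputs can be concatenated directly, which is what lets later layers see local regions at several scales at once.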
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publication
StairNet: Top-Down Semantic Aggregation for Accurate One Shot Detection
One-stage object detectors such as SSD or YOLO have already shown promising
accuracy with a small memory footprint and fast speed. However, it is widely
recognized that one-stage detectors have difficulty in detecting small objects
while they are competitive with two-stage methods on large objects. In this
paper, we investigate how to alleviate this problem starting from the SSD
framework. Due to its pyramidal design, the lower layer that is responsible
for small objects lacks strong semantics (e.g., contextual information). We
address this problem by introducing a feature combining module that spreads out
the strong semantics in a top-down manner. Our final model, the StairNet detector,
unifies the multi-scale representations and semantic distribution effectively.
Experiments on PASCAL VOC 2007 and PASCAL VOC 2012 datasets demonstrate that
StairNet significantly mitigates the weakness of SSD and outperforms other
state-of-the-art one-stage detectors.
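The top-down spreading of semantics described in this abstract can be sketched as follows. This is only a minimal illustration, not the StairNet feature combining module: nearest-neighbour 2x upsampling and element-wise addition are stand-ins chosen for the example, and the function names and the fine-to-coarse list layout are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D map (a stand-in operator)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def top_down_merge(pyramid):
    """Propagate coarse features down to finer levels.

    `pyramid` lists 2D maps from fine to coarse, each level half the
    height and width of the previous one (an assumption of this sketch).
    """
    merged = [pyramid[-1]]  # the coarsest level is kept unchanged
    for fine in reversed(pyramid[:-1]):
        up = upsample2x(merged[0])
        combined = [[a + b for a, b in zip(r1, r2)]
                    for r1, r2 in zip(fine, up)]
        merged.insert(0, combined)
    return merged


pyramid = [
    [[1] * 4 for _ in range(4)],  # fine level, 4 x 4
    [[10, 20], [30, 40]],         # middle level, 2 x 2
    [[100]],                      # coarse level, 1 x 1
]
merged = top_down_merge(pyramid)
print(merged[1])
```

After the merge, each finer level carries a contribution from every coarser level above it, which is the sense in which the lower layers gain stronger semantics.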