VSSA-NET: Vertical Spatial Sequence Attention Network for Traffic Sign Detection
Although traffic sign detection has been studied for years and great progress
has been made with the rise of deep learning techniques, there are still many
problems remaining to be addressed. For complicated real-world traffic scenes,
there are two main challenges. First, traffic signs are usually small
objects, which makes them harder to detect than large ones. Second, it
is hard to distinguish false targets that resemble real traffic signs in
complex street scenes without context information. To handle these problems, we
propose a novel end-to-end deep learning method for traffic sign detection in
complex environments. Our contributions are as follows: 1) We propose a
multi-resolution feature fusion network architecture which exploits densely
connected deconvolution layers with skip connections, and can learn more
effective features for small objects; 2) We frame traffic sign
detection as a spatial sequence classification and regression task, and propose
a vertical spatial sequence attention (VSSA) module to gain more context
information for better detection performance. To comprehensively evaluate the
proposed method, we run experiments on several traffic sign datasets as well as
a general object detection dataset, and the results show the
effectiveness of the proposed method.
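The abstract does not spell out how the VSSA module is built, so the following is only a minimal sketch of generic soft attention applied along one vertical column of a feature map, to illustrate how a position can gather context from the positions above and below it. The dot-product scoring, the `column` layout, and the `query` vector are all assumptions made for illustration, not the authors' design.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend_over_column(column, query):
    """Soft attention over one image column, read top to bottom.

    column: list of feature vectors, one per vertical position (assumed layout)
    query:  vector of the same length describing what to look for (assumed)
    Returns the attention weights and the weighted-sum context vector.
    """
    # Dot-product score between each vertical position and the query.
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in column]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the column's features.
    context = [sum(w * feat[d] for w, feat in zip(weights, column))
               for d in range(len(query))]
    return weights, context

# Three vertical positions with 2-dimensional features (toy values).
column = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
weights, context = attend_over_column(column, query=[1.0, 1.0])
```

In this toy column the middle position matches the query best, so it receives the largest weight and dominates the context vector.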
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and the recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible publication
Deep supervised learning using local errors
Error backpropagation is a highly effective mechanism for learning
high-quality hierarchical features in deep networks. Updating the features or
weights in one layer, however, requires waiting for the propagation of error
signals from higher layers. Learning using delayed and non-local errors makes
it hard to reconcile backpropagation with the learning mechanisms observed in
biological neural networks, as it requires the neurons to maintain a memory of
the input until the higher-layer errors arrive. In this paper, we
propose an alternative learning mechanism where errors are generated locally in
each layer using fixed, random auxiliary classifiers. Lower layers could thus
be trained independently of higher layers and training could either proceed
layer by layer, or simultaneously in all layers using local error information.
We address biological plausibility concerns such as weight symmetry
requirements and show that the proposed learning mechanism based on fixed,
broad, and random tuning of each neuron to the classification categories
outperforms the biologically-motivated feedback alignment learning technique on
the MNIST, CIFAR10, and SVHN datasets, approaching the performance of standard
backpropagation. Our approach highlights a potential biological mechanism for
the supervised, or task-dependent, learning of feature hierarchies. In
addition, we show that it is well suited for learning deep networks in custom
hardware where it can drastically reduce memory traffic and data communication
overheads.
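The idea of generating errors locally can be sketched in a few lines. This is a hedged toy version, not the paper's implementation: each layer owns a fixed random readout (`aux`) that is never updated, and the squared error at that readout trains only that layer's weights. The toy data, layer sizes, squared loss, learning rate, and the use of the readout's transpose to route the error back into the layer are all assumptions made for illustration.

```python
import copy
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def local_step(x, targets, w, aux, lr):
    """Update only `w`, using the error at this layer's fixed readout `aux`."""
    h = [[max(0.0, v) for v in row] for row in matmul(x, w)]       # ReLU layer
    scores = matmul(h, aux)                                        # fixed random classifier
    err = [[s - t for s, t in zip(sr, tr)] for sr, tr in zip(scores, targets)]
    grad_h = matmul(err, transpose(aux))                           # error stays inside the layer
    grad_h = [[g if a > 0 else 0.0 for g, a in zip(gr, hr)] for gr, hr in zip(grad_h, h)]
    grad_w = matmul(transpose(x), grad_h)
    n = len(x)
    for i in range(len(w)):
        for j in range(len(w[0])):
            w[i][j] -= lr * grad_w[i][j] / n
    loss = 0.5 * sum(e * e for row in err for e in row) / n
    return h, loss

def run_demo(steps=300, lr=0.05, seed=0):
    rng = random.Random(seed)
    def init(rows, cols, std):
        return [[rng.gauss(0, std) for _ in range(cols)] for _ in range(rows)]
    # Toy two-class problem: the label depends on the sign of the first input.
    x = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(32)]
    targets = [[1.0, 0.0] if row[0] > 0 else [0.0, 1.0] for row in x]
    w1, w2 = init(4, 8, 0.1), init(8, 8, 0.1)            # trainable layer weights
    aux1, aux2 = init(8, 2, 0.5), init(8, 2, 0.5)        # fixed random readouts
    frozen = copy.deepcopy((aux1, aux2))
    history = []
    for _ in range(steps):
        # Both layers learn simultaneously; layer 2 sends no error back to layer 1.
        h1, loss1 = local_step(x, targets, w1, aux1, lr)
        _, loss2 = local_step(h1, targets, w2, aux2, lr)
        history.append((loss1, loss2))
    return history, (aux1, aux2) == frozen

history, readouts_unchanged = run_demo()
```

Because no layer waits for an error signal from above, the two `local_step` calls could equally be run one layer at a time, matching the layer-by-layer option described in the abstract.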
Perceptual Generative Adversarial Networks for Small Object Detection
Detecting small objects is notoriously challenging due to their low
resolution and noisy representation. Existing object detection pipelines
usually detect small objects through learning representations of all the
objects at multiple scales. However, the performance gain of such ad hoc
architectures is usually too limited to justify the computational cost. In this
work, we address the small object detection problem by developing a single
architecture that internally lifts representations of small objects to
"super-resolved" ones that have characteristics similar to those of large objects and
are thus more discriminative for detection. For this purpose, we propose a new
Perceptual Generative Adversarial Network (Perceptual GAN) model that improves
small object detection by narrowing the representation difference between small
objects and large ones. Specifically, its generator learns to transfer
perceived poor representations of the small objects to super-resolved ones that
are similar enough to real large objects to fool a competing discriminator.
Meanwhile, its discriminator competes with the generator to identify the
generated representation and imposes an additional perceptual requirement on
the generator: generated representations of small objects must be beneficial
for detection. Extensive evaluations on the challenging
Tsinghua-Tencent 100K and Caltech benchmarks demonstrate the
superiority of Perceptual GAN in detecting small objects, including traffic
signs and pedestrians, over well-established state-of-the-art methods.
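The two pressures on the generator described above can be written as a single objective. The sketch below is an assumption-laden illustration rather than the paper's loss: it uses the common non-saturating adversarial term, and `d_real_prob`, `detection_loss`, and `weight` are hypothetical names for the discriminator's "looks like a large object" probability, the detection loss on the generated representation, and a balancing factor.

```python
import math

def generator_objective(d_real_prob, detection_loss, weight=1.0):
    """Combine the adversarial and perceptual pressures on the generator.

    d_real_prob:    discriminator's probability that the generated
                    representation comes from a real large object (assumed input)
    detection_loss: loss of the detection branch on that representation
                    (assumed input)
    weight:         hypothetical balance between the two terms
    """
    # Adversarial term: small when the discriminator is fooled.
    adversarial = -math.log(max(d_real_prob, 1e-12))
    # Perceptual term: small when the representation helps detection.
    return adversarial + weight * detection_loss

fooled_and_useful = generator_objective(0.9, 0.1)
not_fooled = generator_objective(0.1, 0.1)
fooled_but_useless = generator_objective(0.9, 1.0)
```

The objective is lowest only when the generated representation both passes for a large object and supports detection, which is the combined requirement the abstract describes.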