CoupleNet: Coupling Global Structure with Local Parts for Object Detection
The region-based Convolutional Neural Network (CNN) detectors such as Faster
R-CNN or R-FCN have already shown promising results for object detection by
combining the region proposal subnetwork and the classification subnetwork
together. Although R-FCN has achieved higher detection speed while keeping the
detection performance, the global structure information is ignored by the
position-sensitive score maps. To fully explore the local and global
properties, in this paper, we propose a novel fully convolutional network,
named CoupleNet, to couple the global structure with local parts for object
detection. Specifically, the object proposals obtained by the Region Proposal
Network (RPN) are fed into the coupling module, which consists of two
branches. One branch adopts the position-sensitive RoI (PSRoI) pooling to
capture the local part information of the object, while the other employs the
RoI pooling to encode the global and context information. Next, we design
different coupling strategies and normalization methods to make full use of the
complementary advantages of the global and local branches. Extensive
experiments demonstrate the effectiveness of our approach. We achieve
state-of-the-art results on all three challenging datasets, i.e. a mAP of 82.7%
on VOC07, 80.4% on VOC12, and 34.4% on COCO. Code will be made publicly
available.
Comment: Accepted by ICCV 201
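The coupling step described in this abstract can be illustrated with a minimal sketch. The abstract does not state which coupling strategies or normalization methods were used, so the L2 normalization and the `sum`, `product`, and `max` strategies below are assumptions. The function names and the per-class score vectors are hypothetical stand-ins for the outputs of the PSRoI-pooling (local) and RoI-pooling (global) branches.

```python
import math


def l2_normalize(scores, eps=1e-12):
    """Scale a score vector to unit L2 norm (assumed normalization choice)."""
    norm = math.sqrt(sum(s * s for s in scores))
    return [s / (norm + eps) for s in scores]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def couple(local_scores, global_scores, strategy="sum"):
    """Fuse per-class scores from a local and a global branch for one proposal.

    Both inputs are hypothetical per-class score vectors; the strategies
    offered here are illustrative assumptions, not the paper's exact design.
    """
    if len(local_scores) != len(global_scores):
        raise ValueError("both branches must score the same classes")
    loc = l2_normalize(local_scores)
    glo = l2_normalize(global_scores)
    if strategy == "sum":
        fused = [a + b for a, b in zip(loc, glo)]
    elif strategy == "product":
        fused = [a * b for a, b in zip(loc, glo)]
    elif strategy == "max":
        fused = [max(a, b) for a, b in zip(loc, glo)]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return softmax(fused)


# Example: three classes scored by each branch for a single proposal.
probs = couple([2.0, 0.5, 0.1], [1.5, 0.2, 0.3], strategy="sum")
```

Normalizing each branch before fusing keeps one branch from dominating purely because its raw scores have a larger magnitude.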
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold-weapon era. This paper extensively reviews 400+
papers on object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and the recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publication
Pedestrian Attribute Recognition: A Survey
Recognizing pedestrian attributes is an important task in the computer vision
community because it plays an important role in video surveillance. Many
algorithms have been proposed to handle this task. The goal of this paper is to
review existing works using traditional methods or based on deep learning
networks. Firstly, we introduce the background of pedestrian attribute
recognition (PAR, for short), including the fundamental concepts of pedestrian
attributes and corresponding challenges. Secondly, we introduce existing
benchmarks, including popular datasets and evaluation criteria. Thirdly, we
analyse the concept of multi-task learning and multi-label learning, and also
explain the relations between these two learning algorithms and pedestrian
attribute recognition. We also review some popular network architectures which
have been widely applied in the deep learning community. Fourthly, we analyse
popular solutions for this task, such as attribute-group-based and part-based
methods, etc. Fifthly, we show some applications which take pedestrian
attributes into consideration and achieve better performance. Finally, we
summarize this paper and give several possible research directions for
pedestrian attribute recognition. The project page of this paper can be found
at the following website:
https://sites.google.com/view/ahu-pedestrianattributes/.
Comment: Check our project page for the high-resolution version of this survey:
https://sites.google.com/view/ahu-pedestrianattributes
Augmenting Deep Learning Performance in an Evidential Multiple Classifier System
The main objective of this work is to study the applicability of ensemble methods in the context of deep learning with limited amounts of labeled data. We exploit an ensemble of neural networks derived using Monte Carlo dropout, along with an ensemble of SVM classifiers which owes its effectiveness to the hand-crafted features used as inputs and to an active learning procedure. In order to leverage each classifier's respective strengths, we combine them in an evidential framework, which specifically models their imprecision and uncertainty. The application we consider in order to illustrate the interest of our Multiple Classifier System is pedestrian detection in high-density crowds, which is ideally suited because of its difficulty, cost of labeling and intrinsic imprecision of annotation data. We show that the fusion resulting from the effective modeling of uncertainty allows for performance improvement, and at the same time, for a deeper interpretation of the result in terms of commitment of the decision.
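As a rough illustration of the Monte Carlo dropout ensemble mentioned in this abstract, the sketch below keeps dropout active at prediction time and treats repeated stochastic forward passes as ensemble members. The single linear unit, the function names, and all parameter values are hypothetical and chosen only to keep the example self-contained. The SVM ensemble and the evidential combination are not shown.

```python
import random
import statistics


def stochastic_forward(features, weights, drop_prob, rng):
    """One forward pass of a toy linear unit with dropout left switched on."""
    keep = 1.0 - drop_prob
    total = 0.0
    for x, w in zip(features, weights):
        if rng.random() < keep:
            # Inverted-dropout scaling keeps the expected output unchanged.
            total += (x / keep) * w
    return total


def mc_dropout_predict(features, weights, drop_prob=0.5, passes=200, seed=0):
    """Return the mean and spread of repeated stochastic forward passes.

    The mean acts as the ensemble prediction; the standard deviation is a
    simple proxy for the model's uncertainty about that prediction.
    """
    rng = random.Random(seed)
    samples = [
        stochastic_forward(features, weights, drop_prob, rng)
        for _ in range(passes)
    ]
    return statistics.mean(samples), statistics.pstdev(samples)


# Example: hypothetical features and weights for one input.
mean, spread = mc_dropout_predict([1.0, 2.0, 3.0], [0.5, -0.25, 0.1])
```

With `drop_prob=0.0` every pass is identical, so the spread collapses to zero. A nonzero spread appears only when dropout actually perturbs the network.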
Uncertainty Estimation in One-Stage Object Detection
Environment perception is the task for intelligent vehicles on which all
subsequent steps rely. A key part of perception is to safely detect other road
users such as vehicles, pedestrians, and cyclists. With modern deep learning
techniques, huge progress has been made in this field in recent years. However,
such deep-learning-based object detection models cannot predict how certain
they are in their predictions, potentially hampering the performance of later
steps such as tracking or sensor fusion. We present a viable approach to
estimating uncertainty in a one-stage object detector, while improving the
detection performance of the baseline approach. The proposed model is evaluated
on a large scale automotive pedestrian dataset. Experimental results show that
the uncertainty output by our system is coupled with detection accuracy and
the occlusion level of pedestrians.