55,048 research outputs found
Zero-Annotation Object Detection with Web Knowledge Transfer
Object detection is one of the major problems in computer vision, and has
been extensively studied. Most of the existing detection works rely on
labor-intensive supervision, such as ground truth bounding boxes of objects or
at least image-level annotations. On the contrary, we propose an object
detection method that does not require any form of human annotation on target
tasks, by exploiting freely available web images. In order to facilitate
effective knowledge transfer from web images, we introduce a multi-instance
multi-label domain adaption learning framework with two key innovations. First
of all, we propose an instance-level adversarial domain adaptation network with
attention on foreground objects to transfer the object appearances from web
domain to target domain. Second, to preserve the class-specific semantic
structure of transferred object features, we propose a simultaneous transfer
mechanism to transfer the supervision across domains through pseudo strong
label generation. With our end-to-end framework that simultaneously learns a
weakly supervised detector and transfers knowledge across domains, we achieved
significant improvements over baseline methods on the benchmark datasets.Comment: Accepted in ECCV 201
Object Detection in 20 Years: A Survey
Object detection, as of one the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed up techniques, and the recent state of the art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc, and makes an in-deep
analysis of their challenges as well as technical improvements in recent years.Comment: This work has been submitted to the IEEE TPAMI for possible
publicatio
Attend Refine Repeat: Active Box Proposal Generation via In-Out Localization
The problem of computing category agnostic bounding box proposals is utilized
as a core component in many computer vision tasks and thus has lately attracted
a lot of attention. In this work we propose a new approach to tackle this
problem that is based on an active strategy for generating box proposals that
starts from a set of seed boxes, which are uniformly distributed on the image,
and then progressively moves its attention on the promising image areas where
it is more likely to discover well localized bounding box proposals. We call
our approach AttractioNet and a core component of it is a CNN-based category
agnostic object location refinement module that is capable of yielding accurate
and robust bounding box predictions regardless of the object category.
We extensively evaluate our AttractioNet approach on several image datasets
(i.e. COCO, PASCAL, ImageNet detection and NYU-Depth V2 datasets) reporting on
all of them state-of-the-art results that surpass the previous work in the
field by a significant margin and also providing strong empirical evidence that
our approach is capable to generalize to unseen categories. Furthermore, we
evaluate our AttractioNet proposals in the context of the object detection task
using a VGG16-Net based detector and the achieved detection performance on COCO
manages to significantly surpass all other VGG16-Net based detectors while even
being competitive with a heavily tuned ResNet-101 based detector. Code as well
as box proposals computed for several datasets are available at::
https://github.com/gidariss/AttractioNet.Comment: Technical report. Code as well as box proposals computed for several
datasets are available at:: https://github.com/gidariss/AttractioNe
Pedestrian Attribute Recognition: A Survey
Recognizing pedestrian attributes is an important task in computer vision
community due to it plays an important role in video surveillance. Many
algorithms has been proposed to handle this task. The goal of this paper is to
review existing works using traditional methods or based on deep learning
networks. Firstly, we introduce the background of pedestrian attributes
recognition (PAR, for short), including the fundamental concepts of pedestrian
attributes and corresponding challenges. Secondly, we introduce existing
benchmarks, including popular datasets and evaluation criterion. Thirdly, we
analyse the concept of multi-task learning and multi-label learning, and also
explain the relations between these two learning algorithms and pedestrian
attribute recognition. We also review some popular network architectures which
have widely applied in the deep learning community. Fourthly, we analyse
popular solutions for this task, such as attributes group, part-based,
\emph{etc}. Fifthly, we shown some applications which takes pedestrian
attributes into consideration and achieve better performance. Finally, we
summarized this paper and give several possible research directions for
pedestrian attributes recognition. The project page of this paper can be found
from the following website:
\url{https://sites.google.com/view/ahu-pedestrianattributes/}.Comment: Check our project page for High Resolution version of this survey:
https://sites.google.com/view/ahu-pedestrianattributes
- …