Dilation-Erosion for Single-Frame Supervised Temporal Action Localization
To balance the annotation labor and the granularity of supervision,
single-frame annotation has been introduced in temporal action localization. It
provides a rough temporal location for an action but implicitly overstates the
supervision from the annotated frame during training, leading to confusion
between actions and backgrounds, i.e., action incompleteness and background
false positives. To tackle the two challenges, in this work, we present the
Snippet Classification model and the Dilation-Erosion module. In the
Dilation-Erosion module, we expand the potential action segments with a loose
criterion to alleviate the problem of action incompleteness and then remove the
background from the potential action segments to alleviate the problem of
background false positives. Relying on the single-frame annotation and the output of
the snippet classification, the Dilation-Erosion module mines pseudo
snippet-level ground-truth, hard backgrounds and evident backgrounds, which in
turn further train the Snippet Classification model. This forms a cyclic
dependency. Furthermore, we propose a new embedding loss to aggregate the
features of action instances with the same label and separate the features of
actions from backgrounds. Experiments on THUMOS14 and ActivityNet 1.2 validate
the effectiveness of the proposed method. Code has been made publicly available
(https://github.com/LingJun123/single-frame-TAL).
Comment: 28 pages, 8 figures
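The expand-then-trim idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `dilate_then_erode`, the two thresholds `loose` and `strict`, and the use of per-snippet action scores are all assumptions made for illustration. The sketch grows a segment outward from the annotated frame under a loose criterion, then trims boundary snippets that fail a stricter one.

```python
from typing import Sequence, Tuple


def dilate_then_erode(
    scores: Sequence[float],
    anchor: int,
    loose: float = 0.3,   # hypothetical loose threshold used for dilation
    strict: float = 0.6,  # hypothetical strict threshold used for erosion
) -> Tuple[int, int]:
    """Return an inclusive (start, end) snippet range around `anchor`.

    `scores` are assumed per-snippet action scores in [0, 1], and `anchor`
    is the index of the single annotated frame's snippet.
    """
    n = len(scores)
    start = end = anchor

    # Dilation: expand outward while neighbours pass the loose criterion,
    # which counters action incompleteness.
    while start > 0 and scores[start - 1] >= loose:
        start -= 1
    while end < n - 1 and scores[end + 1] >= loose:
        end += 1

    # Erosion: trim boundary snippets that fail the strict criterion,
    # which removes likely background. The anchor itself is never trimmed.
    while start < anchor and scores[start] < strict:
        start += 1
    while end > anchor and scores[end] < strict:
        end -= 1

    return start, end


if __name__ == "__main__":
    snippet_scores = [0.1, 0.4, 0.7, 0.9, 0.8, 0.35, 0.2]
    print(dilate_then_erode(snippet_scores, anchor=3))
```

In this toy setting the snippets kept after erosion could serve as pseudo action labels, while those added by dilation but trimmed by erosion are candidates for hard backgrounds; how the paper actually mines these sets is not specified in the abstract.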
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers on object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and the recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible publication
A Survey of Deep Learning-Based Object Detection
Object detection is one of the most important and challenging branches of
computer vision, which has been widely applied in people's lives, such as
security monitoring, autonomous driving and so on, with the purpose of locating
instances of semantic objects of a certain class. With the rapid development of
deep learning networks for detection tasks, the performance of object detectors
has been greatly improved. To understand the main development status
of the object detection pipeline thoroughly, in this survey we first
analyze the methods of existing typical detection models and describe the
benchmark datasets. Afterwards, and primarily, we provide a comprehensive
overview of a variety of object detection methods in a systematic manner,
covering the one-stage and two-stage detectors. Moreover, we list the
traditional and new applications. Some representative branches of object
detection are analyzed as well. Finally, we discuss architectures for
exploiting these object detection methods to build an effective and efficient
system and point out a set of development trends to better follow the
state-of-the-art algorithms and further research.
Comment: 30 pages, 12 figures