Selective Refinement Network for High Performance Face Detection
High performance face detection remains a very challenging problem,
especially when there are many tiny faces. This paper presents a novel
single-shot face detector, named Selective Refinement Network (SRN), which
introduces novel two-step classification and regression operations selectively
into an anchor-based face detector to reduce false positives and improve
location accuracy simultaneously. In particular, the SRN consists of two
modules: the Selective Two-step Classification (STC) module and the Selective
Two-step Regression (STR) module. The STC aims to filter out most simple
negative anchors from low level detection layers to reduce the search space for
the subsequent classifier, while the STR is designed to coarsely adjust the
locations and sizes of anchors from high level detection layers to provide
better initialization for the subsequent regressor. Moreover, we design a
Receptive Field Enhancement (RFE) block to provide more diverse receptive
fields, which helps to better capture faces in some extreme poses. As a
consequence, the proposed SRN detector achieves state-of-the-art performance on
all the widely used face detection benchmarks, including AFW, PASCAL face,
FDDB, and WIDER FACE datasets. Code will be released to facilitate further
studies on the face detection problem.
Comment: The first two authors contributed equally. Corresponding author:
Shifeng Zhang ([email protected])
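The filtering idea behind the STC module can be illustrated with a minimal sketch. The abstract does not specify how "simple negative anchors" are identified, so the threshold `neg_threshold` and the function name below are assumptions made for illustration only, not the authors' implementation.

```python
def selective_two_step_classify(first_scores, second_scores, neg_threshold=0.99):
    """Illustrative two-step classification filter (hypothetical helper).

    first_scores[i]  : first-step face probability of anchor i
    second_scores[i] : second-step face probability of anchor i
    neg_threshold    : assumed cutoff; an anchor is treated as an easy
                       negative when its first-step background probability
                       (1 - face probability) exceeds this value
    Returns (anchor_index, second_step_score) pairs for surviving anchors.
    """
    kept = []
    for index, (p_first, p_second) in enumerate(zip(first_scores, second_scores)):
        if 1.0 - p_first > neg_threshold:
            continue  # easy negative: never reaches the second classifier
        kept.append((index, p_second))
    return kept


survivors = selective_two_step_classify([0.001, 0.5, 0.02], [0.9, 0.8, 0.1])
```

Only the anchors that survive the first step are scored by the second classifier, which is what shrinks its search space.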
Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning
Object category localization is a challenging problem in computer vision.
Standard supervised training requires bounding box annotations of object
instances. This time-consuming annotation process is sidestepped in weakly
supervised learning. In this case, the supervised information is restricted to
binary labels that indicate the absence/presence of object instances in the
image, without their locations. We follow a multiple-instance learning approach
that iteratively trains the detector and infers the object locations in the
positive training images. Our main contribution is a multi-fold multiple
instance learning procedure, which prevents training from prematurely locking
onto erroneous object locations. This procedure is particularly important when
using high-dimensional representations, such as Fisher vectors and
convolutional neural network features. We also propose a window refinement
method, which improves the localization accuracy by incorporating an objectness
prior. We present a detailed experimental evaluation using the PASCAL VOC 2007
dataset, which verifies the effectiveness of our approach.
Comment: To appear in IEEE Transactions on Pattern Analysis and Machine
Intelligence (TPAMI)
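One round of the multi-fold re-localization described in this abstract might look like the sketch below. It assumes that each positive image comes with a list of candidate window features, that a detector trained on the other folds scores the held-out fold, and that there are at least as many positive images as folds. The toy 1-D trainer `train_mean_difference` and all names are hypothetical stand-ins, not the paper's detector or features.

```python
def multifold_relocalize(candidates, selected, negatives, train, num_folds=2):
    """One multi-fold re-localization round (illustrative sketch).

    candidates[i] : candidate window features for positive image i
    selected[i]   : index of the window currently chosen for image i
    negatives     : features of windows from negative images
    train         : callable (pos_feats, neg_feats) -> scoring function
    """
    n = len(candidates)
    new_selected = list(selected)
    for fold in range(num_folds):
        held_out = [i for i in range(n) if i % num_folds == fold]
        train_ids = [i for i in range(n) if i % num_folds != fold]
        # Train only on images outside the held-out fold, so the detector
        # cannot simply re-confirm the windows it was trained on.
        positives = [candidates[i][selected[i]] for i in train_ids]
        score = train(positives, negatives)
        for i in held_out:
            scores = [score(window) for window in candidates[i]]
            new_selected[i] = max(range(len(scores)), key=scores.__getitem__)
    return new_selected


def train_mean_difference(pos, neg):
    """Toy 1-D 'detector': scores a feature along the positive-negative axis."""
    mu_pos = sum(pos) / len(pos)
    mu_neg = sum(neg) / len(neg)
    return lambda x: (x - mu_neg) * (mu_pos - mu_neg)


candidates = [[0.1, 0.9], [0.8, 0.2], [0.3, 1.0], [0.95, 0.0]]
initial = [0, 0, 1, 0]  # image 0 starts on a poor window
updated = multifold_relocalize(candidates, initial, [0.0, 0.1], train_mean_difference)
```

In this toy run, image 0 is re-localized by a detector that never saw its initial (poor) window, which is the mechanism the abstract credits with avoiding premature lock-in.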
Single-Shot Refinement Neural Network for Object Detection
For object detection, the two-stage approach (e.g., Faster R-CNN) has been
achieving the highest accuracy, whereas the one-stage approach (e.g., SSD) has
the advantage of high efficiency. To inherit the merits of both while
overcoming their disadvantages, in this paper, we propose a novel single-shot
based detector, called RefineDet, that achieves better accuracy than two-stage
methods while maintaining efficiency comparable to one-stage methods. RefineDet
consists of two inter-connected modules, namely, the anchor refinement module
and the object detection module. Specifically, the former aims to (1) filter
out negative anchors to reduce search space for the classifier, and (2)
coarsely adjust the locations and sizes of anchors to provide better
initialization for the subsequent regressor. The latter module takes the
refined anchors as the input from the former to further improve the regression
and predict multi-class labels. Meanwhile, we design a transfer connection block
to transfer the features in the anchor refinement module to predict locations,
sizes and class labels of objects in the object detection module. The
multi-task loss function enables us to train the whole network in an end-to-end
way. Extensive experiments on PASCAL VOC 2007, PASCAL VOC 2012, and MS COCO
demonstrate that RefineDet achieves state-of-the-art detection accuracy with
high efficiency. Code is available at https://github.com/sfzhang15/RefineDet
Comment: 14 pages, 7 figures, 7 tables
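The two-step regression described in this abstract (a coarse anchor adjustment followed by a second regression from the refined anchor) can be sketched as follows. The center-size box encoding with exponential scale offsets is an assumption borrowed from common anchor-based detectors, and the function names are hypothetical; the abstract does not state the exact parameterization.

```python
import math


def decode(anchor, offsets):
    """Apply predicted offsets to a box given as (cx, cy, w, h).

    Assumed encoding: center shifts are scaled by the box size,
    and width/height are scaled by exp of the predicted log-ratio.
    """
    cx, cy, w, h = anchor
    dx, dy, dw, dh = offsets
    return (cx + dx * w, cy + dy * h, w * math.exp(dw), h * math.exp(dh))


def two_step_regress(anchor, first_offsets, second_offsets):
    """First step coarsely adjusts the anchor; second step refines the result."""
    refined_anchor = decode(anchor, first_offsets)
    return decode(refined_anchor, second_offsets)


box = two_step_regress((10.0, 10.0, 4.0, 4.0),
                       (0.5, 0.0, 0.0, 0.0),
                       (0.25, 0.5, 0.0, 0.0))
```

Because the second set of offsets is interpreted relative to the refined anchor rather than the original one, the second regressor starts from a better initialization.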