Towards Accurate One-Stage Object Detection with AP-Loss
One-stage object detectors are trained by optimizing classification-loss and
localization-loss simultaneously, with the former suffering heavily from the extreme
foreground-background class imbalance caused by the large number of anchors.
This paper alleviates this issue by proposing a novel framework to replace the
classification task in one-stage detectors with a ranking task, and adopting
the Average-Precision loss (AP-loss) for the ranking problem. Due to its
non-differentiability and non-convexity, the AP-loss cannot be optimized
directly. To address this, we develop a novel optimization algorithm that
seamlessly combines the error-driven update scheme of perceptron learning with
the backpropagation algorithm of deep networks. We verify the good convergence
properties of the proposed algorithm theoretically and empirically. Experimental results
demonstrate notable performance improvement in state-of-the-art one-stage
detectors based on AP-loss over different kinds of classification-losses on
various benchmarks, without changing the network architectures. Code is
available at https://github.com/cccorn/AP-loss.
Comment: 13 pages, 7 figures, 4 tables, main paper + supplementary material,
accepted to CVPR 201
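To make the ranking view in the abstract above concrete, the following is a minimal sketch of how average precision can be computed from detection scores, with the loss taken as one minus that value. It only evaluates the quantity; it does not reproduce the paper's error-driven optimization algorithm. The function names `average_precision` and `ap_loss` and the toy scores are assumptions made for illustration.

```python
def average_precision(scores, labels):
    """AP of a ranking: mean of precision@rank over the positive items.

    scores: list of floats (higher means ranked earlier)
    labels: list of 0/1 ints, 1 = foreground, 0 = background
    """
    # Rank indices by descending score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precisions = []
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precisions.append(hits / rank)  # precision at this positive's rank
    return sum(precisions) / len(precisions) if precisions else 0.0


def ap_loss(scores, labels):
    """Illustrative loss value: 1 - AP (hypothetical helper, evaluation only)."""
    return 1.0 - average_precision(scores, labels)


# Toy example: two positives, two negatives.
scores = [0.9, 0.8, 0.7, 0.1]
labels = [1, 0, 1, 0]
base = average_precision(scores, labels)

# Appending many low-scored negatives leaves AP unchanged, because they
# rank below every positive and so never affect precision at a positive.
padded = average_precision(scores + [0.0] * 1000, labels + [0] * 1000)
```

In this toy case `base` and `padded` are equal, which is one way to see why a ranking objective is less sensitive to a large pool of easy background anchors than a per-anchor classification loss.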
Cascade R-CNN: Delving into High Quality Object Detection
In object detection, an intersection over union (IoU) threshold is required
to define positives and negatives. An object detector trained with a low IoU
threshold, e.g. 0.5, usually produces noisy detections. However, detection
performance tends to degrade as the IoU threshold increases. Two main
factors are responsible for this: 1) overfitting during training, due to
exponentially vanishing positive samples, and 2) inference-time mismatch
between the IoUs for which the detector is optimal and those of the input
hypotheses. A multi-stage object detection architecture, the Cascade R-CNN, is
proposed to address these problems. It consists of a sequence of detectors
trained with increasing IoU thresholds, to be sequentially more selective
against close false positives. The detectors are trained stage by stage,
leveraging the observation that the output of a detector is a good distribution
for training the next higher quality detector. The resampling of progressively
improved hypotheses guarantees that all detectors have a positive set of
examples of equivalent size, reducing the overfitting problem. The same cascade
procedure is applied at inference, enabling a closer match between the
hypotheses and the detector quality of each stage. A simple implementation of
the Cascade R-CNN is shown to surpass all single-model object detectors on the
challenging COCO dataset. Experiments also show that the Cascade R-CNN is
widely applicable across detector architectures, achieving consistent gains
independently of the baseline detector strength. The code will be made
available at https://github.com/zhaoweicai/cascade-rcnn
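The role of the IoU threshold described in the abstract above can be sketched as follows: the same set of proposals yields fewer positives as the threshold rises. This is an illustration only, not the Cascade R-CNN implementation. The box format `(x1, y1, x2, y2)`, the helper names, the toy boxes, and the thresholds 0.6 and 0.7 are assumptions; only 0.5 is mentioned in the text.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_positives(proposals, gt_box, threshold):
    """Number of proposals labeled positive at a given IoU threshold."""
    return sum(1 for p in proposals if iou(p, gt_box) >= threshold)


# One ground-truth box and proposals shifted progressively to the right.
gt = (0, 0, 10, 10)
proposals = [
    (0, 0, 10, 10),
    (1, 0, 11, 10),
    (2, 0, 12, 10),
    (3, 0, 13, 10),
    (5, 0, 15, 10),
]

# Positives per (hypothetical) stage threshold.
counts = {t: count_positives(proposals, gt, t) for t in (0.5, 0.6, 0.7)}
```

With these toy boxes the positive count shrinks at each higher threshold, which mirrors the vanishing-positives problem the abstract attributes to training a single detector at a high IoU threshold.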
Single-Shot Refinement Neural Network for Object Detection
For object detection, the two-stage approach (e.g., Faster R-CNN) has been
achieving the highest accuracy, whereas the one-stage approach (e.g., SSD) has
the advantage of high efficiency. To inherit the merits of both while
overcoming their disadvantages, in this paper, we propose a novel single-shot
based detector, called RefineDet, that achieves better accuracy than two-stage
methods and maintains efficiency comparable to one-stage methods. RefineDet
consists of two inter-connected modules, namely, the anchor refinement module
and the object detection module. Specifically, the former aims to (1) filter
out negative anchors to reduce search space for the classifier, and (2)
coarsely adjust the locations and sizes of anchors to provide better
initialization for the subsequent regressor. The latter module takes the
refined anchors as the input from the former to further improve the regression
and predict multi-class labels. Meanwhile, we design a transfer connection block
to transfer the features in the anchor refinement module to predict locations,
sizes and class labels of objects in the object detection module. The
multi-task loss function enables us to train the whole network in an end-to-end
way. Extensive experiments on PASCAL VOC 2007, PASCAL VOC 2012, and MS COCO
demonstrate that RefineDet achieves state-of-the-art detection accuracy with
high efficiency. Code is available at https://github.com/sfzhang15/RefineDet
Comment: 14 pages, 7 figures, 7 table
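The two jobs of the anchor refinement module described above (dropping confident negatives, then coarsely adjusting the remaining anchors) can be sketched in a few lines. This is not the RefineDet code: the function name `refine_anchors`, the `(cx, cy, w, h)` anchor format, the center/size offset decoding, and the default `neg_threshold` value are all assumptions chosen for illustration.

```python
import math


def refine_anchors(anchors, neg_conf, offsets, neg_threshold=0.99):
    """Filter likely-background anchors and coarsely adjust the rest.

    anchors:  list of (cx, cy, w, h)
    neg_conf: list of background probabilities, one per anchor
    offsets:  list of predicted (dx, dy, dw, dh), one per anchor
    neg_threshold: hypothetical cutoff; anchors above it are discarded
    """
    refined = []
    for (cx, cy, w, h), p_neg, (dx, dy, dw, dh) in zip(anchors, neg_conf, offsets):
        if p_neg > neg_threshold:
            continue  # (1) filter out confident negatives
        # (2) coarse adjustment: shift the center, rescale the size
        refined.append((cx + dx * w, cy + dy * h, w * math.exp(dw), h * math.exp(dh)))
    return refined


# Two anchors: the second is almost certainly background and is dropped.
anchors = [(10.0, 10.0, 4.0, 4.0), (20.0, 20.0, 4.0, 4.0)]
neg_conf = [0.2, 0.999]
offsets = [(0.5, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0)]
refined = refine_anchors(anchors, neg_conf, offsets)
```

The surviving, adjusted anchors would then serve as the input to a second module that regresses boxes and predicts class labels, as the abstract describes.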