Receptive Field Block Net for Accurate and Fast Object Detection
Current top-performing object detectors depend on deep CNN backbones, such as
ResNet-101 and Inception, benefiting from their powerful feature
representations but suffering from high computational costs. Conversely, some
lightweight-model-based detectors achieve real-time processing, but their
accuracy is often criticized. In this paper, we explore an alternative to
build a fast and accurate detector by strengthening lightweight features using
a hand-crafted mechanism. Inspired by the structure of Receptive Fields (RFs)
in human visual systems, we propose a novel RF Block (RFB) module, which takes
the relationship between the size and eccentricity of RFs into account, to
enhance the feature discriminability and robustness. We further assemble RFB
on top of SSD, constructing the RFB Net detector. To evaluate its
effectiveness, experiments are conducted on two major benchmarks and the
results show that RFB Net is able to reach the performance of advanced very
deep detectors while keeping the real-time speed. Code is available at
https://github.com/ruinmessi/RFBNet.
Comment: Accepted by ECCV 201
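The abstract ties RF size to eccentricity but gives no layer details. As a rough illustration only, the sketch below computes the receptive field of a few parallel branches that each pair a convolution with a dilated convolution. The branch layout in `BRANCHES` and the helper names are assumptions made for this example, not the configuration confirmed by the abstract; the formula itself is the standard one for stacked stride-1 convolutions.

```python
def receptive_field(layers):
    """Receptive field (in pixels) of stacked stride-1 convolutions.

    `layers` is a list of (kernel_size, dilation) pairs. Each layer widens
    the field by (kernel_size - 1) * dilation.
    """
    rf = 1
    for kernel, dilation in layers:
        rf += (kernel - 1) * dilation
    return rf


# Hypothetical multi-branch layout (an assumption for illustration):
# a larger kernel is followed by a more strongly dilated 3x3 convolution,
# so branches with bigger kernels also sample farther from the centre.
BRANCHES = {
    "small": [(1, 1), (3, 1)],
    "medium": [(3, 1), (3, 3)],
    "large": [(5, 1), (3, 5)],
}


def branch_receptive_fields(branches):
    """Map each branch name to its receptive field size."""
    return {name: receptive_field(layers) for name, layers in branches.items()}


if __name__ == "__main__":
    for name, rf in branch_receptive_fields(BRANCHES).items():
        print(f"{name}: {rf}x{rf}")
```

Under these assumed settings the three branches cover 3x3, 9x9 and 15x15 regions, which is one way a single block can mix several receptive field sizes before its outputs are merged.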
CoupleNet: Coupling Global Structure with Local Parts for Object Detection
The region-based Convolutional Neural Network (CNN) detectors such as Faster
R-CNN or R-FCN have already shown promising results for object detection by
combining the region proposal subnetwork and the classification subnetwork
together. Although R-FCN has achieved higher detection speed while keeping the
detection performance, the global structure information is ignored by the
position-sensitive score maps. To fully explore the local and global
properties, in this paper, we propose a novel fully convolutional network,
named CoupleNet, to couple the global structure with local parts for object
detection. Specifically, the object proposals obtained by the Region Proposal
Network (RPN) are fed into the coupling module which consists of two
branches. One branch adopts the position-sensitive RoI (PSRoI) pooling to
capture the local part information of the object, while the other employs the
RoI pooling to encode the global and context information. Next, we design
different coupling strategies and normalization methods to make full use of the
complementary advantages between the global and local branches. Extensive
experiments demonstrate the effectiveness of our approach. We achieve
state-of-the-art results on all three challenging datasets, i.e. a mAP of 82.7%
on VOC07, 80.4% on VOC12, and 34.4% on COCO. Code will be made publicly
available.
Comment: Accepted by ICCV 201
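The coupling step described above can be sketched as follows, assuming each branch has already produced one score per class for a single proposal. The abstract does not say which coupling strategies or normalizations were tried, so the L2 normalization and the `sum`, `product` and `max` options here are assumptions chosen for illustration, and `couple` is a hypothetical helper name.

```python
import math


def l2_normalize(scores, eps=1e-12):
    """Scale a score vector to unit L2 norm (eps guards against division by zero)."""
    norm = math.sqrt(sum(s * s for s in scores))
    return [s / (norm + eps) for s in scores]


def couple(local_scores, global_scores, strategy="sum", normalize=True):
    """Combine per-class scores from the local (PSRoI) and global (RoI) branches.

    Both inputs hold one score per class for the same proposal. The strategy
    names are illustrative assumptions, not taken from the abstract.
    """
    if len(local_scores) != len(global_scores):
        raise ValueError("both branches must score the same set of classes")
    if normalize:
        # Put both branches on a comparable scale before combining them.
        local_scores = l2_normalize(local_scores)
        global_scores = l2_normalize(global_scores)
    pairs = list(zip(local_scores, global_scores))
    if strategy == "sum":
        return [a + b for a, b in pairs]
    if strategy == "product":
        return [a * b for a, b in pairs]
    if strategy == "max":
        return [max(a, b) for a, b in pairs]
    raise ValueError(f"unknown coupling strategy: {strategy}")


if __name__ == "__main__":
    local = [3.0, 4.0]       # made-up scores from the local-part branch
    global_ = [0.0, 5.0]     # made-up scores from the global branch
    print(couple(local, global_, strategy="sum"))
```

Normalizing first keeps one branch from dominating simply because its raw scores have a larger magnitude, which is presumably why the abstract treats normalization as part of the coupling design.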
Speed/accuracy trade-offs for modern convolutional object detectors
The goal of this paper is to serve as a guide for selecting a detection
architecture that achieves the right speed/memory/accuracy balance for a given
application and platform. To this end, we investigate various ways to trade
accuracy for speed and memory usage in modern convolutional object detection
systems. A number of successful systems have been proposed in recent years, but
apples-to-apples comparisons are difficult due to different base feature
extractors (e.g., VGG, Residual Networks), different default image resolutions,
as well as different hardware and software platforms. We present a unified
implementation of the Faster R-CNN [Ren et al., 2015], R-FCN [Dai et al., 2016]
and SSD [Liu et al., 2015] systems, which we view as "meta-architectures" and
trace out the speed/accuracy trade-off curve created by using alternative
feature extractors and varying other critical parameters such as image size
within each of these meta-architectures. On one extreme end of this spectrum
where speed and memory are critical, we present a detector that achieves real
time speeds and can be deployed on a mobile device. On the opposite end in
which accuracy is critical, we present a detector that achieves
state-of-the-art performance measured on the COCO detection task.
Comment: Accepted to CVPR 201
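One simple way to read a speed/accuracy trade-off curve is as a Pareto front: the configurations for which no other configuration is both at least as fast and at least as accurate. The sketch below assumes each configuration is summarized by a latency and a mAP value; the `Config` type, the `pareto_front` helper and all numbers are hypothetical placeholders, not results from the paper.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    """One detector configuration (meta-architecture + feature extractor + image size)."""
    name: str
    latency_ms: float
    map_score: float


def pareto_front(configs: List[Config]) -> List[Config]:
    """Return the configurations not dominated in (lower latency, higher mAP)."""
    front = []
    best_map = float("-inf")
    # Visit fastest first; on equal latency, the more accurate one comes first.
    for cfg in sorted(configs, key=lambda c: (c.latency_ms, -c.map_score)):
        # Keep a config only if it beats every faster (or equally fast) one on mAP.
        if cfg.map_score > best_map:
            front.append(cfg)
            best_map = cfg.map_score
    return front


if __name__ == "__main__":
    # Placeholder measurements, invented purely to exercise the function.
    candidates = [
        Config("A", 10.0, 20.0),
        Config("B", 20.0, 30.0),
        Config("C", 25.0, 25.0),
        Config("D", 50.0, 35.0),
    ]
    for cfg in pareto_front(candidates):
        print(cfg.name, cfg.latency_ms, cfg.map_score)
```

With these placeholder values, configuration C is dropped because B is both faster and more accurate, while A, B and D remain as the candidate operating points.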