Accurate Single Stage Detector Using Recurrent Rolling Convolution
Most recent successful methods for accurate object detection and
localization use some variant of R-CNN-style two-stage Convolutional Neural
Networks (CNNs), in which plausible regions are proposed in the first stage and
a second stage then refines the decision. Despite their simplicity of
training and efficiency in deployment, single-stage detection methods
have not been as competitive on benchmarks that consider mAP at high
IoU thresholds. In this paper, we proposed a novel single-stage end-to-end
trainable object detection network to overcome this limitation. We achieved
this by introducing Recurrent Rolling Convolution (RRC) architecture over
multi-scale feature maps to construct object classifiers and bounding box
regressors which are "deep in context". We evaluated our method on the
challenging KITTI dataset, which measures methods under an IoU threshold of 0.7. We
showed that with RRC, a single reduced VGG-16 based model already significantly
outperformed all previously published results. At the time this paper was
written, our models ranked first in KITTI car detection (the hard level),
first in cyclist detection, and second in pedestrian detection. These
results had not been reached by previous single-stage methods. The code is
publicly available.
Comment: CVPR 201
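As a rough illustration of what evaluating "under an IoU threshold of 0.7" means, the sketch below computes intersection over union for two axis-aligned boxes and checks it against a threshold. The function names and the `(x1, y1, x2, y2)` box format are assumptions made for this example; this is not the KITTI evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) tuples with x1 < x2 and y1 < y2
    (an assumed format, chosen for this sketch).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region; zero if the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.7):
    """A prediction counts as a match only if its IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold
```

A 10x10 prediction shifted by 2 pixels against a 10x10 ground-truth box has an IoU of 80/120, which passes a 0.5 threshold but fails at 0.7. This shows why high IoU thresholds reward precise localization.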
SINet: A Scale-insensitive Convolutional Neural Network for Fast Vehicle Detection
Vision-based vehicle detection approaches have achieved incredible success in
recent years with the development of deep convolutional neural networks (CNNs).
However, existing CNN-based algorithms suffer from the problem that
convolutional features are scale-sensitive in the object detection task, while
traffic images and videos commonly contain vehicles with a large variance of
scales. In this paper, we delve into the source of scale sensitivity, and
reveal two key issues: 1) existing RoI pooling destroys the structure of small
scale objects, 2) the large intra-class distance for a large variance of scales
exceeds the representation capability of a single network. Based on these
findings, we present a scale-insensitive convolutional neural network (SINet)
for fast detection of vehicles with a large variance of scales. First, we present
a context-aware RoI pooling to maintain the contextual information and original
structure of small scale objects. Second, we present a multi-branch decision
network to minimize the intra-class distance of features. These lightweight
techniques add no extra time complexity yet yield a prominent improvement in
detection accuracy. The proposed techniques can be added to any deep network
architecture while keeping it end-to-end trainable. Our SINet achieves
state-of-the-art performance in terms of accuracy and speed (up to 37 FPS) on
the KITTI benchmark and a new highway dataset, which contains a large variance
of scales and extremely small objects.
Comment: Accepted by IEEE Transactions on Intelligent Transportation Systems (T-ITS
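The abstract does not say how proposals are assigned to the branches of the multi-branch decision network. One plausible reading, assumed here purely for illustration, is that each RoI is routed by its size so that each branch only sees objects of similar scale. The function names and the 40-pixel threshold below are hypothetical and are not taken from the paper.

```python
def assign_branch(roi, size_threshold=40.0):
    """Pick a branch name for one RoI given as (x1, y1, x2, y2).

    Routing by the longer side and the default threshold are assumptions
    made for this sketch, not values reported by the authors.
    """
    x1, y1, x2, y2 = roi
    longer_side = max(x2 - x1, y2 - y1)
    return "small" if longer_side < size_threshold else "large"


def split_rois(rois, size_threshold=40.0):
    """Group RoIs by branch so each branch handles a narrower scale range."""
    branches = {"small": [], "large": []}
    for roi in rois:
        branches[assign_branch(roi, size_threshold)].append(roi)
    return branches
```

Under this assumption, each branch has to model a smaller range of object scales, which is one way the intra-class distance of the features it sees could be reduced.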
Learning Complexity-Aware Cascades for Deep Pedestrian Detection
The design of complexity-aware cascaded detectors, combining features of very
different complexities, is considered. A new cascade design procedure is
introduced, by formulating cascade learning as the Lagrangian optimization of a
risk that accounts for both accuracy and complexity. A boosting algorithm,
denoted as complexity aware cascade training (CompACT), is then derived to
solve this optimization. CompACT cascades are shown to seek an optimal
trade-off between accuracy and complexity by pushing features of higher
complexity to the later cascade stages, where only a few difficult candidate
patches remain to be classified. This enables the use of features of vastly
different complexities in a single detector. As a result, the feature pool can be
expanded to features previously impractical for cascade design, such as the
responses of a deep convolutional neural network (CNN). This is demonstrated
through the design of a pedestrian detector with a pool of features whose
complexities span orders of magnitude. The resulting cascade generalizes the
combination of a CNN with an object proposal mechanism: rather than a
pre-processing stage, CompACT cascades seamlessly integrate CNNs in their
stages. This enables state-of-the-art performance on the Caltech and KITTI
datasets, at fairly fast speeds.
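The benefit of pushing expensive features to later stages can be seen in a minimal sketch of cascade inference with early rejection. This covers only the run-time behavior of a generic cascade; it is not the CompACT training procedure, and the stage format, costs, and thresholds are hypothetical values chosen for the example.

```python
def run_cascade(patch, stages):
    """Evaluate a candidate patch stage by stage, rejecting as early as possible.

    `stages` is a list of (feature_fn, cost, reject_below) tuples, ordered
    from cheapest to most expensive (an assumed format for this sketch).
    Returns (accepted, cost_spent).
    """
    score = 0.0
    cost_spent = 0.0
    for feature_fn, cost, reject_below in stages:
        score += feature_fn(patch)   # accumulate the stage response
        cost_spent += cost           # pay for the feature just computed
        if score < reject_below:     # easy negatives stop here
            return False, cost_spent
    return True, cost_spent


# Toy stages: a cheap feature first, then a costly one standing in for a CNN.
stages = [
    (lambda p: p["cheap"], 1.0, 0.0),
    (lambda p: p["deep"], 100.0, 1.0),
]
easy_negative = {"cheap": -1.0, "deep": 5.0}
hard_candidate = {"cheap": 0.5, "deep": 2.0}
```

In this toy setup the easy negative is rejected after the cheap stage and never pays for the expensive feature, while the hard candidate passes through both stages. Only the few patches that survive the early stages incur the high cost.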