What Can Help Pedestrian Detection?
Aggregating extra features has been considered as an effective approach to
boost traditional pedestrian detection methods. However, there is still a lack
of studies on whether and how CNN-based pedestrian detectors can benefit from
these extra features. The first contribution of this paper is exploring this
issue by aggregating extra features into a CNN-based pedestrian detection
framework. Through extensive experiments, we evaluate the effects of different
kinds of extra features quantitatively. Moreover, we propose a novel network
architecture, namely HyperLearner, to jointly learn pedestrian detection as
well as the given extra feature. By multi-task training, HyperLearner is able
to utilize the information of given features and improve detection performance
without extra inputs in inference. The experimental results on multiple
pedestrian benchmarks validate the effectiveness of the proposed HyperLearner.
Comment: Accepted to IEEE International Conference on Computer Vision and
Pattern Recognition (CVPR) 201
Towards Distributed OPF using ALADIN
The present paper discusses the application of the recently proposed
Augmented Lagrangian Alternating Direction Inexact Newton (ALADIN) method to
non-convex AC Optimal Power Flow Problems (OPF) in a distributed fashion. In
contrast to the often used Alternating Direction Method of Multipliers (ADMM),
ALADIN guarantees locally quadratic convergence for AC OPF. Numerical results
for 5 to 300 bus test cases indicate that ALADIN is able to outperform ADMM and
to reduce the number of iterations by about one order of magnitude. We compare
ALADIN to numerical results for ADMM documented in the literature. The improved
convergence speed comes at the cost of increasing the communication effort per
iteration. Therefore, we propose a variant of ALADIN that uses inexact Hessians
to reduce communication. Additionally, we provide a detailed comparison of
these ALADIN variants to ADMM from an algorithmic and communication
perspective. Moreover, we prove that ALADIN converges locally at quadratic rate
even for the relevant case of suboptimally solved local NLPs.
FoveaBox: Beyond Anchor-based Object Detector
We present FoveaBox, an accurate, flexible, and completely anchor-free
framework for object detection. While almost all state-of-the-art object
detectors utilize predefined anchors to enumerate possible locations, scales
and aspect ratios for the search of the objects, their performance and
generalization ability are also limited by the design of anchors. Instead,
FoveaBox directly learns the likelihood that an object exists and the bounding box
coordinates without anchor reference. This is achieved by: (a) predicting
category-sensitive semantic maps for the likelihood that an object exists, and (b)
producing a category-agnostic bounding box for each position that potentially
contains an object. The scales of target boxes are naturally associated with
feature pyramid representations. In FoveaBox, an instance is assigned to
adjacent feature levels to make the model more accurate. We demonstrate its
effectiveness on standard benchmarks and report extensive experimental
analysis. Without bells and whistles, FoveaBox achieves state-of-the-art single
model performance on the standard COCO and Pascal VOC object detection
benchmarks. More importantly, FoveaBox avoids all computation and
hyper-parameters related to anchor boxes, to which the final detection
performance is often sensitive. We believe the simple and effective approach will
serve as a solid baseline and help ease future research for object detection.
The code has been made publicly available at
https://github.com/taokong/FoveaBox.
Comment: IEEE Transactions on Image Processing, code at:
https://github.com/taokong/FoveaBo
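The idea of tying box scale to feature pyramid levels, with one instance allowed on adjacent levels, can be illustrated with a minimal Python sketch. It is not taken from the FoveaBox code: the function name `assign_levels`, the level indices, the nominal size `base_size`, and the overlap factor `eta` are all assumptions chosen for illustration.

```python
import math


def assign_levels(box_w, box_h, levels=(3, 4, 5, 6, 7), base_size=32.0, eta=2.0):
    """Return the pyramid levels whose scale range covers a box.

    Assumption (not from the abstract): level l has a nominal object size of
    base_size * 2**(l - levels[0]), and accepts boxes whose scale sqrt(w * h)
    lies in [nominal / eta, nominal * eta]. With eta > sqrt(2) the ranges of
    neighbouring levels overlap, so a box can be assigned to adjacent levels.
    """
    scale = math.sqrt(box_w * box_h)
    assigned = []
    for level in levels:
        nominal = base_size * 2 ** (level - levels[0])
        if nominal / eta <= scale <= nominal * eta:
            assigned.append(level)
    return assigned


if __name__ == "__main__":
    # A mid-sized box falls into the overlapping range of two levels.
    print(assign_levels(48, 48))
    # A very large box is covered by the coarsest level only.
    print(assign_levels(600, 600))
```

Setting `eta` closer to 1 in this sketch shrinks the overlap, so each box maps to at most one level; larger values spread one instance over more levels.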
Distributed State Estimation for AC Power Systems using Gauss-Newton ALADIN
This paper proposes a structure-exploiting algorithm for solving non-convex
power system state estimation problems in a distributed fashion. Because the
power flow equations in large electrical grid networks are non-convex equality
constraints, we develop a tailored state estimator based on the Augmented
Lagrangian Alternating Direction Inexact Newton (ALADIN) method, which can
handle the nonlinearities efficiently. Here, our focus is on using Gauss-Newton
Hessian approximations within ALADIN in order to arrive at an efficient
(in both computation and communication) variant of ALADIN for network maximum
likelihood estimation problems. Analyzing the IEEE 30-Bus system, we illustrate
how the proposed algorithm can be used to solve highly non-trivial network
state estimation problems. We also compare the method with existing distributed
parameter estimation codes in order to illustrate its performance.
Repulsion Loss: Detecting Pedestrians in a Crowd
Detecting individual pedestrians in a crowd remains a challenging problem
since the pedestrians often gather together and occlude each other in
real-world scenarios. In this paper, we first explore how a state-of-the-art
pedestrian detector is harmed by crowd occlusion via experimentation, providing
insights into the crowd occlusion problem. Then, we propose a novel bounding
box regression loss specifically designed for crowd scenes, termed repulsion
loss. This loss is driven by two motivations: the attraction by target, and the
repulsion by other surrounding objects. The repulsion term prevents the
proposal from shifting to surrounding objects, thus leading to more crowd-robust
localization. Our detector trained with repulsion loss outperforms all the
state-of-the-art methods with a significant improvement in occlusion cases.
Comment: Accepted to IEEE Conference on Computer Vision and Pattern
Recognition (CVPR) 201
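The two motivations named in the abstract, attraction by the target and repulsion by surrounding objects, can be sketched as a two-term box loss. The Python below is only an illustration under stated assumptions: the attraction term is a smooth L1 distance, the repulsion term penalises overlap with non-target ground-truth boxes via -ln(1 - overlap), and the weight `alpha` and all function names are hypothetical. The abstract does not specify these forms.

```python
import math


def intersection_over_gt(pred, gt):
    """Fraction of the ground-truth box covered by the predicted box.

    Boxes are (x1, y1, x2, y2) tuples.
    """
    inter_w = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    inter_h = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    gt_area = max(0.0, gt[2] - gt[0]) * max(0.0, gt[3] - gt[1])
    return (inter_w * inter_h) / gt_area if gt_area > 0 else 0.0


def smooth_l1(x):
    """Smooth L1 penalty: quadratic near zero, linear elsewhere."""
    x = abs(x)
    return 0.5 * x * x if x < 1.0 else x - 0.5


def attraction(pred, target):
    """Pull the prediction toward its own target box (assumed smooth L1)."""
    return sum(smooth_l1(p - t) for p, t in zip(pred, target))


def repulsion(pred, other_gt, eps=1e-6):
    """Push the prediction away from a non-target box (assumed -ln(1 - overlap))."""
    overlap = min(intersection_over_gt(pred, other_gt), 1.0 - eps)
    return -math.log(1.0 - overlap)


def repulsion_style_loss(pred, target, other_gts, alpha=0.5):
    """Attraction plus weighted repulsion; alpha is a hypothetical weight."""
    return attraction(pred, target) + alpha * sum(
        repulsion(pred, gt) for gt in other_gts
    )


if __name__ == "__main__":
    pred = (0.0, 0.0, 10.0, 10.0)
    target = (0.0, 0.0, 10.0, 10.0)
    neighbour = (5.0, 0.0, 15.0, 10.0)
    print(repulsion_style_loss(pred, target, [neighbour]))
```

In this sketch a prediction that matches its target exactly still incurs a cost if it covers part of a neighbouring box, which is the behaviour the repulsion term is meant to capture.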
MegDet: A Large Mini-Batch Object Detector
The improvements in recent CNN-based object detection works, from R-CNN [11],
Fast/Faster R-CNN [10, 31] to recent Mask R-CNN [14] and RetinaNet [24], mainly
come from new networks, new frameworks, or novel loss designs. But mini-batch
size, a key factor in training, has not been well studied. In this paper,
we propose a Large MiniBatch Object Detector (MegDet) to enable training
with a much larger mini-batch size than before (e.g., from 16 to 256), so that we
can effectively utilize multiple GPUs (up to 128 in our experiments) to
significantly shorten the training time. Technically, we suggest a learning
rate policy and Cross-GPU Batch Normalization, which together allow us to
successfully train a large mini-batch detector in much less time (e.g., from 33
hours to 4 hours) and achieve even better accuracy. MegDet is the backbone
of our submission (mmAP 52.5%) to the COCO 2017 Challenge, where we won 1st
place in the Detection task.
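The abstract does not spell out the learning rate policy. A common choice for large mini-batch training, assumed here purely for illustration, is to scale the learning rate linearly with the batch size and ramp up to it over a warmup period. The function name and the values `base_lr`, `base_batch`, and `warmup_steps` in this Python sketch are hypothetical.

```python
def learning_rate(step, batch_size, base_lr=0.02, base_batch=16, warmup_steps=500):
    """Linear-scaling learning rate with linear warmup (assumed policy).

    The target rate is base_lr scaled by batch_size / base_batch. During the
    first warmup_steps steps the rate ramps linearly from base_lr to the
    target, which avoids starting a large-batch run at the full scaled rate.
    """
    target_lr = base_lr * batch_size / base_batch
    if step < warmup_steps:
        return base_lr + (target_lr - base_lr) * step / warmup_steps
    return target_lr


if __name__ == "__main__":
    # Batch size 256 is 16 times the reference batch of 16.
    for step in (0, 250, 500, 1000):
        print(step, learning_rate(step, batch_size=256))
```

With the reference batch size itself (`batch_size=16` here), the sketch returns `base_lr` at every step, so the warmup only matters when the batch is scaled up.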
