21,513 research outputs found
Towards High Performance Video Object Detection
There has been significant progresses for image object detection in recent
years. Nevertheless, video object detection has received little attention,
although it is more challenging and more important in practical scenarios.
Built upon the recent works, this work proposes a unified approach based on
the principle of multi-frame end-to-end learning of features and cross-frame
motion. Our approach extends prior works with three new techniques and steadily
pushes forward the performance envelope (speed-accuracy tradeoff), towards high
performance video object detection
Curriculum Dropout
Dropout is a very effective way of regularizing neural networks.
Stochastically "dropping out" units with a certain probability discourages
over-specific co-adaptations of feature detectors, preventing overfitting and
improving network generalization. Besides, Dropout can be interpreted as an
approximate model aggregation technique, where an exponential number of smaller
networks are averaged in order to get a more powerful ensemble. In this paper,
we show that using a fixed dropout probability during training is a suboptimal
choice. We thus propose a time scheduling for the probability of retaining
neurons in the network. This induces an adaptive regularization scheme that
smoothly increases the difficulty of the optimization problem. This idea of
"starting easy" and adaptively increasing the difficulty of the learning
problem has its roots in curriculum learning and allows one to train better
models. Indeed, we prove that our optimization strategy implements a very
general curriculum scheme, by gradually adding noise to both the input and
intermediate feature representations within the network architecture.
Experiments on seven image classification datasets and different network
architectures show that our method, named Curriculum Dropout, frequently yields
to better generalization and, at worst, performs just as well as the standard
Dropout method.Comment: Accepted at ICCV (International Conference on Computer Vision) 201
- …