37,597 research outputs found
Coarse-to-Fine Adaptive People Detection for Video Sequences by Maximizing Mutual Information
Applying people detectors to unseen data is challenging since patterns distributions, such
as viewpoints, motion, poses, backgrounds, occlusions and people sizes, may significantly differ
from the ones of the training dataset. In this paper, we propose a coarse-to-fine framework to adapt
frame by frame people detectors during runtime classification, without requiring any additional
manually labeled ground truth apart from the offline training of the detection model. Such adaptation
make use of multiple detectors mutual information, i.e., similarities and dissimilarities of detectors
estimated and agreed by pair-wise correlating their outputs. Globally, the proposed adaptation
discriminates between relevant instants in a video sequence, i.e., identifies the representative frames
for an adaptation of the system. Locally, the proposed adaptation identifies the best configuration
(i.e., detection threshold) of each detector under analysis, maximizing the mutual information to
obtain the detection threshold of each detector. The proposed coarse-to-fine approach does not
require training the detectors for each new scenario and uses standard people detector outputs, i.e.,
bounding boxes. The experimental results demonstrate that the proposed approach outperforms
state-of-the-art detectors whose optimal threshold configurations are previously determined and
fixed from offline training dataThis work has been partially supported by the Spanish government under the project TEC2014-53176-R
(HAVideo
Multiple Instance Curriculum Learning for Weakly Supervised Object Detection
When supervising an object detector with weakly labeled data, most existing
approaches are prone to trapping in the discriminative object parts, e.g.,
finding the face of a cat instead of the full body, due to lacking the
supervision on the extent of full objects. To address this challenge, we
incorporate object segmentation into the detector training, which guides the
model to correctly localize the full objects. We propose the multiple instance
curriculum learning (MICL) method, which injects curriculum learning (CL) into
the multiple instance learning (MIL) framework. The MICL method starts by
automatically picking the easy training examples, where the extent of the
segmentation masks agree with detection bounding boxes. The training set is
gradually expanded to include harder examples to train strong detectors that
handle complex images. The proposed MICL method with segmentation in the loop
outperforms the state-of-the-art weakly supervised object detectors by a
substantial margin on the PASCAL VOC datasets.Comment: Published in BMVC 201
- …