Watch and Learn: Semi-Supervised Learning of Object Detectors from Videos
We present a semi-supervised approach that localizes multiple unknown object
instances in long videos. We start with a handful of labeled boxes and
iteratively learn and label hundreds of thousands of object instances. We
propose criteria for reliable object detection and tracking for constraining
the semi-supervised learning process and minimizing semantic drift. Our
approach does not assume exhaustive labeling of each object instance in any
single frame, or any explicit annotation of negative data. Working in such a
generic setting allows us to tackle multiple object instances in video, many of
which are static. In contrast, existing approaches either do not consider
multiple object instances per video, or rely heavily on the motion of the
objects present. The experiments demonstrate the effectiveness of our approach
by evaluating the automatically labeled data on a variety of metrics such as
quality, coverage (recall), diversity, and relevance to training an object
detector.
Comment: To appear in CVPR 201