Watch and Learn: Semi-Supervised Learning of Object Detectors from Videos
We present a semi-supervised approach that localizes multiple unknown object
instances in long videos. We start with a handful of labeled boxes and
iteratively learn and label hundreds of thousands of object instances. We
propose criteria for reliable object detection and tracking for constraining
the semi-supervised learning process and minimizing semantic drift. Our
approach does not assume exhaustive labeling of each object instance in any
single frame, or any explicit annotation of negative data. Working in such a
generic setting allows us to tackle multiple object instances in video, many of
which are static. In contrast, existing approaches either do not consider
multiple object instances per video, or rely heavily on the motion of the
objects present. The experiments demonstrate the effectiveness of our approach
by evaluating the automatically labeled data on a variety of metrics like
quality, coverage (recall), diversity, and relevance to training an object
detector.
Comment: To appear in CVPR 201
Budget-aware Semi-Supervised Semantic and Instance Segmentation
Methods that move towards less supervised scenarios are key for image
segmentation, as dense labels demand significant human intervention. Generally,
the annotation burden is mitigated by labeling datasets with weaker forms of
supervision, e.g. image-level labels or bounding boxes. Another option is the
semi-supervised setting, which commonly leverages a few strong annotations and a
large amount of unlabeled or weakly-labeled data. In this paper, we revisit
semi-supervised segmentation schemes and significantly reduce the
annotation budget (in terms of total labeling time of the training set)
compared to previous approaches. With a very simple pipeline, we demonstrate
that at low annotation budgets, semi-supervised methods outperform
weakly-supervised ones by a wide margin for both semantic and instance segmentation. Our
approach also outperforms previous semi-supervised works at a much reduced
labeling cost. We present results for the Pascal VOC benchmark and unify weakly
and semi-supervised approaches by considering the total annotation budget, thus
allowing a fairer comparison between methods.
Comment: To appear in CVPR-W 2019 (DeepVision workshop)
Learning to count with deep object features
Learning to count is a learning strategy that has been recently proposed in
the literature for dealing with problems where estimating the number of object
instances in a scene is the final objective. In this framework, the task of
learning to detect and localize individual object instances is seen as a harder
task that can be avoided by casting the problem as that of computing a
regression value from hand-crafted image features. In this paper we explore the
features that are learned when training a counting convolutional neural network
in order to understand their underlying representation. To this end we define a
counting problem for MNIST data and show that the internal representation of
the network is able to classify digits in spite of the fact that no direct
supervision was provided for them during training. We also present preliminary
results about a deep network that is able to count the number of pedestrians in
a scene.
Comment: This paper has been accepted at the Deep Vision Workshop at CVPR 201