8,416 research outputs found
Trespassing the Boundaries: Labeling Temporal Bounds for Object Interactions in Egocentric Video
Manual annotations of temporal bounds for object interactions (i.e. start and
end times) are typical training input to recognition, localization and
detection algorithms. For three publicly available egocentric datasets, we
uncover inconsistencies in ground truth temporal bounds within and across
annotators and datasets. We systematically assess the robustness of
state-of-the-art approaches to changes in labeled temporal bounds, for object
interaction recognition. As boundaries are trespassed, a drop of up to 10% is
observed for both Improved Dense Trajectories and Two-Stream Convolutional
Neural Network.
We demonstrate that such disagreement stems from a limited understanding of
the distinct phases of an action, and propose annotating based on the Rubicon
Boundaries, inspired by a similarly named cognitive model, for consistent
temporal bounds of object interactions. Evaluated on a public dataset, we
report a 4% increase in overall accuracy, and an increase in accuracy for 55%
of classes when Rubicon Boundaries are used for temporal annotations.Comment: ICCV 201
Crowdsourcing in Computer Vision
Computer vision systems require large amounts of manually annotated data to
properly learn challenging visual concepts. Crowdsourcing platforms offer an
inexpensive method to capture human knowledge and understanding, for a vast
number of visual perception tasks. In this survey, we describe the types of
annotations computer vision researchers have collected using crowdsourcing, and
how they have ensured that this data is of high quality while annotation effort
is minimized. We begin by discussing data collection on both classic (e.g.,
object recognition) and recent (e.g., visual story-telling) vision tasks. We
then summarize key design decisions for creating effective data collection
interfaces and workflows, and present strategies for intelligently selecting
the most important data instances to annotate. Finally, we conclude with some
thoughts on the future of crowdsourcing in computer vision.Comment: A 69-page meta review of the field, Foundations and Trends in
Computer Graphics and Vision, 201
- …