A Glimpse Far into the Future: Understanding Long-term Crowd Worker Quality
Microtask crowdsourcing is increasingly critical to the creation of extremely
large datasets. As a result, crowd workers spend weeks or months repeating the
exact same tasks, making it necessary to understand their behavior over these
long periods of time. We utilize three large, longitudinal datasets of nine
million annotations collected from Amazon Mechanical Turk to examine claims
that workers fatigue or satisfice over these long periods, producing lower
quality work. We find that, contrary to these claims, workers are extremely
stable in their quality over the entire period. To understand whether workers
set their quality based on the task's requirements for acceptance, we then
perform an experiment where we vary the required quality for a large
crowdsourcing task. Workers did not adjust their quality based on the
acceptance threshold: workers who were above the threshold continued working at
their usual quality level, and workers below the threshold self-selected
themselves out of the task. Capitalizing on this consistency, we demonstrate
that it is possible to predict workers' long-term quality using just a glimpse
of their quality on the first five tasks.
Comment: 10 pages, 11 figures, accepted CSCW 201
Crowdsourcing in Computer Vision
Computer vision systems require large amounts of manually annotated data to
properly learn challenging visual concepts. Crowdsourcing platforms offer an
inexpensive method to capture human knowledge and understanding, for a vast
number of visual perception tasks. In this survey, we describe the types of
annotations computer vision researchers have collected using crowdsourcing, and
how they have ensured that this data is of high quality while annotation effort
is minimized. We begin by discussing data collection on both classic (e.g.,
object recognition) and recent (e.g., visual story-telling) vision tasks. We
then summarize key design decisions for creating effective data collection
interfaces and workflows, and present strategies for intelligently selecting
the most important data instances to annotate. Finally, we conclude with some
thoughts on the future of crowdsourcing in computer vision.
Comment: A 69-page meta review of the field, Foundations and Trends in
Computer Graphics and Vision, 201
Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding
Computer vision has a great potential to help our daily lives by searching
for lost keys, watering flowers or reminding us to take a pill. To succeed with
such tasks, computer vision methods need to be trained from real and diverse
examples of our daily dynamic scenes. While most of such scenes are not
particularly exciting, they typically do not appear on YouTube, in movies or TV
broadcasts. So how do we collect sufficiently many diverse but boring samples
representing our lives? We propose a novel Hollywood in Homes approach to
collect such data. Instead of shooting videos in the lab, we ensure diversity
by distributing and crowdsourcing the whole process of video creation from
script writing to video recording and annotation. Following this procedure we
collect a new dataset, Charades, with hundreds of people recording videos in
their own homes, acting out casual everyday activities. The dataset is composed
of 9,848 annotated videos with an average length of 30 seconds, showing
activities of 267 people from three continents. Each video is annotated by
multiple free-text descriptions, action labels, action intervals and classes of
interacted objects. In total, Charades provides 27,847 video descriptions,
66,500 temporally localized intervals for 157 action classes and 41,104 labels
for 46 object classes. Using this rich data, we evaluate and provide baseline
results for several tasks including action recognition and automatic
description generation. We believe that the realism, diversity, and casual
nature of this dataset will present unique challenges and new opportunities for
the computer vision community.
An introduction to crowdsourcing for language and multimedia technology research
Language and multimedia technology research often relies on large manually
constructed datasets for training or evaluation of algorithms and systems.
Constructing these datasets is often expensive, with significant challenges in
recruiting personnel to carry out the work. Crowdsourcing methods using
scalable pools of workers available on demand offer a flexible means of rapid,
low-cost construction of many of these datasets to support existing research
requirements and potentially promote new research initiatives that would
otherwise not be possible.
CLAD: A Complex and Long Activities Dataset with Rich Crowdsourced Annotations
This paper introduces a novel activity dataset which exhibits real-life and
diverse scenarios of complex, temporally-extended human activities and actions.
The dataset presents a set of videos of actors performing everyday activities
in a natural and unscripted manner. The dataset was recorded using a static
Kinect 2 sensor which is commonly used on many robotic platforms. The dataset
comprises RGB-D images, point cloud data, and automatically generated skeleton
tracks, in addition to crowdsourced annotations. Furthermore, we describe
the methodology used to acquire annotations through crowdsourcing. Finally, some
activity recognition benchmarks are presented using current state-of-the-art
techniques. We believe that this dataset is particularly suitable as a testbed
for activity recognition research, but it can also be applicable to other
common tasks in robotics/computer vision research such as object detection and
human skeleton tracking.