Crowds, not Drones: Modeling Human Factors in Interactive Crowdsourcing
In this vision paper, we propose SmartCrowd, an intelligent and adaptive crowdsourcing framework. In existing crowdsourcing systems, the processes of hiring workers (the crowd), learning their skills, and evaluating the accuracy of the tasks they perform are fragmented, siloed, and often ad hoc. SmartCrowd foresees a paradigm shift in these processes by accounting for the unpredictability of human nature, namely human factors. SmartCrowd offers opportunities to make crowdsourcing intelligent through iterative interaction with the workers, and by adaptively learning and improving the underlying processes. Both existing applications (the majority of which do not require longer engagement from volatile and mostly non-recurrent workers) and next-generation crowdsourcing applications (which require longer engagement from the crowd) stand to benefit from SmartCrowd. We outline the opportunities in SmartCrowd and discuss the challenges and directions that can potentially revolutionize the existing crowdsourcing landscape.
Empirical Methodology for Crowdsourcing Ground Truth
The process of gathering ground truth data through human annotation is a
major bottleneck in the use of information extraction methods for populating
the Semantic Web. Crowdsourcing-based approaches are gaining popularity in the
attempt to solve the issues related to volume of data and lack of annotators.
Typically these practices use inter-annotator agreement as a measure of
quality. However, in many domains, such as event detection, there is ambiguity
in the data, as well as a multitude of perspectives on the information
examples. We present an empirically derived methodology for efficiently
gathering ground truth data in a diverse set of use cases covering a variety
of domains and annotation tasks. Central to our approach is the use of
CrowdTruth metrics that capture inter-annotator disagreement. We show that
measuring disagreement is essential for acquiring a high quality ground truth.
We achieve this by comparing the quality of the data aggregated with CrowdTruth
metrics with majority vote, over a set of diverse crowdsourcing tasks: Medical
Relation Extraction, Twitter Event Identification, News Event Extraction and
Sound Interpretation. We also show that an increased number of crowd workers
leads to growth and stabilization in the quality of annotations, going against
the usual practice of employing a small number of annotators.
Comment: in publication at the Semantic Web Journal
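The contrast between majority vote and a disagreement-aware aggregation can be illustrated with a minimal sketch. This is not the CrowdTruth metric suite itself; the function names and the simple per-label agreement fraction below are assumptions chosen only to show why keeping the full distribution of worker answers preserves information that a single majority label discards.

```python
from collections import Counter


def majority_vote(annotations):
    """Return the single most frequent label among the workers' annotations."""
    counts = Counter(annotations)
    return counts.most_common(1)[0][0]


def label_scores(annotations):
    """Return the fraction of workers that chose each label.

    Hypothetical simplification: a score well below 1.0 signals that
    the example is ambiguous, which majority vote would hide.
    """
    counts = Counter(annotations)
    total = len(annotations)
    return {label: n / total for label, n in counts.items()}


# Five workers annotate the relation expressed in one sentence.
workers = ["cause", "cause", "treat", "cause", "treat"]
print(majority_vote(workers))
print(label_scores(workers))
```

With majority vote the example is simply labeled "cause"; the per-label scores additionally show that two of five workers read it differently, which is the kind of signal a disagreement-based metric is meant to retain.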
Leveraging Crowdsourcing Data For Deep Active Learning - An Application: Learning Intents in Alexa
This paper presents a generic Bayesian framework that enables any deep
learning model to actively learn from targeted crowds. Our framework inherits
from recent advances in Bayesian deep learning, and extends existing work by
considering the targeted crowdsourcing approach, where multiple annotators with
unknown expertise contribute an uncontrolled amount (often limited) of
annotations. Our framework leverages the low-rank structure in annotations to
learn individual annotator expertise, which then helps to infer the true labels
from noisy and sparse annotations. It provides a unified Bayesian model to
simultaneously infer the true labels and train the deep learning model in order
to reach an optimal learning efficacy. Finally, our framework exploits the
uncertainty of the deep learning model during prediction as well as the
annotators' estimated expertise to minimize the number of required annotations
and annotators for optimally training the deep learning model.
We evaluate the effectiveness of our framework for intent classification in
Alexa (Amazon's personal assistant), using both synthetic and real-world
datasets. Experiments show that our framework can accurately learn annotator
expertise, infer true labels, and effectively reduce the number of annotations
needed in model training as compared to state-of-the-art approaches. We further
discuss the potential of our proposed framework in bridging machine learning
and crowdsourcing towards improved human-in-the-loop systems.
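The idea of using model uncertainty to decide which items to send for annotation can be sketched as follows. This is a generic entropy-based uncertainty sampling example, not the paper's Bayesian framework; the function names, the `budget` parameter, and the toy probabilities are all assumptions made for illustration.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy of a predicted class distribution (in nats)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(predictions, budget):
    """Pick the `budget` items whose predictions are most uncertain.

    `predictions` maps a hypothetical item id to the model's class
    probabilities for that item.
    """
    ranked = sorted(
        predictions,
        key=lambda item_id: predictive_entropy(predictions[item_id]),
        reverse=True,
    )
    return ranked[:budget]


# Toy intent-classification outputs for three utterances.
predictions = {
    "u1": [0.98, 0.01, 0.01],  # confident
    "u2": [0.40, 0.35, 0.25],  # very uncertain
    "u3": [0.70, 0.20, 0.10],  # moderately uncertain
}
print(select_for_annotation(predictions, budget=2))
```

Only the two least certain utterances are sent to annotators, which is the basic mechanism by which uncertainty-driven selection reduces annotation effort. The framework described above additionally weighs the annotators' estimated expertise, which this sketch omits.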
Privacy in crowdsourcing: a systematic review
The advent of crowdsourcing has brought with it multiple privacy challenges. For example, essential monitoring activities, while necessary and unavoidable, also potentially compromise contributor privacy. We conducted an extensive literature review of the research related to the privacy aspects of crowdsourcing. Our investigation revealed interesting gender differences and also differences in terms of individual perceptions. We conclude by suggesting a number of future research directions.
Engineering Crowdsourced Stream Processing Systems
A crowdsourced stream processing system (CSP) is a system that incorporates
crowdsourced tasks in the processing of a data stream. This can be seen as
enabling crowdsourcing work to be applied on a sample of large-scale data at
high speed, or equivalently, enabling stream processing to employ human
intelligence. It also leads to a substantial expansion of the capabilities of
data processing systems. Engineering a CSP system requires the combination of
human and machine computation elements. From a general systems theory
perspective, this means taking into account inherited as well as emerging
properties from both these elements. In this paper, we position CSP systems
within a broader taxonomy, outline a series of design principles and evaluation
metrics, present an extensible framework for their design, and describe several
design patterns. We showcase the capabilities of CSP systems by performing a
case study that applies our proposed framework to the design and analysis of a
real system (AIDR) that classifies social media messages during time-critical
crisis events. Results show that compared to a pure stream processing system,
AIDR can achieve a higher data classification accuracy, while compared to a
pure crowdsourcing solution, the system makes better use of human workers by
requiring much less manual work effort.
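One way to picture the combination of human and machine computation in a stream is a router that keeps confident machine labels and queues the remaining items for crowd workers. The sketch below is a generic pattern under stated assumptions, not AIDR's actual design: the `route` function, the `threshold` value, and the toy classifier are all hypothetical.

```python
def route(stream, classify, threshold=0.8):
    """Split a message stream between the machine and the crowd.

    `classify` is assumed to return a (label, confidence) pair.
    Messages classified with confidence below `threshold` are queued
    for human labeling instead of being labeled automatically.
    """
    machine_labeled, crowd_queue = [], []
    for message in stream:
        label, confidence = classify(message)
        if confidence >= threshold:
            machine_labeled.append((message, label))
        else:
            crowd_queue.append(message)
    return machine_labeled, crowd_queue


def toy_classifier(message):
    """Hypothetical stand-in for a trained model."""
    if "flood" in message:
        return "disaster", 0.95
    return "other", 0.55


messages = ["flood in the city center", "traffic is slow today"]
labeled, queued = route(messages, toy_classifier)
print(labeled)
print(queued)
```

In this toy run the first message is labeled automatically and the second is sent to the crowd, so human effort is spent only where the machine is unsure.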