Crowd-Labeling Fashion Reviews with Quality Control
We present a new methodology for high-quality labeling in the fashion domain
with crowd workers instead of experts. We focus on the Aspect-Based Sentiment
Analysis task. Our methods filter out inaccurate input from crowd workers while
preserving differing worker labels to capture the inherent high variability of
the opinions. We demonstrate the quality of the labeled data using Facebook's
FastText framework as a baseline.
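The abstract gives no code; the following is a minimal sketch of how a FastText classifier could serve as such a baseline for aspect-level sentiment labels, assuming the crowd-labeled reviews have been exported to fastText's `__label__` text format. File names, the label scheme, and the hyperparameters are illustrative, not taken from the paper.

```python
# Minimal sketch: FastText baseline for aspect-based sentiment labels.
# Assumes crowd-labeled reviews exported as fastText-format text files, e.g.:
#   __label__fit_positive the dress fits perfectly but the fabric feels cheap
# File names and the label scheme are illustrative assumptions.
import fasttext

# Train a supervised classifier on the crowd-labeled training split.
model = fasttext.train_supervised(
    input="fashion_reviews.train.txt",
    epoch=25,
    lr=0.5,
    wordNgrams=2,   # bigrams help with short review sentences
)

# Evaluate on a held-out split: returns (n_samples, precision@1, recall@1).
n, p_at_1, r_at_1 = model.test("fashion_reviews.valid.txt")
print(f"samples={n} precision@1={p_at_1:.3f} recall@1={r_at_1:.3f}")

# Predict aspect-sentiment labels for a new review sentence.
labels, probs = model.predict("the zipper broke after one wash", k=3)
print(list(zip(labels, probs)))
```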
Towards Understanding and Answering Multi-Sentence Recommendation Questions on Tourism
We introduce the first system towards the novel task of answering complex
multi-sentence recommendation questions in the tourism domain. Our solution uses
a pipeline of two modules: question understanding and answering. For question
understanding, we define an SQL-like query language that captures the semantic
intent of a question; it supports operators like subset, negation, preference
and similarity, which are often found in recommendation questions. We train and
compare traditional CRFs as well as bidirectional LSTM-based models for
converting a question to its semantic representation. We extend these models to
a semi-supervised setting with partially labeled sequences gathered through
crowdsourcing. We find that our best model is a BiDiLSTM+CRF trained in a
semi-supervised manner with hand-designed features and CCM (Chang et al., 2007)
constraints. Finally, in an end-to-end QA system, our answering component
converts our question representation into queries fired on underlying knowledge
sources. Our experiments on two different answer corpora demonstrate that our
system can significantly outperform baselines, with up to 20 points higher
accuracy and 17 points higher recall.
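Neither the query language nor the tagger implementation is given in the abstract. As a rough illustration of the question-understanding step, the sketch below trains a feature-based CRF sequence tagger (using the sklearn-crfsuite package as a stand-in for the authors' CRF/BiDiLSTM+CRF models) that maps question tokens to hypothetical semantic-slot tags such as ENTITY, PREFERENCE, and NEGATION; the tag inventory and toy data are assumptions.

```python
# Illustrative sketch: CRF tagger mapping question tokens to semantic-slot tags.
# sklearn-crfsuite is a stand-in for the paper's models; the tag inventory
# (ENTITY, PREFERENCE, NEGATION, ...) and the toy data are hypothetical.
import sklearn_crfsuite

def token_features(tokens, i):
    """Hand-designed features for one token, in the spirit of feature-based CRFs."""
    word = tokens[i]
    return {
        "word.lower": word.lower(),
        "word.isdigit": word.isdigit(),
        "prev.lower": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next.lower": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }

# Toy training data: one recommendation question, token-tagged in BIO style.
sentences = [["suggest", "a", "quiet", "hotel", "near", "the", "beach",
              "but", "not", "too", "expensive"]]
tags = [["O", "O", "B-PREFERENCE", "B-ENTITY", "B-PREFERENCE", "I-PREFERENCE",
         "I-PREFERENCE", "O", "B-NEGATION", "I-NEGATION", "I-NEGATION"]]

X = [[token_features(s, i) for i in range(len(s))] for s in sentences]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
crf.fit(X, tags)

query = ["find", "a", "cheap", "hotel", "near", "the", "beach"]
print(crf.predict([[token_features(query, i) for i in range(len(query))]]))
```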
Combining Crowd and Machines for Multi-predicate Item Screening
This paper discusses how crowd and machine classifiers can be efficiently
combined to screen items that satisfy a set of predicates. We show that this is
a recurring problem in many domains, present machine-human (hybrid) algorithms
that screen items efficiently and estimate the gain over human-only or
machine-only screening in terms of performance and cost. We further show how,
given a new classification problem and a set of classifiers of unknown accuracy
for the problem at hand, we can identify how to manage the cost-accuracy
trade-off by progressively determining whether to spend budget to obtain test
data (to assess the accuracy of the given classifiers), to train an ensemble of
classifiers, or to leverage the existing machine classifiers with the crowd,
and, in the latter case, how to efficiently combine them based on their
estimated characteristics to obtain the classification. We demonstrate that the
techniques we propose obtain significant cost/accuracy improvements with
respect to the leading classification algorithms.
Comment: Please cite the CSCW2018 version of this paper:
@article{krivosheev2018combining, title={Combining Crowd and Machines for
Multi-predicate Item Screening}, author={Krivosheev, Evgeny and Casati, Fabio
and Baez, Marcos and Benatallah, Boualem}, journal={Proceedings of the ACM on
Human-Computer Interaction}, volume={2}, number={CSCW}, pages={97}, year={2018},
publisher={ACM}}
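The abstract does not spell out the combination rule. The sketch below shows one simple way a machine classifier's score and crowd votes could be fused per predicate under a naive-Bayes-style assumption; the worker accuracy, threshold, and fusion rule are illustrative, not the paper's algorithm.

```python
# Illustrative sketch (not the paper's algorithm): fuse a machine classifier's
# probability with crowd votes for one exclusion predicate, naive-Bayes style.
# Worker accuracy and the decision threshold are assumed values.

def fuse_machine_and_crowd(p_machine, crowd_votes, worker_accuracy=0.8):
    """Return posterior P(predicate holds) given a machine prob and binary votes."""
    # Start from the machine classifier's probability as the prior (as odds).
    odds = p_machine / (1.0 - p_machine)
    for vote in crowd_votes:          # vote == 1 means "predicate holds"
        if vote == 1:
            odds *= worker_accuracy / (1.0 - worker_accuracy)
        else:
            odds *= (1.0 - worker_accuracy) / worker_accuracy
    return odds / (1.0 + odds)

def screen_item(predicate_posteriors, threshold=0.9):
    """An item is screened out as soon as any exclusion predicate is confident."""
    return any(p >= threshold for p in predicate_posteriors)

# Example: machine says 0.6, three workers say the predicate holds, one disagrees.
p = fuse_machine_and_crowd(0.6, [1, 1, 0, 1])
print(round(p, 3), "-> excluded" if screen_item([p]) else "-> keep / ask more votes")
```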
Pushing the Boundaries of Crowd-enabled Databases with Query-driven Schema Expansion
By incorporating human workers into the query execution process, crowd-enabled
databases facilitate intelligent, social capabilities like completing missing
data at query time or performing cognitive operators. But despite all their
flexibility, crowd-enabled databases still maintain rigid schemas. In this
paper, we extend crowd-enabled databases by flexible query-driven schema
expansion, allowing the addition of new attributes to the database at query
time. However, the number of crowd-sourced mini-tasks to fill in missing values
may often be prohibitively large and the resulting data quality is doubtful.
Instead of simple crowd-sourcing to obtain all values individually, we leverage
the user-generated data found in the Social Web: By exploiting user ratings we
build perceptual spaces, i.e., highly-compressed representations of opinions,
impressions, and perceptions of large numbers of users. Using a few training
samples obtained by expert crowdsourcing, we can then extract all missing data
automatically from the perceptual space with high quality and at low cost.
Extensive experiments show that our approach can boost both the performance and
quality of crowd-enabled databases, while also providing the flexibility to
expand schemas in a query-driven fashion.
Comment: VLDB201
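The abstract does not detail how the perceptual space is constructed. A common way to realize the idea is a low-rank factorization of the user-item rating matrix followed by a regressor trained on a handful of expert-labeled items, as in this hypothetical sketch; the matrix shapes, the attribute, and the model choices are assumptions.

```python
# Hypothetical sketch: build a "perceptual space" from user ratings via truncated
# SVD, then predict a missing attribute from a few expert-labeled items.
# Shapes, the attribute, and the model choice are illustrative assumptions.
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
ratings = rng.integers(0, 6, size=(500, 80)).astype(float)  # items x users (toy data)

# Compress the rating matrix into a low-dimensional perceptual space.
svd = TruncatedSVD(n_components=10, random_state=0)
perceptual_space = svd.fit_transform(ratings)                # items x 10

# A few expert-labeled items (a hypothetical numeric attribute) serve as training data.
expert_idx = np.arange(20)
expert_labels = rng.normal(size=20)                          # placeholder expert values

model = Ridge(alpha=1.0)
model.fit(perceptual_space[expert_idx], expert_labels)

# Fill in the new attribute for all remaining items at query time.
predicted_attribute = model.predict(perceptual_space)
print(predicted_attribute[:5])
```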
Online Decision Making in Crowdsourcing Markets: Theoretical Challenges (Position Paper)
Over the past decade, crowdsourcing has emerged as a cheap and efficient
method of obtaining solutions to simple tasks that are difficult for computers
to solve but possible for humans. The popularity and promise of crowdsourcing
markets has led to both empirical and theoretical research on the design of
algorithms to optimize various aspects of these markets, such as the pricing
and assignment of tasks. Much of the existing theoretical work on crowdsourcing
markets has focused on problems that fall into the broad category of online
decision making; task requesters or the crowdsourcing platform itself make
repeated decisions about prices to set, workers to filter out, problems to
assign to specific workers, or other things. Often these decisions are complex,
requiring algorithms that learn about the distribution of available tasks or
workers over time and take into account the strategic (or sometimes irrational)
behavior of workers.
As human computation grows into its own field, the time is ripe to address
these challenges in a principled way. However, it appears very difficult to
capture all pertinent aspects of crowdsourcing markets in a single coherent
model. In this paper, we reflect on the modeling issues that inhibit
theoretical research on online decision making for crowdsourcing, and identify
some steps forward. This paper grew out of the authors' own frustration with
these issues, and we hope it will encourage the community to attempt to
understand, debate, and ultimately address them.
The authors welcome feedback for future revisions of this paper.
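The paper itself is a position piece and presents no algorithm. As a toy illustration of the kind of online decision making it discusses, the sketch below runs an epsilon-greedy bandit that learns which task price yields the best completion rate per dollar; the candidate prices, the simulated worker behaviour, and the parameters are all made up.

```python
# Toy illustration (not from the paper): epsilon-greedy bandit choosing a task
# price online, learning completions-per-dollar from simulated worker behaviour.
import random

prices = [0.05, 0.10, 0.25]          # candidate payments per task (assumed)
counts = [0, 0, 0]
value = [0.0, 0.0, 0.0]              # running estimate of reward per price

def simulate_worker_accepts(price):
    """Stand-in for real workers: higher pay, higher acceptance probability."""
    return random.random() < min(1.0, price * 5)

random.seed(0)
for t in range(2000):
    # Explore with probability 0.1, otherwise exploit the best current estimate.
    if random.random() < 0.1:
        arm = random.randrange(len(prices))
    else:
        arm = max(range(len(prices)), key=lambda a: value[a])
    completed = simulate_worker_accepts(prices[arm])
    reward = (1.0 / prices[arm]) if completed else 0.0   # completions per dollar
    counts[arm] += 1
    value[arm] += (reward - value[arm]) / counts[arm]     # incremental mean

print({p: round(v, 2) for p, v in zip(prices, value)})
```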
CommuniSense: Crowdsourcing Road Hazards in Nairobi
Nairobi is one of the fastest growing metropolitan cities and a major
business and technology powerhouse in Africa. However, Nairobi currently lacks
monitoring technologies to obtain reliable data on traffic and road
infrastructure conditions. In this paper, we investigate the use of mobile
crowdsourcing as a means to gather and document Nairobi's road quality
information. We first present the key findings of a city-wide road quality
survey about the perception of existing road quality conditions in Nairobi.
Based on the survey's findings, we then developed a mobile crowdsourcing
application, called CommuniSense, to collect road quality data. The application
serves as a tool for users to locate, describe, and photograph road hazards. We
tested our application through a two-week field study amongst 30 participants
to document various forms of road hazards from different areas in Nairobi. To
verify the authenticity of user-contributed reports from our field study, we
used online crowdsourcing on Amazon's Mechanical Turk (MTurk) to check whether
submitted reports indeed depict road hazards. We found 92% of user-submitted
reports to match the MTurkers' judgements. While our prototype was designed and
tested in a specific city, our methodology is applicable to other developing
cities.
Comment: In Proceedings of the 17th International Conference on Human-Computer
Interaction with Mobile Devices and Services (MobileHCI 2015)
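The abstract only reports the 92% agreement figure. The sketch below shows how such an agreement rate between field reports and MTurk judgements might be computed with a simple per-report majority vote; the data layout and identifiers are guesses, not the paper's.

```python
# Hypothetical sketch: per-report majority vote over MTurk judgements, then the
# share of reports whose majority verdict confirms the user's submission.
from collections import Counter

# report_id -> list of MTurk yes/no judgements ("is this really a road hazard?")
mturk_judgements = {
    "r1": ["yes", "yes", "no"],
    "r2": ["yes", "yes", "yes"],
    "r3": ["no", "no", "yes"],
}

def majority(votes):
    return Counter(votes).most_common(1)[0][0]

confirmed = sum(1 for votes in mturk_judgements.values() if majority(votes) == "yes")
agreement_rate = confirmed / len(mturk_judgements)
print(f"{agreement_rate:.0%} of reports confirmed by MTurk majority vote")
```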
A Technical Survey on Statistical Modelling and Design Methods for Crowdsourcing Quality Control
Online crowdsourcing provides a scalable and inexpensive means to collect
knowledge (e.g. labels) about various types of data items (e.g. text, audio,
video). However, it is also known to result in large variance in the quality of
recorded responses which often cannot be directly used for training machine
learning systems. To resolve this issue, a lot of work has been conducted to
control the response quality such that low-quality responses cannot adversely
affect the performance of the machine learning systems. Such work is referred
to as quality control for crowdsourcing. Past quality control research can
be divided into two major branches: quality control mechanism design and
statistical models. The first branch focuses on designing measures, thresholds,
interfaces and workflows for payment, gamification, question assignment and
other mechanisms that influence workers' behaviour. The second branch focuses
on developing statistical models to perform effective aggregation of responses
to infer correct responses. The two branches are connected as statistical
models (i) provide parameter estimates to support the measure and threshold
calculation, and (ii) encode modelling assumptions used to derive (theoretical)
performance guarantees for the mechanisms. There are surveys regarding each
branch but they lack technical details about the other branch. Our survey is
the first to bridge the two branches by providing technical details on how they
work together under frameworks that systematically unify crowdsourcing aspects
modelled by both of them to determine the response quality. We are also the
first to provide taxonomies of quality control papers based on the proposed
frameworks. Finally, we specify the current limitations and the corresponding
future directions for quality control research.
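As a concrete instance of the "statistical models" branch the survey describes, the sketch below implements plain majority voting followed by a few iterations of a simplified EM-style scheme that re-weights workers by their estimated accuracy. This is a generic illustration of response aggregation, not an algorithm taken from the survey, and the vote data is invented.

```python
# Generic illustration: aggregate crowd labels by majority vote, then refine with
# an EM-style loop that re-weights workers by estimated accuracy (binary labels).
import math

# (item, worker) -> label in {0, 1}; toy data in which worker "w3" is noisy.
votes = {
    ("i1", "w1"): 1, ("i1", "w2"): 1, ("i1", "w3"): 0,
    ("i2", "w1"): 0, ("i2", "w2"): 0, ("i2", "w3"): 1,
    ("i3", "w1"): 1, ("i3", "w2"): 1, ("i3", "w3"): 1,
}

items = sorted({i for i, _ in votes})
workers = sorted({w for _, w in votes})

# Initialise with an unweighted majority vote per item.
labels = {}
for it in items:
    vs = [l for (i, w), l in votes.items() if i == it]
    labels[it] = int(sum(vs) * 2 >= len(vs))

for _ in range(5):
    # M-step: estimate each worker's accuracy against the current labels (smoothed).
    acc = {}
    for w in workers:
        pairs = [(l, labels[i]) for (i, ww), l in votes.items() if ww == w]
        correct = sum(1 for l, t in pairs if l == t)
        acc[w] = min(max((correct + 1) / (len(pairs) + 2), 0.01), 0.99)
    # E-step: re-label each item with accuracy-weighted (log-odds) votes.
    for it in items:
        score = 0.0
        for (i, w), l in votes.items():
            if i == it:
                weight = math.log(acc[w] / (1 - acc[w]))
                score += weight if l == 1 else -weight
        labels[it] = int(score >= 0)

print(labels, {w: round(a, 2) for w, a in acc.items()})
```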
A Labeling Task Design for Supporting Algorithmic Needs: Facilitating Worker Diversity and Reducing AI Bias
Studies on supervised machine learning (ML) recommend involving workers from
various backgrounds in training dataset labeling to reduce algorithmic bias.
Moreover, sophisticated tasks for categorizing objects in images are necessary
to improve ML performance, further complicating micro-tasks. This study aims to
develop a task design incorporating the fair participation of people,
regardless of their specific backgrounds or the task's difficulty. By collaborating
with 75 labelers from diverse backgrounds for 3 months, we analyzed workers'
log-data and relevant narratives to identify the task's hurdles and helpers.
The findings revealed that workers' decision-making tendencies varied depending
on their backgrounds. We found that a community that actively supports workers,
together with machine feedback perceived by workers, could help people engage
more easily in the work; as a result, ML bias could be expected to be mitigated.
Based on these findings, we suggest an extended human-in-the-loop approach that
connects labelers, machines, and communities rather than isolating individual
workers.
Comment: 45 pages, 4 figure
Learning-Based Procedural Content Generation
Procedural content generation (PCG) has recently become one of the hottest
topics in computational intelligence and AI game research. Among a variety of
PCG techniques, search-based approaches (SBPCG) overwhelmingly dominate PCG
development at present. While SBPCG leads to promising results and successful applications,
it poses a number of challenges ranging from representation to evaluation of
the content being generated. In this paper, we present an alternative yet
generic PCG framework, named learning-based procedural content generation
(LBPCG), to provide potential solutions to several challenging problems in
existing PCG techniques. By exploring and exploiting information gained in game
development and public beta test via data-driven learning, our framework can
generate robust content adaptable to end-users or target players online with
minimal interruption to their experience. Furthermore, we develop enabling
techniques to implement the various models required in our framework. For a
proof of concept, we have developed a prototype based on the classic open
source first-person shooter game, Quake. Simulation results suggest that our
framework is promising in generating quality content.
Comment: 13 pages, 9 figures, manuscript submitted to IEEE Transactions on
Computational Intelligence and AI Games (also a technical report, School of
Computer Science, The University of Manchester)
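The abstract describes the framework only at a high level. The sketch below illustrates the general idea of data-driven content filtering: a classifier trained on developer/beta-test judgements scores candidate content parameter vectors, and only well-scored candidates are served to the player. The features, labels, and model are invented for illustration and are not the paper's implementation.

```python
# Illustrative sketch of data-driven content filtering in the spirit of LBPCG:
# a classifier trained on (hypothetical) beta-test judgements filters candidate
# level-parameter vectors before they reach the player. All data is invented.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

# Each candidate level is a parameter vector, e.g. [enemy_density, item_density,
# corridor_ratio, lighting]; labels come from beta testers (1 = fun, 0 = not fun).
beta_levels = rng.random((300, 4))
beta_labels = (beta_levels[:, 0] < 0.7).astype(int) & (beta_levels[:, 1] > 0.2).astype(int)

quality_model = RandomForestClassifier(n_estimators=100, random_state=0)
quality_model.fit(beta_levels, beta_labels)

# At run time, generate candidates and keep only those the model rates highly.
candidates = rng.random((50, 4))
scores = quality_model.predict_proba(candidates)[:, 1]
accepted = candidates[scores > 0.8]
print(f"accepted {len(accepted)} of {len(candidates)} candidate levels")
```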