Reliable Crowdsourcing for Multi-Class Labeling using Coding Theory
Crowdsourcing systems often have crowd workers that perform unreliable work
on the task they are assigned. In this paper, we propose the use of
error-control codes and decoding algorithms to design crowdsourcing systems for
reliable classification despite unreliable crowd workers. Coding-theory based
techniques also allow us to pose easy-to-answer binary questions to the crowd
workers. We consider three different crowdsourcing models: systems with
independent crowd workers, systems with peer-dependent reward schemes, and
systems where workers have common sources of information. For each of these
models, we analyze classification performance with the proposed coding-based
scheme. We develop an ordering principle for the quality of crowds and describe
how system performance changes with the quality of the crowd. We also show that
pairing among workers and diversification of the questions help in improving
system performance. We demonstrate the effectiveness of the proposed
coding-based scheme using both simulated data and real datasets from Amazon
Mechanical Turk, a crowdsourcing microtask platform. Results suggest that use
of good codes may improve the performance of the crowdsourcing task over
typical majority-voting approaches.
Comment: 20 pages, 11 figures, under revision, IEEE Journal of Selected Topics
  in Signal Processin
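The coding-based idea in the abstract above can be illustrated with a minimal sketch. It assumes each class is assigned a binary codeword, each worker answers one binary question (one column of the code matrix), and the final label is chosen by minimum Hamming distance. The code matrix `CODE` and the function name `decode_min_hamming` are hypothetical choices for illustration, not the codes or decoder designed in the paper.

```python
import numpy as np

# Hypothetical code matrix: 4 classes (rows), 7 binary questions (columns).
# Any two rows differ in 4 positions, so one wrong answer can be corrected.
CODE = np.array([
    [0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1, 0],
])

def decode_min_hamming(code_matrix, answers):
    """Return the index of the class whose codeword is closest to the answers."""
    code_matrix = np.asarray(code_matrix)
    answers = np.asarray(answers)
    # Count disagreements between the answer vector and every codeword.
    distances = np.sum(code_matrix != answers, axis=1)
    return int(np.argmin(distances))

# True class is 2; the first worker answers incorrectly.
noisy_answers = [1, 1, 1, 0, 0, 1, 1]
decoded = decode_min_hamming(CODE, noisy_answers)
```

In this toy setup a single unreliable answer does not change the decoded class, because every pair of codewords is separated by more than twice that many positions.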
Multi-object Classification via Crowdsourcing with a Reject Option
Consider designing an effective crowdsourcing system for an $M$-ary
classification task. Crowd workers complete simple binary microtasks whose
results are aggregated to give the final result. We consider the novel scenario
where workers have a reject option so they may skip microtasks when they are
unable or choose not to respond. For example, in mismatched speech
transcription, workers who do not know the language may not be able to respond
to microtasks focused on phonological dimensions outside their categorical
perception. We present an aggregation approach using a weighted majority voting
rule, where each worker's response is assigned an optimized weight to maximize
the crowd's classification performance. We evaluate system performance in both
exact and asymptotic forms. Further, we consider the setting where there may be
a set of greedy workers that complete microtasks even when they are unable to
perform them reliably. We consider an oblivious and an expurgation strategy to
deal with greedy workers, developing an algorithm to adaptively switch between
the two based on the estimated fraction of greedy workers in the anonymous
crowd. Simulation results show improved performance compared with conventional
majority voting.
Comment: two column, 15 pages, 8 figures, submitted to IEEE Trans. Signal
  Proces
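A weighted majority vote with a reject option, as described in the abstract above, can be sketched as follows. The sketch assumes each response to a binary microtask is encoded as +1, -1, or 0 for a skipped microtask, and that ties are broken toward +1. The weights shown are arbitrary illustrative numbers, not the optimized weights derived in the paper, and `weighted_vote` is a hypothetical name.

```python
def weighted_vote(responses, weights):
    """Aggregate binary responses (+1 / -1, with 0 meaning 'skipped').

    A skipped microtask contributes nothing to the score.
    Ties are broken toward +1 (an assumption of this sketch).
    """
    if len(responses) != len(weights):
        raise ValueError("responses and weights must have the same length")
    score = sum(w * r for r, w in zip(responses, weights))
    return 1 if score >= 0 else -1

# Four workers; the last one uses the reject option.
responses = [1, -1, -1, 0]
weights = [3.0, 1.0, 1.0, 5.0]  # illustrative values only

weighted_result = weighted_vote(responses, weights)
unweighted_result = weighted_vote(responses, [1.0] * len(responses))
```

Here the weighted rule follows the single heavily weighted worker, while the unweighted rule follows the two-to-one majority among those who responded.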
The Non-Regular CEO Problem
We consider the CEO problem for non-regular source distributions (such as
uniform or truncated Gaussian). A group of agents observe independently
corrupted versions of data and transmit coded versions over rate-limited links
to a CEO. The CEO then estimates the underlying data based on the received
coded observations. Agents are not allowed to convene before transmitting their
observations. This formulation is motivated by the practical problem of a
firm's CEO estimating (non-regular) beliefs about a sequence of events, before
acting on them. Agents' observations are modeled as jointly distributed with
the underlying data through a given conditional probability density function.
We study the asymptotic behavior of the minimum achievable mean squared error
distortion at the CEO in the limit when the number of agents $L$ and the sum
rate $R_{\mathrm{sum}}$ tend to infinity. We establish a $1/R_{\mathrm{sum}}^2$ convergence of the
distortion, an intermediate regime of performance between the exponential
behavior in discrete CEO problems [Berger, Zhang, and Viswanathan (1996)], and
the $1/R_{\mathrm{sum}}$ behavior in Gaussian CEO problems [Viswanathan and Berger (1997)].
Achievability is proved by a layered architecture with scalar quantization,
distributed entropy coding, and midrange estimation. The converse is proved
using the Bayesian Chazan-Zakai-Ziv bound.
Comment: 18 pages, 1 figur
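The midrange estimator named in the abstract above is, in its textbook form, the average of the smallest and largest observation. The sketch below shows only that textbook form on raw observations. The paper's scheme applies it after scalar quantization and distributed entropy coding, which this sketch does not attempt to reproduce, and `midrange` is a hypothetical helper name.

```python
def midrange(observations):
    """Return the midpoint between the smallest and largest observation."""
    values = list(observations)
    if not values:
        raise ValueError("at least one observation is required")
    return (min(values) + max(values)) / 2

# Observations of an unknown value corrupted by bounded noise.
samples = [2.0, 3.5, 5.0, 4.0]
estimate = midrange(samples)
```

For noise with bounded support, such as uniform noise, the extreme observations cluster near the edges of the support, which is why a midrange-style estimate is a natural fit for non-regular distributions.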
