113 research outputs found

    On optimality of jury selection in crowdsourcing

    Recent advances in crowdsourcing technologies enable computationally challenging tasks (e.g., sentiment analysis and entity resolution) to be performed by Internet workers, driven mainly by monetary incentives. A fundamental question is: how should workers be selected so that the tasks at hand can be accomplished successfully and economically? In this paper, we study the Jury Selection Problem (JSP): given a monetary budget and a set of decision-making tasks (e.g., “Is Bill Gates still the CEO of Microsoft now?”), return the set of workers (called a jury) whose answers yield the highest “Jury Quality” (or JQ). Existing JSP solutions use the Majority Voting (MV) strategy, which selects the answer chosen by the largest number of workers. We show that MV does not yield the best solution for JSP. We further prove that, among all voting strategies (including deterministic and randomized strategies), Bayesian Voting (BV) solves JSP optimally. We then examine how to solve JSP based on BV. This is technically challenging, since computing the JQ under BV is NP-hard. We address this by proposing an approximate algorithm that is computationally efficient. Our approximate JQ computation algorithm is also highly accurate, with an error proved to be bounded within 1%. We extend our solution by considering the task owner’s “belief” (or prior) on the answers of the tasks. Experiments on synthetic and real datasets show that our new approach is consistently better than the best known JSP solution.

    Provably Secure Decisions based on Potentially Malicious Information

    There are various security-critical decisions routinely made on the basis of information provided by peers: routing messages, user reports, sensor data, navigational information, blockchain updates, etc. Jury theorems were proposed in sociology to make decisions based on information from peers, assuming peers may be mistaken with some probability. We focus on attackers in a system, which manifest as peers that strategically report fake information to manipulate decision making. We define the property of robustness: a lower-bound probability of deciding correctly, regardless of what information attackers provide. When peers are independently selected, we propose an optimal, robust decision mechanism called Most Probable Realisation (MPR). When peer collusion affects source selection, we prove that it is generally NP-hard to find an optimal decision scheme. We propose multiple heuristic decision schemes that can achieve optimality for some collusion scenarios.
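MPR itself is not specified in the abstract. As a hedged baseline only (the function name and setup are illustrative, not the paper's mechanism), here is how the robustness guarantee of plain majority voting could be computed under the stated threat model: up to f attackers who always report the wrong answer, honest peers independently correct with probability p:

```python
from math import comb

def majority_robustness(n, f, p):
    """Worst-case probability that a simple majority vote over n peers
    decides correctly when up to f of them are attackers who always vote
    wrong and the remaining n - f honest peers are independently correct
    with probability p. (Illustrative baseline, not the paper's MPR.)"""
    honest = n - f
    # Correct iff honest-correct votes strictly exceed half of all n votes,
    # since in the worst case every other vote opposes the truth.
    need = n // 2 + 1
    return sum(comb(honest, h) * p**h * (1 - p)**(honest - h)
               for h in range(need, honest + 1))
```

With 3 peers of accuracy 0.8 and no attackers the guarantee is 0.896; a single attacker drops it to 0.64, since both remaining honest peers must now be right.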

    On the evaluation and selection of classifier learning algorithms with crowdsourced data

    In many current problems, the actual class of the instances, the ground truth, is unavailable. Instead, with the intention of learning a model, the labels can be crowdsourced by harvesting them from different annotators. In this work, we focus on those problems that are binary classification problems. Specifically, our main objective is to explore the evaluation and selection of models through the quantitative assessment of the goodness of evaluation methods capable of dealing with this kind of context. That is a key task for the selection of evaluation methods capable of performing a sensible model selection. Regarding the evaluation and selection of models in such contexts, we identify three general approaches, each based on a different interpretation of the nature of the underlying ground truth: deterministic, subjectivist, or probabilistic. For the analysis of these three approaches, we propose how to estimate the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve within each interpretation, thus deriving three evaluation methods. These methods are compared in extensive experimentation whose empirical results show that the probabilistic method generally outperforms the other two, as a result of which we conclude that it is advisable to use that method when performing the evaluation in such contexts. In further studies, it would be interesting to extend our research to multiclass classification problems.
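As an illustration of the probabilistic interpretation (a sketch under my own assumptions, not necessarily the authors' exact estimator), the pairwise formulation of AUC can be weighted by soft labels, e.g. the fraction of annotators marking each instance positive:

```python
def soft_label_auc(scores, pos_probs):
    """AUC under a probabilistic ground truth: instance i is treated as
    positive with probability pos_probs[i]. Each ordered pair (i, j)
    contributes with weight pos_probs[i] * (1 - pos_probs[j]), i.e. the
    probability that i is the positive and j the negative of the pair;
    tied scores count half, as in the usual pairwise AUC."""
    num = den = 0.0
    for si, pi in zip(scores, pos_probs):
        for sj, pj in zip(scores, pos_probs):
            w = pi * (1 - pj)
            den += w
            if si > sj:
                num += w
            elif si == sj:
                num += 0.5 * w
    return num / den if den else 0.5
```

When all pos_probs are 0 or 1 this reduces to the standard (deterministic) AUC, so the three interpretations can be compared on the same footing.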

    Idea Contests: How to Design Feedback for an Efficient Contest

    Inviting the public or a targeted group of individuals to submit their ideas or solutions to a specific problem or challenge within a predefined period of time is called an “idea contest.” Idea contests are a straightforward mechanism to solicit and leverage the innovation and intelligence of thousands of individuals. With the advent of the Internet, companies can easily organize idea contests with easy access for anyone to participate from anywhere in the world. A contest organizer needs to design a contest so that more individuals are encouraged to participate, generate more innovative ideas/solutions, and remain active throughout the contest. In my dissertation, I explore the effects of idea contest parameters (such as award size and structure, contest duration, the visibility of submissions, and feedback) on the participation, motivation, and performance of individuals before and after joining a contest. Feedback, as the primary focus of my dissertation, is a less studied parameter in the context of idea contests. In my first essay, I investigate the relative importance of each contest design parameter, particularly feedback, in motivating individuals to participate in a contest. To this end, I both ran a conjoint study among real designers and collected online data from the 99designs website. Feedback plays an important role in increasing the likelihood of participation and the participation rate for an idea contest. In the second essay, I explore the effect of two different types of feedback (absolute vs. relative) on the performance of participants during an idea contest. By running a real contest with participants from a major public university, I measured how participants in an idea contest react to different types of feedback. The likelihood of revising ideas as well as the quality of ideas submitted were the primary dependent variables in this field experiment.

    DOCS: Domain-Aware Crowdsourcing System


    Minimizing efforts in validating crowd answers

    In recent years, crowdsourcing has become essential in a wide range of Web applications. One of the biggest challenges of crowdsourcing is the quality of crowd answers, as workers have wide-ranging levels of expertise and the worker community may contain faulty workers. Although various techniques for quality control have been proposed, a post-processing phase in which crowd answers are validated is still required. Validation is typically conducted by experts, whose availability is limited and who incur high costs. Therefore, we develop a probabilistic model that helps to identify the most beneficial validation questions in terms of both improvement of result correctness and detection of faulty workers. Our approach allows us to guide the experts’ work by collecting input on the most problematic cases, thereby achieving a set of high-quality answers even if the expert does not validate the complete answer set. Our comprehensive evaluation using both real-world and synthetic datasets demonstrates that our techniques save up to 50% of expert efforts compared to baseline methods when striving for perfect result correctness. In absolute terms, for most cases, we achieve close to perfect correctness after expert input has been sought for only 20% of the questions.
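The paper's probabilistic model is not detailed in the abstract; a toy, uncertainty-only criterion for choosing which crowd answers an expert should validate first might look like this (the function name and the criterion are illustrative assumptions, since the paper's model additionally weighs detection of faulty workers):

```python
def pick_validation_questions(posteriors, k):
    """Greedy sketch: validate the k questions whose aggregated crowd
    answer the model is least sure about. `posteriors` maps question
    id -> estimated probability that the current crowd-derived answer
    is correct; lowest-confidence questions are validated first."""
    return sorted(posteriors, key=lambda q: posteriors[q])[:k]
```

With a realistic model, expert input on these low-confidence cases both fixes the answers most likely to be wrong and yields the most evidence about which workers are faulty.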