17 research outputs found

Time-Sensitive Bayesian Information Aggregation for Crowdsourcing Systems

Authors: Guiver John; Jennings Nick; Kohli Pushmeet; Venanzi Matteo
Publication date: 01/01/2016

Crowdsourcing systems commonly face the problem of aggregating multiple judgments provided by potentially unreliable workers. In addition, several aspects of the design of efficient crowdsourcing processes, such as defining workers' bonuses, fair prices and time limits of the tasks, involve knowledge of the likely duration of the task at hand. Bringing this together, in this work we introduce a new time-sensitive Bayesian aggregation method that simultaneously estimates a task's duration and obtains reliable aggregations of crowdsourced judgments. Our method, called BCCTime, builds on the key insight that the time taken by a worker to perform a task is an important indicator of the likely quality of the produced judgment. To capture this, BCCTime uses latent variables to represent the uncertainty about the workers' completion time, the tasks' duration and the workers' accuracy. To relate the quality of a judgment to the time a worker spends on a task, our model assumes that each task is completed within a latent time window within which all workers with a propensity to genuinely attempt the labelling task (i.e., non-spammers) are expected to submit their judgments. In contrast, workers with a lower propensity to valid labeling, such as spammers, bots or lazy labelers, are assumed to perform tasks considerably faster or slower than the time required by normal workers. Specifically, we use efficient message-passing Bayesian inference to learn approximate posterior probabilities of (i) the confusion matrix of each worker, (ii) the propensity to valid labeling of each worker, (iii) the unbiased duration of each task and (iv) the true label of each task. Using two real-world public datasets for entity linking tasks, we show that BCCTime produces up to 11% more accurate classifications and up to 100% more informative estimates of a task's duration compared to state-of-the-art methods.

arXiv.org e-Print Archive

Southampton (e-Prints Soton)

Spiral - Imperial College Digital Repository

Learning from Crowds by Modeling Common Confusions

Authors: Chu Zhendong; Ma Jing; Wang Hongning
Publication date: 18/05/2021

Crowdsourcing provides a practical way to obtain large amounts of labeled data at a low cost. However, annotation quality varies considerably across annotators, which imposes new challenges in learning a high-quality model from the crowdsourced annotations. In this work, we provide a new perspective to decompose annotation noise into common noise and individual noise and differentiate the source of confusion based on instance difficulty and annotator expertise on a per-instance-annotator basis. We realize this new crowdsourcing model by an end-to-end learning solution with two types of noise adaptation layers: one is shared across annotators to capture their commonly shared confusions, and the other is specific to each annotator to capture individual confusion. To recognize the source of noise in each annotation, we use an auxiliary network to choose between the two noise adaptation layers with respect to both instances and annotators. Extensive experiments on both synthesized and real-world benchmarks demonstrate the effectiveness of our proposed common noise adaptation solution. Comment: Accepted by AAAI 202

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Improving fairness in machine learning systems: What do industry practitioners need?

Authors: ACM; Agarwal Alekh; Attenberg Josh; Barocas Solon; Binns Reuben; Bolukbasi Tolga; Bosch Nigel; Buolamwini Joy; Chouldechova Alexandra; DSSG; Green Ben; Kamar Ece; Kilbertus Niki; Kleinberg Jon; Kusner Matt J; Lakkaraju Himabindu; Liu Anqi; Liu Hugo; Liu Lydia T; Lyu Lingyu; Maclellan Christopher J; Nushi Besmira; Raghavan Manish; Sculley D.; Springer Aaron; Vaughan Jennifer Wortman; Yang Qian; Zhao Zian
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 07/01/2019

The potential for machine learning (ML) systems to amplify social inequities and unfairness is receiving increasing popular and academic attention. A surge of recent work has focused on the development of algorithmic tools to assess and mitigate such unfairness. If these tools are to have a positive impact on industry practice, however, it is crucial that their design be informed by an understanding of real-world needs. Through 35 semi-structured interviews and an anonymous survey of 267 ML practitioners, we conduct the first systematic investigation of commercial product teams' challenges and needs for support in developing fairer ML systems. We identify areas of alignment and disconnect between the challenges faced by industry practitioners and solutions proposed in the fair ML research literature. Based on these findings, we highlight directions for future ML and HCI research that will better address industry practitioners' needs. Comment: To appear in the 2019 ACM CHI Conference on Human Factors in Computing Systems (CHI 2019)

arXiv.org e-Print Archive

Crossref

Comparing Bayesian Models of Annotation

Authors: Carpenter Bob; Chamberlain JD; Hovy Dirk; Kruschwitz Udo; Paun Silviu; Poesio Massimo
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2018

The analysis of crowdsourced annotations in NLP is concerned with identifying 1) gold standard labels, 2) annotator accuracies and biases, and 3) item difficulties and error patterns. Traditionally, majority voting was used for 1), and coefficients of agreement for 2) and 3). Lately, model-based analyses of corpus annotations have proven better at all three tasks. But there has been relatively little work comparing them on the same datasets. This paper aims to fill this gap by analyzing six models of annotation, covering different approaches to annotator ability, item difficulty, and parameter pooling (tying) across annotators and items. We evaluate these models along four aspects: comparison to gold labels, predictive accuracy for new annotations, annotator characterization, and item difficulty, using four datasets with varying degrees of noise in the form of random (spammy) annotators. We conclude with guidelines for model selection, application, and implementation.

University of Essex Research Repository

University of Regensburg Publication Server

Archivio istituzionale della Ricerca - Bocconi

Crossref

Queen Mary Research Online

Bias in data-driven artificial intelligence systems—An introductory survey

Authors: Alani H.; Berendt B.; Broelemann K.; Fafalios P.; Fernandez M.; Gadiraju U.; Heinze C.; Iosifidis V.; Karimi F.; Kasneci G.; Kinder-Kurlanda K.; Kompatsiaris I.; Krasanakis E.; Kruegel T.; Nejdl W.; Ntoutsi E.; Papadopoulos S.; Ruggieri S.; Staab S.; Tiropanis T.; Turini F.; Vidal M.-E.; Wagner C.
Publication venue: 'Wiley'
Publication date: 01/01/2020

Artificial Intelligence (AI)-based systems are widely employed nowadays to make decisions that have far-reaching impact on individuals and society. Their decisions might affect everyone, everywhere, and anytime, entailing concerns about potential human rights issues. Therefore, it is necessary to move beyond traditional AI algorithms optimized for predictive performance and embed ethical and legal principles in their design, training, and deployment to ensure social good while still benefiting from the huge potential of the AI technology. The goal of this survey is to provide a broad multidisciplinary overview of the area of bias in AI systems, focusing on technical challenges and solutions, as well as to suggest new research directions towards approaches well-grounded in a legal frame. In this survey, we focus on data-driven AI, as a large part of AI is powered nowadays by (big) data and powerful machine learning algorithms. Unless otherwise specified, we use the general term bias to describe problems related to the gathering or processing of data that might result in prejudiced decisions on the basis of demographic features such as race, sex, and so forth. This article is categorized under: Commercial, Legal, and Ethical Issues > Fairness in Data Mining; Commercial, Legal, and Ethical Issues > Ethical Considerations; Commercial, Legal, and Ethical Issues > Legal Issues.

Lirias

Archivio della Ricerca - Università di Pisa

Bias in data-driven artificial intelligence systems - An introductory survey

DepositOnce

Southampton (e-Prints Soton)

ZENODO

Open Research Online (The Open University)

Repositorium für Naturwissenschaften und Technik

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Institutionelles Repositorium der Leibniz Universität Hannover

Bias in data-driven artificial intelligence systems - An introductory survey

Authors: Alani H.; Berendt B.; Broelemann K.; Fafalios P.; Fernandez M.; Gadiraju U.; Heinze C.; Iosifidis V.; Karimi F.; Kasneci G.; Kinder-Kurlanda K.; Kompatsiaris I.; Krasanakis E.; Kruegel T.; Nejdl W.; Ntoutsi E.; Papadopoulos S.; Ruggieri S.; Staab S.; Tiropanis T.; Turini F.; Vidal Maria-Esther; Wagner C.
Publication venue: Hoboken, NJ : Wiley-Blackwell
Publication date: 01/01/2020

Repositorium für Naturwissenschaften und Technik