On the use of the descriptive variable for enhancing the aggregation of crowdsourced labels

Beñaran-Muñoz, Iker; Hernández-González, Jerónimo; Pérez, Aritz

Authors: Iker Beñaran-Muñoz, Jerónimo Hernández-González, Aritz Pérez
Publication date: 1 January 2022
Publisher: Springer Science and Business Media LLC

Abstract

The use of crowdsourcing for annotating data has become a popular and cheap alternative to expert labelling. As a consequence, an aggregation task is required to combine the different labels provided and agree on a single one per example. Most aggregation techniques, including the simple and robust majority voting (which selects the label with the largest number of votes), disregard the descriptive information provided by the explanatory variable. In this paper, we propose domain-aware voting, an extension of majority voting which incorporates the descriptive variable and the rest of the instances of the dataset for aggregating the label of every instance. The experimental results with simulated and real-world crowdsourced data suggest that domain-aware voting is a competitive alternative to majority voting, especially when a part of the dataset is unlabelled. We elaborate on practical criteria for the use of domain-aware voting.
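
As a point of reference, the majority-voting baseline mentioned in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch of plain majority voting, not of domain-aware voting, whose details are not given in this record. The function name `majority_vote`, the example data, and the tie-breaking rule (the first label seen wins) are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label with the largest number of votes.

    Assumption: ties are broken in favour of the label seen first,
    which is how Counter.most_common orders equal counts.
    """
    if not labels:
        raise ValueError("no labels to aggregate")
    return Counter(labels).most_common(1)[0][0]


# Hypothetical crowdsourced annotations: one list of worker labels per example.
annotations = {
    "example_1": ["cat", "dog", "cat"],
    "example_2": ["dog", "dog", "cat", "dog"],
}

# Aggregate each example independently, ignoring its descriptive variable.
aggregated = {ex: majority_vote(votes) for ex, votes in annotations.items()}
```

Note that each example is aggregated in isolation here: neither its features nor the labels of other examples are consulted, which is the limitation the abstract attributes to majority voting.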

Available Versions

BCAM's Institutional Repository Data

oai:bird.bcamath.org:20.500.11...

Last time updated on 15/11/2023

Diposit Digital de la Universitat de Barcelona

oai:diposit.ub.edu:2445/189541

Last time updated on 06/10/2022