4 research outputs found

Reddit dataset for Adverse Drug Reaction

Author: Bozzon A. (Alessandro)
Houben G.J. (Geert-Jan)
Lofi C. (Christoph)
Mesbah S. (Sepideh)
Sips R.J. (Robert-Jan)
Valle Torre M. (Manuel)
Yang J. (Jie)
Publication venue: 4TU.Centre for Research Data

Reddit dataset for Adverse Drug Reaction (ADR) detection, created with the help of expert annotators.

Characterising and Mitigating Aggregation-Bias in Crowdsourced Toxicity Annotations

Author: Aroyo Lora
Balayn A.M.A.
Bozzon A.
Checco Alessandro
Demartini Gianluca
Dumitrache Anca
Gadiraju Ujwal
Mavridis P.
Paritosh Praveen
Quinn Alex
Sarasua Cristina
Szlávik Z.
Timmermans B.F.L.
Welty Chris
Publication venue: CEUR
Publication date: 01/01/2018

Training machine learning (ML) models for natural language processing usually requires large amounts of data, often acquired through crowdsourcing. The way this data is collected and aggregated can affect the outputs of the trained model, for example by ignoring labels that differ from the majority. In this paper we investigate how label aggregation can bias ML results towards certain data samples, and we propose a methodology to highlight and mitigate this bias. Although our work is applicable to any kind of label aggregation for data subject to multiple interpretations, we focus on the effects of the bias introduced by majority voting on toxicity prediction over sentences. Our preliminary results indicate that we can mitigate the majority bias and increase prediction accuracy for minority opinions if we take into account the different labels from annotators when training adapted models, rather than relying on the aggregated labels.
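As a rough illustration of the aggregation effect the abstract describes, the sketch below contrasts majority voting with keeping the per-annotator disagreement. It is not the paper's methodology: the toy annotations, the function names, and the use of a "soft label" (the fraction of annotators choosing the toxic label) are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical toy annotations: 1 = toxic, 0 = not toxic, one entry per annotator.
annotations = {
    "sentence_a": [1, 1, 0],
    "sentence_b": [0, 0, 1],
    "sentence_c": [0, 0, 0],
}


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def soft_label(labels):
    """Return the fraction of annotators who chose the toxic label (1)."""
    return sum(labels) / len(labels)


# Majority voting collapses each sentence to a single label,
# so the dissenting annotation on sentence_a and sentence_b is discarded.
aggregated = {s: majority_vote(l) for s, l in annotations.items()}

# Soft labels (an assumption here, one of several possible choices)
# keep the minority opinion visible as a value strictly between 0 and 1.
disaggregated = {s: soft_label(l) for s, l in annotations.items()}

for sentence in annotations:
    print(sentence, aggregated[sentence], round(disaggregated[sentence], 2))
```

After majority voting, sentence_b and sentence_c carry the same label even though one annotator judged sentence_b toxic; the soft labels still distinguish the two.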

A Human in the Loop Approach to Capture Bias and Support Media Scientists in News Video Analysis

Author: Aroyo Lora
Badenoch Alec
Bozzon A.
Checco Alessandro
de Jong M.
Demartini Gianluca
Dimitrova Antoaneta
Dumitrache Anca
Gadiraju Ujwal
Mavridis P.
Oomen Johan
Paritosh Praveen
Quinn Alex
Sarasua Cristina
Vos Jesse de
Welty Chris
Publication venue: CEUR
Publication date: 01/01/2018

Bias is inevitable and inherent in any form of communication. News often appears biased to citizens with different political orientations, and is understood differently by news media scholars and the broader public. In this paper we advocate the need for accurate methods for bias identification in video news items, to enable rich analytics capabilities that assist humanities media scholars and social political scientists. We propose to analyze biases that are typical in video news (including framing, gender and racial biases) by means of a human-in-the-loop approach that combines text and image analysis with human computation techniques.

Estimate Sentiment of Crowds from Social Media during City Events

Author: Aiken M.
Alessandro Bozzon
Baccianella S.
Blitzer J.
Cranshaw J.
Earl C.
Gilbert C. H. E.
Hasan S.
Hu M.
Jiang L.
Kouloumpis E.
Maas A. L.
Martin A.
Montoyo A.
Murphy K. P.
Nabil M.
Pang B.
Poria S.
Pfitzner R.
Quercia D.
Saif H.
Serge P. Hoogendoorn
Still G. K.
Thelwall M.
Vincent X. Gong
Winnie Daamen
Publication venue: SAGE Publications