Search CORE

195 research outputs found

Classifying Tweet Level Judgements of Rumours in Social Media

Author: Bontcheva Kalina
Cohn Trevor
Lukasik Michal
Publication venue
Publication date: 01/01/2015
Field of study

Social media is a rich source of rumours and corresponding community reactions. Rumours reflect different characteristics, some shared and some individual. We formulate the problem of classifying tweet level judgements of rumours as a supervised learning task. Both supervised and unsupervised domain adaptation are considered, in which tweets from a rumour are classified on the basis of other annotated rumours. We demonstrate how multi-task learning helps achieve good results on rumours from the 2011 England riots

arXiv.org e-Print Archive

CiteSeerX

Crossref

Social Media and Information Overload: Survey Results

Author: Bontcheva Kalina
Gorrell Genevieve
Wessels Bridgette
Publication venue
Publication date: 01/01/2013
Field of study

A UK-based online questionnaire investigating aspects of usage of user-generated media (UGM), such as Facebook, LinkedIn and Twitter, attracted 587 participants. Results show a high degree of engagement with social networking media such as Facebook, and a significant engagement with other media such as professional media, microblogs and blogs. Participants who experience information overload are those who engage less frequently with the media, rather than those who have fewer posts to read. Professional users show different behaviours to social users. Microbloggers complain of information overload to the greatest extent. Two thirds of Twitter-users have felt that they receive too many posts, and over half of Twitter-users have felt the need for a tool to filter out the irrelevant posts. Generally speaking, participants express satisfaction with the media, though a significant minority express a range of concerns including information overload and privacy

arXiv.org e-Print Archive

Enlighten

Classifying COVID-19 vaccine narratives

Author: Bontcheva Kalina
Li Yue
Scarton Carolina
Song Xingyi
Publication venue
Publication date: 17/11/2023
Field of study

Vaccine hesitancy is widespread, despite the government's information campaigns and the efforts of the World Health Organisation (WHO). Categorising the topics within vaccine-related narratives is crucial to understand the concerns expressed in discussions and identify the specific issues that contribute to vaccine hesitancy. This paper addresses the need for monitoring and analysing vaccine narratives online by introducing a novel vaccine narrative classification task, which categorises COVID-19 vaccine claims into one of seven categories. Following a data augmentation approach, we first construct a novel dataset for this new classification task, focusing on the minority classes. We also make use of fact-checker annotated data. The paper also presents a neural vaccine narrative classifier that achieves an accuracy of 84% under cross-validation. The classifier is publicly available for researchers and journalists.Comment: In Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, 202

arXiv.org e-Print Archive

Examining Temporal Bias in Abusive Language Detection

Author: Bontcheva Kalina
Jin Mali
Maynard Diana
Mu Yida
Publication venue
Publication date: 25/09/2023
Field of study

The use of abusive language online has become an increasingly pervasive problem that damages both individuals and society, with effects ranging from psychological harm right through to escalation to real-life violence and even death. Machine learning models have been developed to automatically detect abusive language, but these models can suffer from temporal bias, the phenomenon in which topics, language use or social norms change over time. This study aims to investigate the nature and impact of temporal bias in abusive language detection across various languages and explore mitigation methods. We evaluate the performance of models on abusive data sets from different time periods. Our results demonstrate that temporal bias is a significant challenge for abusive language detection, with models trained on historical data showing a significant drop in performance over time. We also present an extensive linguistic analysis of these abusive data sets from a diachronic perspective, aiming to explore the reasons for language evolution and performance decline. This study sheds light on the pervasive issue of temporal bias in abusive language detection across languages, offering crucial insights into language evolution and temporal bias mitigation

arXiv.org e-Print Archive