306 research outputs found

Natural language processing in the era of large language models.

Author: Zubiaga A
Publication venue: Frontiers
Publication date: 12/01/2024

Aggregating pairwise semantic differences for few-shot claim verification

Authors: Zeng X, Zubiaga A
Publication date: 01/01/2022

We introduce SEED, a novel vector-based method for few-shot claim verification that aggregates pairwise semantic differences for claim-evidence pairs.

PubMed Central

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Queen Mary Research Online

Capturing stance dynamics in social media: open challenges and research directions

Authors: Alkhalifa R, Zubiaga A
Publication venue: Springer Science and Business Media LLC
Publication date: 01/09/2021

Social media platforms provide a goldmine for mining public opinion on issues of wide societal interest and impact. Opinion mining is a problem that can be operationalised by capturing and aggregating the stance of individual social media posts as supporting, opposing or being neutral towards the issue at hand. While most prior work in stance detection has investigated datasets that cover short periods of time, interest in investigating longitudinal datasets has recently increased. Evolving dynamics in linguistic and behavioural patterns observed in new data require adapting stance detection systems to deal with the changes. In this survey paper, we investigate the intersection between computational linguistics and the temporal evolution of human communication in digital media. We perform a critical review of emerging research considering dynamics, exploring different semantic and pragmatic factors that impact linguistic data in general, and stance in particular. We further discuss current directions in capturing stance dynamics in social media. We discuss the challenges encountered when dealing with stance dynamics, identify open challenges and discuss future directions in three key dimensions: utterance, context and influence.

arXiv.org e-Print Archive

Queen Mary Research Online

QMUL-SDS at EXIST: Leveraging pre-trained semantics and lexical features for multilingual sexism detection in social networks

Authors: IberLEF, Jiang A, Zubiaga A
Publication date: 02/08/2021

Online sexism is an increasing concern for those who experience gender-based abuse in social media platforms as it has affected the healthy development of the Internet with negative impacts in society. The EXIST shared task proposes the first task on sEXism Identification in Social neTworks (EXIST) at IberLEF 2021 [30]. It provides a benchmark sexism dataset with Twitter and Gab posts in both English and Spanish, along with a task articulated in two subtasks consisting in sexism detection at different levels of granularity: Subtask 1 Sexism Identification is a classical binary classification task to determine whether a given text is sexist or not, while Subtask 2 Sexism Categorisation is a finer-grained classification task focused on distinguishing different types of sexism. In this paper, we describe the participation of the QMUL-SDS team in EXIST. We propose an architecture made of the last 4 hidden states of XLM-RoBERTa and a TextCNN with 3 kernels. Our model also exploits lexical features relying on the use of new and existing lexicons of abusive words, with a special focus on sexist slurs and abusive words targeting women. Our team ranked 11th in Subtask 1 and 4th in Subtask 2 among all the teams on the leaderboard, clearly outperforming the baselines offered by EXIST.

Queen Mary Research Online

Few-Shot Learning for Cross-Target Stance Detection by Aggregating Multimodal Embeddings

Authors: Khiabani PJ, Zubiaga A
Publication date: 01/01/2023

Queen Mary Research Online

QMUL-SDS @ SardiStance2020: Leveraging network interactions to boost performance on stance detection using knowledge graphs

Authors: Alkhalifa R, Zubiaga A
Publication venue: Accademia University Press
Publication date: 01/12/2020

This paper presents our submission to the SardiStance 2020 shared task, describing the architecture used for Task A and Task B. While our submission for Task A did not exceed the baseline, retraining our model on all the training tweets showed promising results, reaching an f-avg of 0.601 using a bidirectional LSTM with BERT multilingual embeddings. For Task B, our submission ranked 6th (f-avg 0.709). With further investigation, our best experimental settings increased performance from f-avg 0.573 to f-avg 0.733 with the same architecture and parameter settings, after only incorporating social interaction features, highlighting the impact of social interaction on the model's performance.

Queen Mary Research Online

Check-worthy claim detection across topics for automated fact-checking

Authors: Abumansour AS, Zubiaga A
Publication date: 01/01/2023

Queen Mary Research Online

QMUL-SDS at CheckThat! 2021: Enriching pre-trained language models for the estimation of check-worthiness of Arabic tweets

Authors: Abumansour AS, CLEF, Zubiaga A
Publication date: 02/08/2021

This paper describes our submission to the CheckThat! Lab at CLEF 2021, where we participated in Subtask 1A (check-worthy claim detection) in Arabic. We introduce our approach to estimating the check-worthiness of tweets as a ranking task. In our approach, we propose to fine-tune state-of-the-art transformer-based models for Arabic such as AraBERTv0.2-base, as well as to leverage additional training data from last year's shared task (CheckThat! Lab 2020) along with the dataset provided this year. According to the official evaluation, our submission obtained a joint 4th position in the competition, in which seven other groups participated.

Queen Mary Research Online