306 research outputs found

    Natural language processing in the era of large language models.

    Aggregating pairwise semantic differences for few-shot claim verification

    We introduce SEED, a novel vector-based method for few-shot claim verification that aggregates pairwise semantic differences for claim-evidence pairs

    Capturing stance dynamics in social media: open challenges and research directions

    Social media platforms provide a goldmine for mining public opinion on issues of wide societal interest and impact. Opinion mining is a problem that can be operationalised by capturing and aggregating the stance of individual social media posts as supporting, opposing or being neutral towards the issue at hand. While most prior work in stance detection has investigated datasets that cover short periods of time, interest in investigating longitudinal datasets has recently increased. Evolving dynamics in linguistic and behavioural patterns observed in new data require adapting stance detection systems to deal with the changes. In this survey paper, we investigate the intersection between computational linguistics and the temporal evolution of human communication in digital media. We perform a critical review of emerging research considering dynamics, exploring different semantic and pragmatic factors that impact linguistic data in general, and stance in particular. We further discuss current directions in capturing stance dynamics in social media. We discuss the challenges encountered when dealing with stance dynamics, identify open challenges and discuss future directions in three key dimensions: utterance, context and influence

    QMUL-SDS at EXIST: Leveraging pre-trained semantics and lexical features for multilingual sexism detection in social networks

    Online sexism is an increasing concern for those who experience gender-based abuse on social media platforms, as it has affected the healthy development of the Internet with negative impacts on society. The EXIST shared task proposes the first task on sEXism Identification in Social neTworks (EXIST) at IberLEF 2021 [30]. It provides a benchmark sexism dataset with Twitter and Gab posts in both English and Spanish, along with a task organised into two subtasks covering sexism detection at different levels of granularity: Subtask 1, Sexism Identification, is a classical binary classification task to determine whether a given text is sexist or not, while Subtask 2, Sexism Categorisation, is a finer-grained classification task focused on distinguishing different types of sexism. In this paper, we describe the participation of the QMUL-SDS team in EXIST. We propose an architecture that combines the last 4 hidden states of XLM-RoBERTa with a TextCNN with 3 kernels. Our model also exploits lexical features relying on new and existing lexicons of abusive words, with a special focus on sexist slurs and abusive words targeting women. Our team ranked 11th in Subtask 1 and 4th in Subtask 2 among all the teams on the leaderboard, clearly outperforming the baselines offered by EXIST

    QMUL-SDS @ SardiStance2020: Leveraging network interactions to boost performance on stance detection using knowledge graphs

    This paper presents our submission to the SardiStance 2020 shared task, describing the architecture used for Task A and Task B. While our submission for Task A did not exceed the baseline, retraining our model on all the training tweets showed promising results, reaching f-avg 0.601 with a bidirectional LSTM and BERT multilingual embeddings. For Task B, our submission ranked 6th (f-avg 0.709). With further investigation, our best experimental settings increased performance from f-avg 0.573 to f-avg 0.733 with the same architecture and parameter settings, after only incorporating social interaction features, highlighting the impact of social interaction on the model's performance

    QMUL-SDS at CheckThat! 2021: Enriching pre-trained language models for the estimation of check-worthiness of Arabic tweets

    This paper describes our submission to the CheckThat! Lab at CLEF 2021, where we participated in Subtask 1A (check-worthy claim detection) in Arabic. We treat the estimation of the check-worthiness of tweets as a ranking task. In our approach, we propose to fine-tune state-of-the-art transformer-based models for Arabic, such as AraBERTv0.2-base, and to leverage additional training data from last year's shared task (CheckThat! Lab 2020) along with the dataset provided this year. According to the official evaluation, our submission obtained a joint 4th position in the competition, where seven other groups participated