QMUL-SDS @ SardiStance: Leveraging Network Interactions to Boost Performance on Stance Detection using Knowledge Graphs
This paper presents our submission to the SardiStance 2020 shared task, describing the architecture used for Task A and Task B. While our submission for Task A did not exceed the baseline, retraining our model on all the training tweets showed promising results, reaching an f-avg of 0.601 with a bidirectional LSTM over multilingual BERT embeddings. Our submission for Task B ranked 6th (f-avg 0.709). With further investigation, our best experimental settings increased performance from an f-avg of 0.573 to 0.733 with the same architecture and parameter settings, after incorporating only social interaction features, highlighting the impact of social interaction on the model's performance.
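To make the described architecture concrete, here is a minimal sketch of a bidirectional LSTM stance classifier over multilingual BERT token embeddings. The checkpoint name, frozen encoder, hidden size, and three-class label set are illustrative assumptions, not the authors' exact settings.

import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BiLstmStanceClassifier(nn.Module):
    def __init__(self, encoder_name="bert-base-multilingual-cased",
                 hidden_size=128, num_classes=3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        for p in self.encoder.parameters():  # freeze BERT (assumption)
            p.requires_grad = False
        self.lstm = nn.LSTM(self.encoder.config.hidden_size, hidden_size,
                            batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_size, num_classes)

    def forward(self, input_ids, attention_mask):
        # Token-level contextual embeddings from multilingual BERT.
        embeddings = self.encoder(input_ids=input_ids,
                                  attention_mask=attention_mask).last_hidden_state
        lstm_out, _ = self.lstm(embeddings)
        # Mean-pool over non-padding positions, then classify.
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (lstm_out * mask).sum(1) / mask.sum(1)
        return self.classifier(pooled)

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = BiLstmStanceClassifier()
batch = tokenizer(["Il vaccino è sicuro"], return_tensors="pt",
                  padding=True, truncation=True)
logits = model(batch["input_ids"], batch["attention_mask"])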
QMUL-SDS at CheckThat! 2020: Determining COVID-19 Tweet Check-Worthiness Using an Enhanced CT-BERT with Numeric Expressions
This paper describes the participation of the QMUL-SDS team for Task 1 of the
CLEF 2020 CheckThat! shared task. The purpose of this task is to determine the
check-worthiness of tweets about COVID-19 to identify and prioritise tweets
that need fact-checking. The overarching aim is to further support ongoing
efforts to protect the public from fake news and help people find reliable
information. We describe and analyse the results of our submissions. We show
that a CNN using COVID-Twitter-BERT (CT-BERT) enhanced with numeric expressions
can effectively boost performance from baseline results. We also show results
of training data augmentation with rumours on other topics. Our best system ranked fourth in the task, with encouraging outcomes showing potential for improved results in the future.
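As a concrete illustration of the described system, below is a hedged sketch of a CNN check-worthiness classifier over CT-BERT token embeddings, with a simple count of numeric expressions appended as an extra feature. The checkpoint identifier, filter configuration, and the count-based numeric feature are assumptions for illustration, not the exact enhancement used above.

import re
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

CHECKPOINT = "digitalepidemiologylab/covid-twitter-bert"  # assumed CT-BERT id

class CnnCheckWorthiness(nn.Module):
    def __init__(self, num_filters=100, kernel_size=3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(CHECKPOINT)
        self.conv = nn.Conv1d(self.encoder.config.hidden_size,
                              num_filters, kernel_size, padding=1)
        # +1 for the scalar numeric-expression feature appended below.
        self.classifier = nn.Linear(num_filters + 1, 2)

    def forward(self, input_ids, attention_mask, numeric_feature):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        conv_out = torch.relu(self.conv(hidden.transpose(1, 2)))
        pooled = conv_out.max(dim=2).values  # global max pooling over time
        features = torch.cat([pooled, numeric_feature.unsqueeze(1)], dim=1)
        return self.classifier(features)

def numeric_expression_count(text: str) -> float:
    # Crude proxy: how many numeric expressions the tweet contains.
    return float(len(re.findall(r"\d[\d,.%]*", text)))

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
tweet = "Over 3,000 new COVID-19 cases were reported, a 12% rise."
batch = tokenizer([tweet], return_tensors="pt", truncation=True)
model = CnnCheckWorthiness()
logits = model(batch["input_ids"], batch["attention_mask"],
               torch.tensor([numeric_expression_count(tweet)]))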
Extended overview of the CLEF 2024 LongEval Lab on Longitudinal Evaluation of Model Performance
We describe the second edition of the LongEval CLEF 2024 shared task. This lab evaluates the temporal persistence of Information Retrieval (IR) systems and Text Classifiers. Task 1 requires IR systems to run on corpora acquired at several timestamps, and evaluates the drop in system quality (NDCG) along these timestamps. Task 2 tackles binary sentiment classification at different points in time, and evaluates the performance drop for different temporal gaps. Overall, 37 teams registered for Task 1 and 25 for Task 2. Ultimately, 14 and 4 teams participated in Task 1 and Task 2, respectively.
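The Task 1 protocol above reduces to scoring the same system with NDCG on corpus snapshots from different timestamps and comparing the results. A minimal sketch, with toy relevance judgements invented purely for illustration:

import math

def ndcg_at_k(ranked_relevances, k=10):
    """NDCG@k for one query, given graded relevance in ranked order."""
    dcg = sum(rel / math.log2(rank + 2)
              for rank, rel in enumerate(ranked_relevances[:k]))
    ideal = sorted(ranked_relevances, reverse=True)
    idcg = sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

# One system's rankings on two snapshots of the evolving corpus (toy data).
snapshots = {
    "t0 (train time)": [[3, 2, 0, 1], [2, 2, 1, 0]],
    "t1 (months later)": [[1, 0, 2, 0], [2, 0, 0, 1]],
}
scores = {name: sum(ndcg_at_k(q) for q in queries) / len(queries)
          for name, queries in snapshots.items()}
for name, score in scores.items():
    print(f"{name}: NDCG@10 = {score:.3f}")
drop = scores["t0 (train time)"] - scores["t1 (months later)"]
print(f"absolute NDCG drop across the temporal gap: {drop:.3f}")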
LongEval: Longitudinal Evaluation of Model Performance at CLEF 2023
In this paper, we describe the plans for the first LongEval CLEF 2023 shared task, dedicated to evaluating the temporal persistence of Information Retrieval (IR) systems and Text Classifiers. The task is motivated by recent research showing that the performance of these models drops as the test data becomes more distant in time from the training data. LongEval differs from traditional shared IR and classification tasks by giving special consideration to evaluating models that aim to mitigate performance drop over time. We envisage that this task will draw the attention of the IR community and NLP researchers to the problem of temporal persistence of models: what enables or prevents it, potential solutions, and their limitations.
Extended Overview of the CLEF-2023 LongEval Lab on Longitudinal Evaluation of Model Performance
We describe the first edition of the LongEval CLEF 2023 shared task. This lab evaluates the temporal persistence of Information Retrieval (IR) systems and Text Classifiers. Task 1 requires IR systems to run on corpora acquired at several timestamps, and evaluates the drop in system quality (NDCG) along these timestamps. Task 2 tackles binary sentiment classification at different points in time, and evaluates the performance drop for different temporal gaps. Overall, 37 teams registered for Task 1 and 25 for Task 2. Ultimately, 14 and 4 teams participated in Task 1 and Task 2, respectively.
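Task 2's evaluation can be sketched similarly: train a sentiment classifier once, score it on test sets at increasing temporal distance from the training data, and observe the drop. The model, features, and toy examples below are illustrative assumptions, not the lab's actual data or systems.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline

# Toy "within-time" training data; in LongEval this would be the dated corpus.
train_texts = ["great film", "loved it", "terrible plot", "boring and slow"]
train_labels = [1, 1, 0, 0]
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

# Test sets at increasing temporal distance from training (toy data);
# the "long gap" set uses drifted vocabulary the model never saw.
test_sets_by_gap = {
    "short gap": (["great acting", "slow and boring"], [1, 0]),
    "long gap":  (["this slaps", "mid at best"], [1, 0]),
}
for gap, (texts, gold) in test_sets_by_gap.items():
    preds = model.predict(texts)
    print(gap, "macro-F1 =", f1_score(gold, preds, average="macro"))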
EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020
Welcome to EVALITA 2020! EVALITA is the evaluation campaign of Natural Language Processing and Speech Tools for Italian. EVALITA is an initiative of the Italian Association for Computational Linguistics (AILC, http://www.ai-lc.it) and is endorsed by the Italian Association for Artificial Intelligence (AIxIA, http://www.aixia.it) and the Italian Association for Speech Sciences (AISV, http://www.aisv.it).
LongEval: Longitudinal Evaluation of Model Performance at CLEF 2024
This paper introduces the planned second LongEval Lab, part of the CLEF 2024 conference. The aim of the lab's two tasks is to give researchers test data for addressing temporal effectiveness persistence challenges in both information retrieval and text classification, motivated by the fact that model performance degrades as the test data becomes temporally distant from the training data. LongEval distinguishes itself from traditional IR and classification tasks by emphasizing the evaluation of models designed to mitigate performance drop over time using evolving data. The second LongEval edition will further engage the IR community and NLP researchers in addressing the crucial challenge of temporal persistence in models, exploring the factors that enable or hinder it, and identifying potential solutions along with their limitations.
Overview of the CLEF-2023 LongEval Lab on Longitudinal Evaluation of Model Performance