    SCUoL at CheckThat! 2021: An AraBERT model for check-worthiness of Arabic tweets

    Many people now turn to social media for news and for information about events and activities. However, misleading and false information spreads every day for many purposes, with a dramatic impact on societies. It is therefore vitally important to identify false information on social media, both to help individuals distinguish the truth and to protect communities from its harmful effects. Determining which information should be prioritized for scrutiny is thus a significant preliminary step, and one that several studies have considered. In this paper, we address Subtask-1A (Arabic) of the CLEF2021 CheckThat! Lab in two steps. In the first step, we pre-processed the provided dataset with text segmentation and tokenization. In the second step, we applied different models to the Arabic tweets to perform binary classification, labelling each tweet as worth fact-checking or not. We mainly compared two versions of the pre-trained AraBERT model with some traditional word-encoding methods, including a Linear SVC model with TF-IDF. The results indicate that AraBERTv2 outperforms the other models. Consequently, we used it for our final submission, and we were ranked third among eight other participating teams.
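The TF-IDF encoding mentioned in the abstract as part of the traditional baseline can be illustrated with a minimal sketch. This is not the authors' code: the function names `tokenize` and `tfidf_vectors`, the whitespace tokenizer, the smoothed IDF formula, and the toy placeholder strings are all assumptions made for illustration. The paper's Arabic text segmentation and the Linear SVC classifier itself are not reproduced here.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: plain whitespace tokenization. The paper's Arabic
    # segmentation step is not reproduced in this sketch.
    return text.split()


def tfidf_vectors(docs):
    """Return (vectors, idf): one L2-normalized sparse TF-IDF dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    # Assumption: smoothed IDF, log((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vec = {t: count * idf[t] for t, count in tf.items()}
        norm = math.sqrt(sum(v * v for v in vec.values()))
        # Scale to unit length so tweet length does not dominate the weights.
        vectors.append({t: v / norm for t, v in vec.items()} if norm else {})
    return vectors, idf


if __name__ == "__main__":
    # Hypothetical placeholder texts, not drawn from the CheckThat! dataset.
    toy_tweets = [
        "minister announces new vaccine figures",
        "good morning everyone",
        "new figures show vaccine doses doubled",
    ]
    vectors, idf = tfidf_vectors(toy_tweets)
    for tweet, vec in zip(toy_tweets, vectors):
        top = sorted(vec, key=vec.get, reverse=True)[:3]
        print(tweet, "->", top)
```

Vectors of this kind would then be passed to a linear classifier; terms that occur in few tweets receive larger weights than terms that occur in most of them.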