EveTAR: Building a Large-Scale Multi-Task Test Collection over Arabic Tweets

Author: A Bruns
AM Azmi
BS Wasike
D Bodoff
D Elsweiler
Hind Almerekhi
J Benhardus
JL Fleiss
JR Landis
K Darwish
M Efron
M Rowe
M Sanderson
Maram Hasanain
Mucahid Kutlu
Reem Suwaileh
RL Brennan
Tamer Elsayed
W Magdy
Zhang Y
Publication date: 21/08/2017

This article introduces a new language-independent approach for creating a large-scale high-quality test collection of tweets that supports multiple information retrieval (IR) tasks without running a shared-task campaign. The adopted approach (demonstrated over Arabic tweets) designs the collection around significant (i.e., popular) events, which enables the development of topics that represent frequent information needs of Twitter users for which rich content exists. That inherently facilitates the support of multiple tasks that generally revolve around events, namely event detection, ad-hoc search, timeline generation, and real-time summarization. The key highlights of the approach include diversifying the judgment pool via interactive search and multiple manually-crafted queries per topic, collecting high-quality annotations via crowd-workers for relevancy and in-house annotators for novelty, filtering out low-agreement topics and inaccessible tweets, and providing multiple subsets of the collection for better availability. Applying our methodology to Arabic tweets resulted in EveTAR, the first freely-available tweet test collection for multiple IR tasks. EveTAR includes a crawl of 355M Arabic tweets and covers 50 significant events for which about 62K tweets were judged with substantial average inter-annotator agreement (Kappa value of 0.71). We demonstrate the usability of EveTAR by evaluating existing algorithms in the respective tasks. Results indicate that the new collection can support reliable ranking of IR systems that is comparable to similar TREC collections, while providing strong baseline results for future studies over Arabic tweets.
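To make the reported agreement figure concrete, the following is a minimal sketch of how a kappa value can be computed for two annotators. The abstract does not say which kappa variant was used, so Cohen's kappa is an assumption here, and the function name `cohen_kappa` and the relevance labels are hypothetical illustrations rather than data from the collection.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items.

    Assumption: two annotators and nominal labels. The abstract does not
    specify the kappa variant, so this is only an illustration.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the annotators' marginal label rates.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical relevance judgments (1 = relevant, 0 = not relevant).
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
annotator_2 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
print(round(cohen_kappa(annotator_1, annotator_2), 2))
```

On the commonly used Landis and Koch scale, values between 0.61 and 0.80 are read as substantial agreement, which is the band the reported 0.71 falls into.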

arXiv.org e-Print Archive

News’ Credibility Detection on Social Media Using Machine Learning Algorithms

Author: AbdelMawgoud Sayed
Idrees Amira M., AMI
Yasser Farah
Publication venue: Arab Journals Platform
Publication date: 18/07/2023

Social media is essential in many aspects of our lives. It allows us to find news for free, and anyone can access it easily at any time. However, social media may also facilitate the rapid spread of misleading news. As a result, low-quality news, including incorrect and fake information, is likely to spread over social media. Detecting news credibility on social media has therefore become essential, because fake news can affect society negatively and the spread of false news has a considerable impact on personal reputation and public trust. In this research, we built a model that detects the credibility of Arabic news from social media, particularly Arabic tweets. The content of the tweets revolves around the COVID-19 pandemic. The proposed model detects news credibility using text mining techniques and one of the well-known machine learning classifiers, the decision tree, which has the best accuracy, equal to 86.6

TA-COS 2018 : 2nd Workshop on Text Analytics for Cybersecurity and Online Safety : Proceedings

Author: De Pauw Guy
Desmet Bart
Lefever Els
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/01/2018

Critical Impact of Social Networks Infodemic on Defeating Coronavirus COVID-19 Pandemic: Twitter-Based Study and Research Directions

Author: Arafeh Mohamad
Harmanani Haidar
Jenainatiy Cathia
Mourad Azzam
Srour Ali
Publication date: 18/05/2020

News creation and consumption have been changing since the advent of social media. An estimated 2.95 billion people worldwide used social media in 2019. The wide spread of the Coronavirus COVID-19 resulted in a tsunami of social media activity. Most platforms were used to transmit relevant news, guidelines and precautions to people. According to the WHO, uncontrolled conspiracy theories and propaganda are spreading faster than the COVID-19 pandemic itself, creating an infodemic and thus causing psychological panic, misleading medical advice, and economic disruption. Accordingly, discussions have been initiated with the objective of moderating all COVID-19 communications, except those initiated from trusted sources such as the WHO and authorized governmental entities. This paper presents a large-scale study based on data mined from Twitter. Extensive analysis has been performed on approximately one million COVID-19 related tweets collected over a period of two months. Furthermore, the profiles of 288,000 users were analyzed, including unique user profiles, meta-data and tweet context. The study noted various interesting conclusions, including the critical impact of (1) the exploitation of the COVID-19 crisis to redirect readers to irrelevant topics and (2) the wide spread of unauthentic medical precautions and information. Further data analysis revealed the importance of using social networks in a global pandemic crisis by relying on credible users with a variety of occupations, content developers and influencers in specific fields. In this context, several insights and findings have been provided while elaborating computing and non-computing implications and research directions for potential solutions and social networks management strategies during crisis periods. Comment: 11 pages, 10 figures, Journal Articl

arXiv.org e-Print Archive

The use of Facebook as a source of news in post-revolutionary Egypt

Author: Shalaby Sondos Asem
Publication venue: AUC Knowledge Fountain
Publication date: 26/11/2021

AUC Knowledge Fountain (American Univ. in Cairo)

A Model to Measure the Spread Power of Rumors

Author: Asgari-Chenaghlu Meysam
Balafar Mohammad-Ali
Feizi-Derakhshi Ali-Reza
Feizi-Derakhshi Mohammad-Reza
Jahanbakhsh-Nagadeh Zoleikha
Nikzad-Khasmakhi Narjes
Rahkar-Farshi Taymaz
Ramezani Majid
Ranjbar-Khadivi Mehrdad
Zafarani-Moattar Elnaz
Publication date: 18/11/2020

Nowadays, a significant portion of the posts exchanged daily on social media are infected by rumors. This study investigates the problem of rumor analysis in areas different from those of other research. It tackles, for the first time, the unaddressed problem of calculating the Spread Power of Rumor (SPR) and seeks to examine the spread power as a function of multi-contextual features. For this purpose, the theory of Allport and Postman is adopted, which claims that two key factors determine the spread power of rumors, namely importance and ambiguity. The proposed Rumor Spread Power Measurement Model (RSPMM) computes SPR using a text-based approach, which relies on contextual features to compute the spread power of rumors in two categories: False Rumor (FR) and True Rumor (TR). In total, 51 contextual features are introduced to measure SPR and their impact on classification is investigated; then 42 features in two categories, "importance" (28 features) and "ambiguity" (14 features), are selected to compute SPR. The proposed RSPMM is verified on two labelled datasets collected from Twitter and Telegram. The results show that (i) the proposed new features are effective and efficient in discriminating between FRs and TRs; (ii) although the proposed RSPMM approach focuses only on contextual features, while existing techniques are based on structure and content features, RSPMM achieves considerably outstanding results (F-measure=83%); and (iii) the result of a t-test shows that the SPR criterion can significantly distinguish between FR and TR, and it can also be useful as a new method to verify the truthfulness of rumors.

arXiv.org e-Print Archive