Twitter bot detection using deep learning
Social media platforms have revolutionized how people interact with each other and how they obtain information. However, platforms such as Twitter and Facebook quickly became venues for public manipulation and for spreading or amplifying political or ideological misinformation. Although malicious content can be shared by individuals, millions of individual and coordinated automated accounts, also called bots, now exist; they share hate, spread misinformation, and manipulate public opinion without any human intervention. The work presented in this paper aims to design and implement deep learning approaches that successfully identify social media bots. We show that deep learning models can yield an accuracy of 0.9 on the PAN 2019 Bots and Gender Profiling dataset. Our findings also show that pre-trained models can improve the accuracy of deep learning models and compete with classical machine learning methods even on a limited dataset.
Bot and gender detection of twitter accounts using distortion and LSA notebook for PAN at CLEF 2019
In this work, we present our approach for the Author Profiling task of PAN 2019. The task is divided into two sub-problems, bot detection and gender detection, for two languages: English and Spanish. We address each sub-problem and each language differently. For bot detection, we use an ensemble architecture for accounts that write in English and a single SVM for those that write in Spanish. For gender detection, we use a single SVM architecture for both languages, but we pre-process the tweets differently. Our final models achieve an accuracy above 90% in the bot detection task, and accuracies of 84.17% and 77.61% in the gender detection task for English and Spanish, respectively.
Overview of PAN 2021: Authorship Verification, Profiling Hate Speech Spreaders on Twitter, and Style Change Detection.
The paper gives a brief overview of the three shared tasks to be organized at the PAN 2021 lab on digital text forensics and stylometry, hosted at the CLEF conference. The tasks include authorship verification across domains, author profiling for hate speech spreaders, and style change detection for multi-author documents. In part the tasks are new, and in part they continue and advance past shared tasks, with the overall goal of advancing the state of the art and providing an objective evaluation on newly developed benchmark datasets.
The work of the researchers from Universitat Politècnica de València was partially funded by the Spanish MICINN under the project MISMIS-FAKEnHATE on MISinformation and MIScommunication in social media: FAKE news and HATE speech (PGC2018-096212-B-C31), and by the Generalitat Valenciana under the project DeepPattern (PROMETEO/2019/121).
Bevendorff, J.; Chulvi-Ferriols, MA.; Peña-Sarracén, GLDL.; Kestemont, M.; Manjavacas, E.; Markov, I.; Mayerl, M.... (2021). Overview of PAN 2021: Authorship Verification, Profiling Hate Speech Spreaders on Twitter, and Style Change Detection. Springer. 567-573. https://doi.org/10.1007/978-3-030-72240-1_66
Overview of the 8th Author Profiling Task at PAN 2020: Profiling Fake News Spreaders on Twitter
This overview presents the Author Profiling shared task at PAN 2020. The focus of this year's task is on determining whether or not the author of a Twitter feed is keen to spread fake news. The task had two main aims: (i) to show the feasibility of automatically identifying potential fake news spreaders on Twitter; and (ii) to show the difficulty of identifying them when they do not limit themselves to retweeting domain-specific news. For this purpose, a corpus of Twitter data has been provided, covering the English and Spanish languages. Altogether, the approaches of 66 participants have been evaluated.
First of all, we thank the participants: 66 this year, a record number of participants at the PAN Lab since 2009! We also thank Martin Potthast, Matti Wiegmann, and Nikolay Kolyada for their help with the 66 virtual machines in the TIRA platform. We thank Symanto for sponsoring the ex aequo award for the two best performing systems at this year's author profiling shared task. The work of Paolo Rosso was partially funded by the Spanish MICINN under the research project MISMIS-FAKEnHATE on Misinformation and Miscommunication in social media: FAKE news and HATE speech (PGC2018-096212-B-C31). The work of Anastasia Giachanou is supported by the SNSF Early Postdoc Mobility grant under the project Early Fake News Detection on Social Media, Switzerland (P2TIP2 181441).
Rangel, F.; Giachanou, A.; Ghanem, BHH.; Rosso, P. (2020). Overview of the 8th Author Profiling Task at PAN 2020: Profiling Fake News Spreaders on Twitter. CEUR Workshop Proceedings. 2696:1-18. http://hdl.handle.net/10251/166528
Profiling hate speech spreaders on twitter task at PAN 2021
This overview presents the Author Profiling shared task at PAN 2021. The focus of this year's task is on determining whether or not the author of a Twitter feed is keen to spread hate speech. The main aim is to show the feasibility of automatically identifying potential hate speech spreaders on Twitter. For this purpose, a corpus of Twitter data has been provided, covering the English and Spanish languages. Altogether, the approaches of 66 participants have been evaluated.
First of all, we thank the participants: again 66 this year, as in the previous year's task on Profiling Fake News Spreaders! We also thank Martin Potthast, Matti Wiegmann, Nikolay Kolyada, and Magdalena Anna Wolska for their technical support with the TIRA platform. We thank Symanto for again sponsoring the award for the best performing system at the author profiling shared task. The work of Francisco Rangel was partially funded by the Centre for the Development of Industrial Technology (CDTI) of the Spanish Ministry of Science and Innovation under the research project IDI-20210776 on Proactive Profiling of Hate Speech Spreaders - PROHATER (Perfilador Proactivo de Difusores de Mensajes de Odio). The work of the researchers from Universitat Politècnica de València was partially funded by the Spanish MICINN under the project MISMIS-FAKEnHATE on MISinformation and MIScommunication in social media: FAKE news and HATE speech (PGC2018-096212-B-C31), and by the Generalitat Valenciana under the project DeepPattern (PROMETEO/2019/121). This article is also based upon work from the Dig-ForAsp COST Action 17124 on Digital Forensics: evidence analysis via intelligent systems and practices, supported by European Cooperation in Science and Technology.
Rangel, F.; Peña-Sarracén, GLDL.; Chulvi-Ferriols, MA.; Fersini, E.; Rosso, P. (2021). Profiling hate speech spreaders on twitter task at PAN 2021. CEUR. 1772-1789. http://hdl.handle.net/10251/190663
An Exploratory Study of COVID-19 Misinformation on Twitter
During the COVID-19 pandemic, social media has become a home ground for
misinformation. To tackle this infodemic, scientific oversight, as well as a
better understanding by practitioners in crisis management, is needed. We have
conducted an exploratory study into the propagation, authors and content of
misinformation on Twitter around the topic of COVID-19 in order to gain early
insights. We have collected all tweets mentioned in the verdicts of
fact-checked claims related to COVID-19 by over 92 professional fact-checking
organisations between January and mid-July 2020 and share this corpus with the
community. This resulted in 1 500 tweets relating to 1 274 false and 276
partially false claims, respectively. Exploratory analysis of author accounts
revealed that verified Twitter handles (including organisations and celebrities) are
also involved in either creating (new tweets) or spreading (retweets) the
misinformation. Additionally, we found that false claims propagate faster than
partially false claims. Compared to a background corpus of COVID-19 tweets,
tweets with misinformation are more often concerned with discrediting other
information on social media. Authors use less tentative language and appear to
be more driven by concerns of potential harm to others. Our results enable us
to suggest gaps in the current scientific coverage of the topic as well as
propose actions for authorities and social media users to counter
misinformation.
Comment: 20 pages, nine figures, four tables. Submitted for peer review, revision
Experimental Analysis of the Relevance of Features and Effects on Gender Classification Models for Social Media Author Profiling
Automatic user profiling from social networks has become a popular task due to its commercial applications
(targeted advertising, market studies...). Automatic profiling models infer demographic characteristics
of social network users from their generated content or interactions. Users' demographic information is also
valuable for more socially sensitive tasks such as the automatic early detection of mental disorders. For this type
of user analysis, it has been shown that the way users employ language is an important indicator that
contributes to the effectiveness of the models. We therefore consider that, for identifying aspects such as
gender, age, or a user's origin, it is worth examining language use through both psycholinguistic
and semantic features. A good selection of features is vital for the performance of retrieval, classification,
and decision-making software systems. In this paper, we address gender classification as part of the automatic
profiling task. We present an experimental analysis of the performance of existing gender classification
models based on an external corpus and baselines for automatic profiling. We analyse in depth the influence of
the linguistic features on the classification accuracy of the model. Following that analysis, we have put together a
feature set for gender classification models in social networks whose accuracy exceeds that of existing
baselines.
This work was supported by projects RTI2018-093336-B-C21, RTI2018-093336-B-C22 (Ministerio de Ciencia e Innovacion & ERDF) and the financial support supplied by the Conselleria de Educacion, Universidade e Formacion Profesional (accreditation 2019-2022 ED431G/01, ED431B 2019/03) and the European Regional Development Fund, which acknowledges the CITIC Research Center in ICT of the University of A Coruna as a Research Center of the Galician University System.