
    Twitter bot detection using deep learning

    Social media platforms have revolutionized how people interact with each other and how they obtain information. However, platforms such as Twitter and Facebook quickly became venues for public manipulation and for spreading or amplifying political or ideological misinformation. Although malicious content can be shared by individuals, millions of individual and coordinated automated accounts, also called bots, now exist; they share hate, spread misinformation and manipulate public opinion without any human intervention. The work presented in this paper aims at designing and implementing deep learning approaches that successfully identify social media bots. Moreover, we show that deep learning models can yield an accuracy of 0.9 on the PAN 2019 Bots and Gender Profiling dataset. In addition, the findings of this work show that pre-trained models can improve the accuracy of deep learning models and compete with classical machine learning methods even on a limited dataset.

    Bot and gender detection of twitter accounts using distortion and LSA notebook for PAN at CLEF 2019

    In this work, we present our approach for the Author Profiling task of PAN 2019. The task is divided into two sub-problems, bot detection and gender detection, for two languages: English and Spanish. For each instance of the problem and each language, we address the problem differently. We use an ensemble architecture for bot detection on accounts that write in English and a single SVM for those that write in Spanish. For gender detection we use a single SVM architecture for both languages, but we pre-process the tweets differently. Our final models achieve accuracy over 90% in the bot detection task, and 84.17% and 77.61% in the gender detection task for English and Spanish, respectively.

    Overview of PAN 2021: Authorship Verification, Profiling Hate Speech Spreaders on Twitter, and Style Change Detection.

    [EN] The paper gives a brief overview of the three shared tasks to be organized at the PAN 2021 lab on digital text forensics and stylometry hosted at the CLEF conference. The tasks include authorship verification across domains, author profiling for hate speech spreaders, and style change detection for multi-author documents. In part the tasks are new and in part they continue and advance past shared tasks, with the overall goal of advancing the state of the art, providing for an objective evaluation on newly developed benchmark datasets. The work of the researchers from Universitat Politècnica de València was partially funded by the Spanish MICINN under the project MISMIS-FAKEnHATE on MISinformation and MIScommunication in social media: FAKE news and HATE speech (PGC2018-096212-B-C31), and by the Generalitat Valenciana under the project DeepPattern (PROMETEO/2019/121). Bevendorff, J.; Chulvi-Ferriols, MA.; Peña-Sarracén, GLDL.; Kestemont, M.; Manjavacas, E.; Markov, I.; Mayerl, M.... (2021). Overview of PAN 2021: Authorship Verification, Profiling Hate Speech Spreaders on Twitter, and Style Change Detection. Springer. 567-573. https://doi.org/10.1007/978-3-030-72240-1_66

    Overview of the 8th Author Profiling Task at PAN 2020: Profiling Fake News Spreaders on Twitter

    [EN] This overview presents the Author Profiling shared task at PAN 2020. The focus of this year's task is on determining whether or not the author of a Twitter feed is keen to spread fake news. There have been two main aims: (i) to show the feasibility of automatically identifying potential fake news spreaders on Twitter; and (ii) to show the difficulty of identifying them when they do not limit themselves to retweeting domain-specific news. For this purpose a corpus with Twitter data has been provided, covering the English and Spanish languages. Altogether, the approaches of 66 participants have been evaluated. First of all we thank the participants: 66 this year, a record in terms of participants at the PAN Lab since 2009! We also have to thank Martin Potthast, Matti Wiegmann, and Nikolay Kolyada for their help with the 66 Virtual Machines in the TIRA platform. We thank Symanto for sponsoring the ex aequo award for the two best performing systems at the author profiling shared task of this year. The work of Paolo Rosso was partially funded by the Spanish MICINN under the research project MISMIS-FAKEnHATE on Misinformation and Miscommunication in social media: FAKE news and HATE speech (PGC2018-096212-B-C31). The work of Anastasia Giachanou is supported by the SNSF Early Postdoc Mobility grant under the project Early Fake News Detection on Social Media, Switzerland (P2TIP2 181441). Rangel, F.; Giachanou, A.; Ghanem, BHH.; Rosso, P. (2020). Overview of the 8th Author Profiling Task at PAN 2020: Profiling Fake News Spreaders on Twitter. CEUR Workshop Proceedings. 2696:1-18. http://hdl.handle.net/10251/166528

    Profiling hate speech spreaders on twitter task at PAN 2021

    [EN] This overview presents the Author Profiling shared task at PAN 2021. The focus of this year's task is on determining whether or not the author of a Twitter feed is keen to spread hate speech. The main aim is to show the feasibility of automatically identifying potential hate speech spreaders on Twitter. For this purpose a corpus with Twitter data has been provided, covering the English and Spanish languages. Altogether, the approaches of 66 participants have been evaluated. First of all, we thank the participants: again 66 this year, as in the previous year on Profiling Fake News Spreaders! We also have to thank Martin Potthast, Matti Wiegmann, Nikolay Kolyada, and Magdalena Anna Wolska for their technical support with the TIRA platform. We thank Symanto for again sponsoring the award for the best performing system at the author profiling shared task. The work of Francisco Rangel was partially funded by the Centre for the Development of Industrial Technology (CDTI) of the Spanish Ministry of Science and Innovation under the research project IDI-20210776 on Proactive Profiling of Hate Speech Spreaders - PROHATER (Perfilador Proactivo de Difusores de Mensajes de Odio). The work of the researchers from Universitat Politècnica de València was partially funded by the Spanish MICINN under the project MISMIS-FAKEnHATE on MISinformation and MIScommunication in social media: FAKE news and HATE speech (PGC2018-096212-B-C31), and by the Generalitat Valenciana under the project DeepPattern (PROMETEO/2019/121). This article is also based upon work from the Dig-ForAsp COST Action 17124 on Digital Forensics: evidence analysis via intelligent systems and practices, supported by European Cooperation in Science and Technology. Rangel, F.; Peña-Sarracén, GLDL.; Chulvi-Ferriols, MA.; Fersini, E.; Rosso, P. (2021). Profiling hate speech spreaders on twitter task at PAN 2021. CEUR. 1772-1789. http://hdl.handle.net/10251/190663

    An Exploratory Study of COVID-19 Misinformation on Twitter

    During the COVID-19 pandemic, social media has become a home ground for misinformation. To tackle this infodemic, scientific oversight, as well as a better understanding by practitioners in crisis management, is needed. We have conducted an exploratory study into the propagation, authors and content of misinformation on Twitter around the topic of COVID-19 in order to gain early insights. We have collected all tweets mentioned in the verdicts of fact-checked claims related to COVID-19 by over 92 professional fact-checking organisations between January and mid-July 2020 and share this corpus with the community. This resulted in 1 500 tweets relating to 1 274 false and 276 partially false claims, respectively. Exploratory analysis of author accounts revealed that verified Twitter handles (including organisations and celebrities) are also involved in either creating (new tweets) or spreading (retweets) the misinformation. Additionally, we found that false claims propagate faster than partially false claims. Compared to a background corpus of COVID-19 tweets, tweets with misinformation are more often concerned with discrediting other information on social media. Authors use less tentative language and appear to be more driven by concerns of potential harm to others. Our results enable us to suggest gaps in the current scientific coverage of the topic as well as propose actions for authorities and social media users to counter misinformation. Comment: 20 pages, nine figures, four tables. Submitted for peer review, revision

    Experimental Analysis of the Relevance of Features and Effects on Gender Classification Models for Social Media Author Profiling

    [Abstract] Automatic user profiling from social networks has become a popular task due to its commercial applications (targeted advertising, market studies...). Automatic profiling models infer demographic characteristics of social network users from their generated content or interactions. Users' demographic information is also valuable for more socially sensitive tasks such as the automatic early detection of mental disorders. For this type of user analysis task, it has been shown that the way users employ language is an important indicator that contributes to the effectiveness of the models. Therefore, we also consider that for identifying aspects such as gender, age or a user's origin, it is worth considering the use of language through both psycho-linguistic and semantic features. A good selection of features is vital for the performance of retrieval, classification, and decision-making software systems. In this paper, we address gender classification as a part of the automatic profiling task. We present an experimental analysis of the performance of existing gender classification models based on external corpora and baselines for automatic profiling. We analyse in depth the influence of the linguistic features on the classification accuracy of the model. Following that analysis, we have put together a feature set for gender classification models in social networks whose accuracy is above existing baselines. This work was supported by projects RTI2018-093336-B-C21, RTI2018-093336-B-C22 (Ministerio de Ciencia e Innovacion & ERDF) and the financial support supplied by the Conselleria de Educacion, Universidade e Formacion Profesional (accreditation 2019-2022 ED431G/01, ED431B 2019/03) and the European Regional Development Fund, which acknowledges the CITIC Research Center in ICT of the University of A Coruna as a Research Center of the Galician University System.