29 research outputs found

    The Impact of Psycholinguistic Patterns in Discriminating between Fake News Spreaders and Fact Checkers

    Full text link
    [EN] Fake news is a threat to society. A huge amount of fake news is posted every day on social networks which is read, believed and sometimes shared by a number of users. On the other hand, with the aim to raise awareness, some users share posts that debunk fake news by using information from fact-checking websites. In this paper, we are interested in exploring the role of various psycholinguistic characteristics in differentiating between users that tend to share fake news and users that tend to debunk them. Psycholinguistic characteristics represent the different linguistic information that can be used to profile users and can be extracted or inferred from users¿ posts. We present the CheckerOrSpreader model that uses a Convolution Neural Network (CNN) to differentiate between spreaders and checkers of fake news. The experimental results showed that CheckerOrSpreader is effective in classifying a user as a potential spreader or checker. Our analysis showed that checkers tend to use more positive language and a higher number of terms that show causality compared to spreaders who tend to use a higher amount of informal language, including slang and swear words.The works of Anastasia Giachanou and Daniel Oberski were funded by the Dutch Research Council (grant VI.Vidi.195.152). The work of Paolo Rosso was in the framework of the XAI-DisInfodemics project on eXplainable AI for disinformation and conspiracy detection during infodemics (PLEC2021-007681), funded by the Spanish Ministry of Science and Innovation, as well as IBERIFIER, the Iberian Digital Media Research and Fact-Checking Hub funded by the European Digital Media Observatory (2020-EU-IA0252).Giachanou, A.; Ghanem, BHH.; Rissola, EA.; Rosso, P.; Crestani, F.; Oberski, D. (2022). The Impact of Psycholinguistic Patterns in Discriminating between Fake News Spreaders and Fact Checkers. Data & Knowledge Engineering. 138:1-15. https://doi.org/10.1016/j.datak.2021.10196011513

    Overview of the 8th Author Profiling Task at PAN 2020: Profiling Fake News Spreaders on Twitter

    Full text link
    [EN] This overview presents the Author Profiling shared task at PAN 2020. The focus of this year's task is on determining whether or not the author of a Twitter feed is keen to spread fake news. Two have been the main aims: (i) to show the feasibility of automatically identifying potential fake news spreaders in Twitter; and (ii) to show the difficulty of identifying them when they do not limit themselves to just retweet domain-specific news. For this purpose a corpus with Twitter data has been provided, covering the English and Spanish languages. Altogether, the approaches of 66 participants have been evaluated.First of all we thank the participants: 66 this year, record in terms of participants at PAN Lab since 2009! We have to thank also Martin Potthast, Matti Wiegmann, and Nikolay Kolyada to help with the 66 Virtual Machines in the TIRA platform. We thank Symanto for sponsoring the ex aequo award for the two best performing systems at the author profiling shared task of this year. The work of Paolo Rosso was partially funded by the Spanish MICINN under the research project MISMIS-FAKEnHATE on Misinformation and Miscommunication in social media: FAKE news and HATE speech (PGC2018-096212-B-C31). The work of Anastasia Giachanou is supported by the SNSF Early Postdoc Mobility grant under the project Early Fake News Detection on Social Media, Switzerland (P2TIP2 181441).Rangel, F.; Giachanou, A.; Ghanem, BHH.; Rosso, P. (2020). Overview of the 8th Author Profiling Task at PAN 2020: Profiling Fake News Spreaders on Twitter. CEUR Workshop Proceedings. 2696:1-18. http://hdl.handle.net/10251/166528S118269

    On the Detection of False Information: From Rumors to Fake News

    Full text link
    Tesis por compendio[ES] En tiempos recientes, el desarrollo de las redes sociales y de las agencias de noticias han traído nuevos retos y amenazas a la web. Estas amenazas han llamado la atención de la comunidad investigadora en Procesamiento del Lenguaje Natural (PLN) ya que están contaminando las plataformas de redes sociales. Un ejemplo de amenaza serían las noticias falsas, en las que los usuarios difunden y comparten información falsa, inexacta o engañosa. La información falsa no se limita a la información verificable, sino que también incluye información que se utiliza con fines nocivos. Además, uno de los desafíos a los que se enfrentan los investigadores es la gran cantidad de usuarios en las plataformas de redes sociales, donde detectar a los difusores de información falsa no es tarea fácil. Los trabajos previos que se han propuesto para limitar o estudiar el tema de la detección de información falsa se han centrado en comprender el lenguaje de la información falsa desde una perspectiva lingüística. En el caso de información verificable, estos enfoques se han propuesto en un entorno monolingüe. Además, apenas se ha investigado la detección de las fuentes o los difusores de información falsa en las redes sociales. En esta tesis estudiamos la información falsa desde varias perspectivas. En primer lugar, dado que los trabajos anteriores se centraron en el estudio de la información falsa en un entorno monolingüe, en esta tesis estudiamos la información falsa en un entorno multilingüe. Proponemos diferentes enfoques multilingües y los comparamos con un conjunto de baselines monolingües. Además, proporcionamos estudios sistemáticos para los resultados de la evaluación de nuestros enfoques para una mejor comprensión. En segundo lugar, hemos notado que el papel de la información afectiva no se ha investigado en profundidad. Por lo tanto, la segunda parte de nuestro trabajo de investigación estudia el papel de la información afectiva en la información falsa y muestra cómo los autores de contenido falso la emplean para manipular al lector. Aquí, investigamos varios tipos de información falsa para comprender la correlación entre la información afectiva y cada tipo (Propaganda, Trucos / Engaños, Clickbait y Sátira). Por último, aunque no menos importante, en un intento de limitar su propagación, también abordamos el problema de los difusores de información falsa en las redes sociales. En esta dirección de la investigación, nos enfocamos en explotar varias características basadas en texto extraídas de los mensajes de perfiles en línea de tales difusores. Estudiamos diferentes conjuntos de características que pueden tener el potencial de ayudar a discriminar entre difusores de información falsa y verificadores de hechos.[CA] En temps recents, el desenvolupament de les xarxes socials i de les agències de notícies han portat nous reptes i amenaces a la web. Aquestes amenaces han cridat l'atenció de la comunitat investigadora en Processament de Llenguatge Natural (PLN) ja que estan contaminant les plataformes de xarxes socials. Un exemple d'amenaça serien les notícies falses, en què els usuaris difonen i comparteixen informació falsa, inexacta o enganyosa. La informació falsa no es limita a la informació verificable, sinó que també inclou informació que s'utilitza amb fins nocius. A més, un dels desafiaments als quals s'enfronten els investigadors és la gran quantitat d'usuaris en les plataformes de xarxes socials, on detectar els difusors d'informació falsa no és tasca fàcil. Els treballs previs que s'han proposat per limitar o estudiar el tema de la detecció d'informació falsa s'han centrat en comprendre el llenguatge de la informació falsa des d'una perspectiva lingüística. En el cas d'informació verificable, aquests enfocaments s'han proposat en un entorn monolingüe. A més, gairebé no s'ha investigat la detecció de les fonts o els difusors d'informació falsa a les xarxes socials. En aquesta tesi estudiem la informació falsa des de diverses perspectives. En primer lloc, atès que els treballs anteriors es van centrar en l'estudi de la informació falsa en un entorn monolingüe, en aquesta tesi estudiem la informació falsa en un entorn multilingüe. Proposem diferents enfocaments multilingües i els comparem amb un conjunt de baselines monolingües. A més, proporcionem estudis sistemàtics per als resultats de l'avaluació dels nostres enfocaments per a una millor comprensió. En segon lloc, hem notat que el paper de la informació afectiva no s'ha investigat en profunditat. Per tant, la segona part del nostre treball de recerca estudia el paper de la informació afectiva en la informació falsa i mostra com els autors de contingut fals l'empren per manipular el lector. Aquí, investiguem diversos tipus d'informació falsa per comprendre la correlació entre la informació afectiva i cada tipus (Propaganda, Trucs / Enganys, Clickbait i Sàtira). Finalment, però no menys important, en un intent de limitar la seva propagació, també abordem el problema dels difusors d'informació falsa a les xarxes socials. En aquesta direcció de la investigació, ens enfoquem en explotar diverses característiques basades en text extretes dels missatges de perfils en línia de tals difusors. Estudiem diferents conjunts de característiques que poden tenir el potencial d'ajudar a discriminar entre difusors d'informació falsa i verificadors de fets.[EN] In the recent years, the development of social media and online news agencies has brought several challenges and threats to the Web. These threats have taken the attention of the Natural Language Processing (NLP) research community as they are polluting the online social media platforms. One of the examples of these threats is false information, in which false, inaccurate, or deceptive information is spread and shared by online users. False information is not limited to verifiable information, but it also involves information that is used for harmful purposes. Also, one of the challenges that researchers have to face is the massive number of users in social media platforms, where detecting false information spreaders is not an easy job. Previous work that has been proposed for limiting or studying the issue of detecting false information has focused on understanding the language of false information from a linguistic perspective. In the case of verifiable information, approaches have been proposed in a monolingual setting. Moreover, detecting the sources or the spreaders of false information in social media has not been investigated much. In this thesis we study false information from several aspects. First, since previous work focused on studying false information in a monolingual setting, in this thesis we study false information in a cross-lingual one. We propose different cross-lingual approaches and we compare them to a set of monolingual baselines. Also, we provide systematic studies for the evaluation results of our approaches for better understanding. Second, we noticed that the role of affective information was not investigated in depth. Therefore, the second part of our research work studies the role of the affective information in false information and shows how the authors of false content use it to manipulate the reader. Here, we investigate several types of false information to understand the correlation between affective information and each type (Propaganda, Hoax, Clickbait, Rumor, and Satire). Last but not least, in an attempt to limit its spread, we also address the problem of detecting false information spreaders in social media. In this research direction, we focus on exploiting several text-based features extracted from the online profile messages of those spreaders. We study different feature sets that can have the potential to help to identify false information spreaders from fact checkers.Ghanem, BHH. (2020). On the Detection of False Information: From Rumors to Fake News [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/158570TESISCompendi

    An Analysis of People’s Reasoning for Sharing Real and Fake News

    Get PDF
    The problem of the increase in the volume of fake news and its widespread over social media has gained massive attention as most of the population seeks social media for daily news diet. Humans are equally responsible for the surge of fake news spread. Thus, it is imperative to understand people’s behavior when they decide to share real and fake news items on social media. In an attempt to do so, we performed an analysis on data collected through a survey where participants (n= 363) were asked whether they were willing to share the given news item on their social media and explain the reasoning for their decision. The results show that the analysis presents several commonalities with previous studies. Moreover, we also addressed the problem of predicting whether a person will share a given news item or not. For this, we used intrinsic features from participants’ open-ended responses and demographics attributes. We found that the perceived emotions triggered by the news item show a strong influence on the user’s decision to share news items on social media

    Simulating the Impact of Personality on Fake News

    Get PDF
    Fake news is a key issue for social networks. We use an agent-based network simulation to model the spread of (fake) news. The agents' behaviour captures the OCEAN ("big five") personality trait model. The network is homophilic for political preference, analytical thinking and emotion. We studied the system with personality traits and homophily each turned on/off. Personality traits and homophily exhibited a statistically significant but typically minimal impact. Ignoring personality traits when modelling fact-checking can overestimate its effectiveness

    Psychographic Traits Identification based on political ideology: An author analysis study on spanish politicians tweets posted in 2020

    Get PDF
    In general, people are usually more reluctant to follow advice and directions from politicians who do not have their ideology. In extreme cases, people can be heavily biased in favour of a political party at the same time that they are in sharp disagreement with others, which may lead to irrational decision making and can put people’s lives at risk by ignoring certain recommendations from the authorities. Therefore, considering political ideology as a psychographic trait can improve political micro-targeting by helping public authorities and local governments to adopt better communication policies during crises. In this work, we explore the reliability of determining psychographic traits concerning political ideology. Our contribution is twofold. On the one hand, we release the PoliCorpus-2020, a dataset composed by Spanish politicians’ tweets posted in 2020. On the other hand, we conduct two authorship analysis tasks with the aforementioned dataset: an author profiling task to extract demographic and psychographic traits, and an authorship attribution task to determine the author of an anonymous text in the political domain. Both experiments are evaluated with several neural network architectures grounded on explainable linguistic features, statistical features, and state-of-the-art transformers. In addition, we test whether the neural network models can be transferred to detect the political ideology of citizens. Our results indicate that the linguistic features are good indicators for identifying finegrained political affiliation, they boost the performance of neural network models when combined with embedding-based features, and they preserve relevant information when the models are tested with ordinary citizens. Besides, we found that lexical and morphosyntactic features are more effective on author profiling, whereas stylometric features are more effective in authorship attribution.publishedVersio

    Overview of PAN 2020: Authorship Verification, Celebrity Profiling, Profiling Fake News Spreaders on Twitter, and Style Change Detection

    Full text link
    [EN] We briefly report on the four shared tasks organized as part of the PAN 2020 evaluation lab on digital text forensics and authorship analysis. Each tasks is introduced, motivated, and the results obtained are presented. Altogether, the four tasks attracted 230 registrations, yielding 83 successful submissions. This, and the fact that we continue to invite the submissions of software rather than its run output using the TIRA experimentation platform, marks for a good start into the second decade of PAN evaluations labs.We thank Symanto for sponsoring the ex aequo award for the two best performing systems at the author profiling shared task of this year on Profiling fake news spreaders on Twitter. The work of Paolo Rosso was partially funded by the Spanish MICINN under the research project MISMIS-FAKEnHATE on Misinformation and Miscommunication in social media: FAKE news and HATE speech (PGC2018¿096212-B-C31). The work of Anastasia Giachanou is supported by the SNSF Early Postdoc Mobility grant under the project Early Fake News Detection on Social Media, Switzerland (P2TIP2_181441).Bevendorff, J.; Ghanem, BHH.; Giachanou, A.; Kestemont, M.; Manjavacas, E.; Markov, I.; Mayerl, M.... (2020). Overview of PAN 2020: Authorship Verification, Celebrity Profiling, Profiling Fake News Spreaders on Twitter, and Style Change Detection. Springer. 372-383. https://doi.org/10.1007/978-3-030-58219-7_25S372383Bevendorff, J., et al.: Shared tasks on authorship analysis at PAN 2020. In: Jose, J.M., et al. (eds.) ECIR 2020. LNCS, vol. 12036, pp. 508–516. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-45442-5_66Bevendorff, J., Stein, B., Hagen, M., Potthast, M.: Bias analysis and mitigation in the evaluation of authorship verification. In: 57th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 6301–6306 (2019)Bevendorff, J., Stein, B., Hagen, M., Potthast, M.: Generalizing unmasking for short texts. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, pp. 654–659 (2019)Ghanem, B., Rosso, P., Rangel, F.: An emotional analysis of false information in social media and news articles. ACM Trans. Internet Technol. (TOIT) 20(2), 1–18 (2020)Giachanou, A., Ríssola, E.A., Ghanem, B., Crestani, F., Rosso, Paolo: The role of personality and linguistic patterns in discriminating between fake news spreaders and fact checkers. In: Métais, E., Meziane, F., Horacek, H., Cimiano, P. (eds.) NLDB 2020. LNCS, vol. 12089, pp. 181–192. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-51310-8_17Giachanou, A., Rosso, P., Crestani, F.: Leveraging emotional signals for credibility detection. In: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 877–880 (2019)Kestemont, M., Stamatatos, E., Manjavacas, E., Daelemans, W., Potthast, M., Stein, B.: Overview of the cross-domain authorship attribution task at PAN 2019. In: Working Notes Papers of the CLEF 2019 Evaluation Labs. CEUR Workshop Proceedings (2019)Kestemont, M., et al.: Overview of the author identification task at PAN-2018: cross-domain authorship attribution and style change detection. In: Working Notes Papers of the CLEF 2018 Evaluation Labs. CEUR Workshop Proceedings (2018)Peñas, A., Rodrigo, A.: A simple measure to assess non-response. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (2011)Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)Rangel, F., Giachanou, A., Ghanem, B., Rosso, P.: Overview of the 8th author profiling task at PAN 2020: profiling fake news spreaders on Twitter. In: CLEF 2020 Labs and Workshops, Notebook Papers (2020)Rangel, F., Franco-Salvador, M., Rosso, P.: A low dimensionality representation for language variety identification. In: Gelbukh, A. (ed.) CICLing 2016. LNCS, vol. 9624, pp. 156–169. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-75487-1_13Shu, K., Wang, S., Liu, H.: Understanding user profiles on social media for fake news detection. In: 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), pp. 430–435 (2018)Vo, N., Lee, K.: Learning from fact-checkers: analysis and generation of fact-checking language. In: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (2019)Noreen, E.W.: Computer-Intensive Methods for Testing Hypotheses: An Introduction. A Wiley-Interscience Publication, Hoboken (1989)Wiegmann, M., Potthast, M., Stein, B.: Overview of the celebrity profiling task at PAN 2020. In: CLEF 2020 Labs and Workshops, Notebook Papers (2020)Wiegmann, M., Stein, B., Potthast, M.: Celebrity profiling. In: 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019). Association for Computational Linguistics (2019)Wiegmann, M., Stein, B., Potthast, M.: Overview of the celebrity profiling task at PAN 2019. In: CLEF 2019 Labs and Workshops, Notebook Papers (2019)Zangerle, E., Mayerl, M., Specht, G., Potthast, M., Stein, B.: Overview of the style change detection task at PAN 2020. In: CLEF 2020 Labs and Workshops, Notebook Papers (2020)Zangerle, E., Tschuggnall, M., Specht, G., Potthast, M., Stein, B.: Overview of the style change detection task at PAN 2019. In: CLEF 2019 Labs and Workshops, Notebook Papers (2019
    corecore