2 research outputs found

    Fake News Detection with the GREEK-BERT Model with a focus on COVID-19

    Fake news, though a problem since ancient times, has become one of the major political and societal issues of recent years. The problem is amplified by the prevalence of social media use among the general public. During the COVID-19 pandemic in particular, the dissemination of fake news can have very serious and even fatal consequences for societies and individuals. This thesis outlines our work in creating two classification models for fake news and fake social media posts, alongside a web application for studying the relationships and dissemination patterns of fake and non-fake information on social media platforms. Our work targets the Greek language and information related to the ongoing coronavirus pandemic. We also present an overview of the research on which we base our models, as well as related research on fake news detection. For this purpose we reused an existing Greek fake news data set, which was part of Odysseas Trispiotis' Master of Science thesis [1], and we also created a novel data set for this project. While building this novel data set, we observed that finding reliable sources of fake posts is a hard problem, and one that is even harder to automate. Our classification models are based on the state-of-the-art BERT [2] and GREEK-BERT [3] models. The results were very encouraging: the final classification models reached accuracy above 90%, with similarly good scores on other traditional classification metrics, such as precision, recall, F1 score and AUROC.

    Dismiss : uma abordagem para análise sociotécnica da desinformação digital

    Advisor: Dr. Roberto Pereira. Thesis (doctorate), Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defense: Curitiba, 28/08/2023. Includes references. Area of concentration: Computer Science.
    Abstract: This thesis addresses the challenge of understanding and dealing with digital misinformation as a sociotechnical phenomenon, meaning that it involves both the technologies used for communication and the human and social context in which misinformation occurs. The results of our systematic literature review showed that designers of interventions for mitigating misinformation have difficulty dealing with the sociotechnical nature of the phenomenon. They tend to employ disciplinary approaches focused on the technical aspects of misinformation and often address the phenomenon in a fragmented manner. These difficulties can lead designers to overlook aspects relevant to understanding the phenomenon and can result in potentially harmful solutions, such as censorship or invasive warnings. This thesis therefore investigates ways to support designers in understanding the phenomenon from a sociotechnical perspective, helping them characterize cases of digital misinformation and reach a comprehensive understanding of the problems involved. As a solution, this thesis presents Dismiss, an approach for the sociotechnical analysis of digital misinformation. Dismiss is grounded in Organizational Semiotics and comprises the Conceptual Model of the Digital Misinformation Lifecycle, artifacts, and supporting materials that underpin the sociotechnical analysis of misinformation. The approach serves as an epistemic tool designed to prompt users to reflect on the circumstances in which misinformation occurs, helping them understand the origins and consequences of digital misinformation. Dismiss was evaluated constructively throughout its development, using focus groups (11 meetings), small-scale studies (7 cases), and workshops on the sociotechnical analysis of digital misinformation cases with representatives of the target audience (3 workshops). The results from the focus groups and small-scale studies informed the refinement of the approach, its structure, components, and application methods. The workshop results indicate the perceived utility of the approach in supporting the understanding of misinformation as a sociotechnical phenomenon. The results also highlighted aspects of Dismiss that can be improved, such as the number of steps, the explanation of artifacts, and the density of the supporting materials.