
    Handling Imbalanced Classes for Model Training in Fake News Detection

    With the widespread dissemination of news on social media platforms, the propagation of fake news has become a pressing concern. Detecting fake news is crucial to maintaining the integrity of information shared across social networks. This paper presents a comprehensive investigation into the detection of fake news on social media, focusing on the collection of data from both reliable and unreliable sources. To build an effective fake news detection system, a diverse dataset encompassing both reliable and unreliable sources is collected; this strategy ensures a comprehensive representation of the information landscape on social media platforms. The implementation of a bidirectional LSTM with an attention layer is a powerful approach that has shown promising results in various natural language processing tasks, including text classification and sentiment analysis. Its effectiveness lies in its ability to leverage both directional information and attention-driven focus, allowing the model to better understand and interpret the nuances of the input sequence. To give proper weight to each label, class weights are computed from the inverse of the class importance factors. With this class-weight balancing method, the system achieved almost 60% accuracy.
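The abstract describes weighting classes by inverse importance but gives no formula; a minimal sketch of one common inverse-frequency scheme (the same arithmetic as scikit-learn's "balanced" mode, used here as an assumption about what "importance factors" means) is:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class weights inversely proportional to class frequency,
    so the minority class contributes more to the training loss."""
    counts = Counter(labels)
    total = sum(counts.values())
    n_classes = len(counts)
    # weight_c = total / (n_classes * count_c)
    return {c: total / (n_classes * n) for c, n in counts.items()}

# 90/10 imbalance, typical of fake-news corpora
weights = inverse_frequency_weights(["real"] * 90 + ["fake"] * 10)
# the minority "fake" class receives a much larger weight than "real"
```

These weights would then be passed to the loss function (e.g., as `class_weight` in Keras or scikit-learn) during LSTM training.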

    Exploring machine learning techniques for fake profile detection in online social networks

    The online social network is the largest network: more than 4 billion users use social media, and with its rapid growth, the risk to the integrity of data has tremendously increased. There are several kinds of security challenges in online social networks (OSNs). Many malicious actors try to hack social sites and misuse the data available on them, so protection against such behavior has become an essential requirement. Though there are many types of security threats in online social networks, one of the most significant is the fake profile. Fake profiles are created intentionally with certain motives; they may be targeted to steal or acquire sensitive information and/or to spread rumors on online social networks. Fake profiles are primarily used to steal or extract information by means of friendly interaction online and/or by misusing data available on social sites. Thus, fake profile detection in social media networks is attracting the attention of researchers. This paper discusses various machine learning (ML) methods used by researchers for fake profile detection, exploring the further possibility of improving ML models for speedy results.
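As an illustration of the kind of ML method surveyed, a minimal fake-profile classifier over hand-picked numeric profile features can be sketched as follows; the feature set and the toy data are assumptions for illustration, not taken from the paper:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical numeric profile features (illustrative only):
# [followers, friends, statuses, account_age_days, has_profile_photo]
X = np.array([
    [1500,  300, 4000, 2000, 1],   # genuine-looking profiles
    [ 800,  400, 2500, 1500, 1],
    [  10, 3000,    5,   30, 0],   # fake-looking profiles
    [   5, 5000,    2,   10, 0],
], dtype=float)
y = np.array([0, 0, 1, 1])  # 1 = fake profile

# Scale features first, since raw counts span very different ranges
clf = make_pipeline(StandardScaler(), LogisticRegression())
clf.fit(X, y)
# score a new, suspicious-looking profile (few followers, many friends)
print(clf.predict([[8.0, 4000.0, 3.0, 20.0, 0.0]]))
```

Real systems in this literature use far richer feature sets (posting cadence, network structure, content features) and far more data; this only shows the basic train/predict shape.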

    Words are the Window to the Soul: Language-based User Representations for Fake News Detection

    Cognitive and social traits of individuals are reflected in language use. Moreover, individuals who are prone to spread fake news online often share common traits. Building on these ideas, we introduce a model that creates representations of individuals on social media based only on the language they produce, and use them to detect fake news. We show that language-based user representations are beneficial for this task. We also present an extended analysis of the language of fake news spreaders, showing that its main features are mostly domain independent and consistent across two English datasets. Finally, we exploit the relation between language use and connections in the social graph to assess the presence of the Echo Chamber effect in our data. Comment: 9 pages, accepted at COLING 202
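A crude sketch of building a user representation purely from the language a user produces (normalized word frequencies stand in for the learned embeddings described in the paper; the vocabulary below is invented):

```python
from collections import Counter

def user_representation(posts, vocabulary):
    """Represent a user by the normalized frequency of vocabulary words
    across everything they have written. A bag-of-words stand-in for the
    paper's learned language-based embeddings."""
    counts = Counter(w for post in posts for w in post.lower().split())
    total = sum(counts[w] for w in vocabulary) or 1
    return [counts[w] / total for w in vocabulary]

# Hypothetical vocabulary of words characteristic of fake-news sharing
vocab = ["shocking", "allegedly", "officials", "confirmed"]
rep = user_representation(["SHOCKING news allegedly leaked",
                           "officials confirmed the report"], vocab)
```

Such per-user vectors would then feed a downstream classifier that predicts whether the user is likely to spread fake news.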

    MUFFLE: Multi-Modal Fake News Influence Estimator on Twitter

    To alleviate the impact of fake news on our society, predicting the popularity of fake news posts on social media is a crucial problem worthy of study. However, most related studies on fake news emphasize detection only. In this paper, we focus on the issue of fake news influence prediction, i.e., inferring how popular a fake news post might become on social platforms. To achieve our goal, we propose a comprehensive framework, MUFFLE, which captures multi-modal dynamics by encoding the representation of news-related social networks, user characteristics, and content in text. The attention mechanism developed in the model can provide explainability for social or psychological analysis. To examine the effectiveness of MUFFLE, we conducted extensive experiments on real-world datasets. The experimental results show that our proposed method outperforms both state-of-the-art methods of popularity prediction and machine-based baselines in top-k NDCG and hit rate. Through the experiments, we also analyze the feature importance for predicting fake news influence via the explainability provided by MUFFLE.
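Top-k NDCG, the ranking metric used in this evaluation, can be computed as follows (a standard definition, not MUFFLE-specific code):

```python
import math

def ndcg_at_k(relevances, k):
    """NDCG@k: discounted cumulative gain of the predicted ranking,
    normalized by the gain of the ideal (descending) ranking."""
    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# relevance of items in predicted order (e.g., true popularity of posts
# ranked by the model's predicted influence)
print(ndcg_at_k([3, 2, 3, 0, 1], k=5))
```

A perfect ranking scores 1.0; swapping highly relevant items toward the bottom of the list lowers the score.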

    Unified Fake News Detection System (UFNDS) Framework

    The deliberate spread of misleading or inaccurate material posed as authentic news is known as "fake news." Its increasing prevalence calls for practical strategies to recognize and counteract its negative effects on people and society. Previous methods of identifying fake news depended on linguistic signals and stylistic components, but these methods faced limitations in applicability and accuracy. To overcome these constraints, this study proposes an extended stacking ensemble classification algorithm (ES-ECA), a machine learning technique designed specifically for detecting fake news. By employing this approach, we aim to surpass the existing barriers and enhance our ability to combat misinformation. The ensemble classifier outperformed the individual classifiers, with an accuracy of 75.18% and an F1-score of 81.81%. These findings imply that the suggested algorithm is efficient at identifying fake news and can be used to lessen its negative effects on society. The EHT-DL model leverages a multi-step approach to effectively detect fake news. It begins with preprocessing steps such as text normalization, special character handling, stemming, stop-word removal, tokenization, and lemmatization, ensuring the dataset is clean and ready for subsequent processing. Feature extraction is performed using TF-IDF, N-gram, and word-embedding scores to capture semantic information and word importance. The dataset is then divided into training and testing sets, and the deep learning model Dl4jMlpClassifier is used to classify the data. To tackle the drawbacks of existing techniques, the EHT-DL model incorporates efficient hyperparameter tuning, using both Grid Search and Random Search to optimize the Dl4jMlpClassifier's hyperparameters. This improves both the model's accuracy and its capacity to distinguish between authentic and fraudulent news.
The effectiveness of the EHT-DL model is shown by the experimental findings. The model is assessed with standard criteria including accuracy, precision, recall, and F1-score. Comparisons with current methods demonstrate the superiority of the proposed model in identifying bogus news (83.27% accuracy, 80.62% precision, 71.57% recall, and 75.63% F1-score). To increase classification accuracy and resilience, OE-MDL combines optimized deep learning (ODL) and optimized machine learning (OML) phases. In the OML phase, an optimized Multilayer Perceptron serves as the meta-classifier on top of base classifiers such as optimized RandomForest, optimized J48, optimized SMO, optimized NaiveBayes, and optimized IBk. The experimental findings show that the OE-MDL algorithm outperforms other methods, with the highest recall (85.18%), accuracy (84.27%), precision (74.17%), and F1-score (79.29%), providing a practical means of halting the spread of false information. Finally, the framework for the Unified Fake News Detection System (UFNDS) is presented. The concluding part of the research demonstrates how the three main stages (the Enhanced Stacking Ensemble Classification Algorithm, ES-ECA; the Efficient Hyperparameter-Tuned Deep Learning Model, EHT-DL; and the Optimized Ensemble Machine and Deep Learning stage, OE-MDL) are cohesively incorporated into the UFNDS framework. We examine the framework's architectural design and show how adaptable it is to the ever-changing challenges of fake news identification.
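The stacked design (base classifiers feeding a Multilayer Perceptron meta-classifier) can be sketched with scikit-learn analogues; the abstract's J48, SMO, and IBk are Weka components, so the estimators and the synthetic data below are stand-ins, not the paper's implementation:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier

# Synthetic features standing in for TF-IDF / n-gram / embedding scores
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Base learners' predictions become inputs to the MLP meta-classifier
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("nb", GaussianNB())],
    final_estimator=MLPClassifier(max_iter=1000, random_state=0),
)
stack.fit(X_tr, y_tr)
print(stack.score(X_te, y_te))
```

The hyperparameter-tuning stage would wrap a model like this in `GridSearchCV` or `RandomizedSearchCV` over, e.g., layer sizes and learning rates.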

    Modeling microscopic and macroscopic information diffusion for rumor detection

    Researchers have exerted tremendous effort in designing ways to detect and identify rumors automatically. Traditional approaches focus on feature engineering, which requires extensive manual effort and is difficult to generalize to different domains. Recently, deep learning solutions have emerged as the de facto methods, detecting online rumors in an end-to-end manner. However, they still fail to fully capture the dissemination patterns of rumors. In this study, we propose a novel diffusion-based rumor detection model, called Macroscopic and Microscopic-aware Rumor Detection, to explore the full-scale diffusion patterns of information. It leverages graph neural networks to learn the macroscopic diffusion of rumor propagation and captures microscopic diffusion patterns using bidirectional recurrent neural networks while taking the user-time series into account. Moreover, it leverages the knowledge distillation technique to create a more informative student model and further improve performance. Experiments conducted on two real-world data sets demonstrate that our method achieves significant accuracy improvements over state-of-the-art baseline models on rumor detection.
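The abstract's knowledge distillation step is not specified in detail; the standard temperature-scaled distillation loss (Hinton et al.'s formulation, an assumption here about which variant is used) looks like:

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable softmax with temperature T."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 so gradients keep a comparable magnitude."""
    p = softmax(teacher_logits, T)   # soft teacher targets
    q = softmax(student_logits, T)   # student predictions
    return T * T * float(np.sum(p * np.log(p / q)))

# loss is zero when the student matches the teacher exactly
print(distillation_loss([2.0, -1.0], [2.0, -1.0]))
```

In training, this term is typically mixed with the ordinary cross-entropy on the hard rumor/non-rumor labels.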

    SoK: Content Moderation in Social Media, from Guidelines to Enforcement, and Research to Practice

    To counter online abuse and misinformation, social media platforms have been establishing content moderation guidelines and employing various moderation policies. The goal of this paper is to study these community guidelines and moderation practices, as well as the relevant research publications, to identify the research gaps, differences in moderation techniques, and challenges that should be tackled by the social media platforms and the research community at large. In this regard, we study and analyze the fourteen most popular social media content moderation guidelines and practices in the US jurisdiction, and consolidate them. We then introduce three taxonomies drawn from this analysis as well as from over one hundred interdisciplinary research papers about moderation strategies. We identify the differences between the content moderation employed in mainstream social media platforms and that of fringe platforms. We also highlight the implications of Section 230, the need for transparency and opacity in content moderation, why platforms should shift from a one-size-fits-all model to a more inclusive model, and lastly, why there is a need for a collaborative human-AI system.

    Catch me if you can: a participant-level rumor detection framework via fine-grained user representation learning

    Researchers have exerted tremendous effort in designing ways to detect and identify rumors automatically. Traditional approaches focus on feature engineering; they require extensive manual effort and are difficult to generalize. Deep learning solutions help, but they usually fail to capture the underlying structure of rumor propagation and the influence of all participants involved in the spreading chain. In this study, we propose a novel participant-level rumor detection framework. It explicitly models and integrates various fine-grained user representations (i.e., user influence, susceptibility, and temporal information) of all participants in the propagation threads via deep representation learning. Experiments conducted on real-world datasets demonstrate a significant accuracy improvement of our approach. Theoretically, we contribute to the effective usage of data science and analytics for social information diffusion design, particularly rumor detection. Practically, our results can be used to improve the quality of rumor detection services for social platforms.
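The three fine-grained participant signals the abstract names (influence, susceptibility, temporal information) could be assembled into a per-participant feature matrix along these lines; every feature definition below is an illustrative assumption, not the paper's learned representation:

```python
def participant_features(thread):
    """One feature vector per participant in a propagation thread:
    [influence proxy, susceptibility proxy, time offset in thread].
    Feature definitions are illustrative stand-ins."""
    t0 = min(p["timestamp"] for p in thread)
    return [[
        p["followers"] / (p["following"] + 1),            # influence proxy
        p["rumor_reshares"] / (p["total_posts"] + 1),     # susceptibility proxy
        p["timestamp"] - t0,                              # temporal position
    ] for p in thread]

# Toy thread: an influential originator, then a susceptible resharer
thread = [
    {"followers": 5000, "following": 100, "rumor_reshares": 1,
     "total_posts": 999, "timestamp": 0},
    {"followers": 20, "following": 400, "rumor_reshares": 50,
     "total_posts": 99, "timestamp": 120},
]
feats = participant_features(thread)
```

A deep model would then consume this sequence of participant vectors (e.g., with a recurrent encoder over the thread) to classify the rumor.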