
    COVIDFakeExplainer: An Explainable Machine Learning based Web Application for Detecting COVID-19 Fake News

    Fake news has emerged as a critical global issue, magnified by the COVID-19 pandemic, underscoring the need for effective preventive tools. Leveraging machine learning, including deep learning techniques, offers promise in combatting fake news. This paper goes further by establishing BERT as the superior model for fake news detection and demonstrating its utility as a tool to empower the general populace. We have implemented a browser extension, enhanced with explainability features, enabling real-time identification of fake news and delivering easily interpretable explanations. To achieve this, we employed two publicly available datasets and created seven distinct data configurations to evaluate three prominent machine learning architectures. Our comprehensive experiments affirm BERT's exceptional accuracy in detecting COVID-19-related fake news. Furthermore, we have integrated an explainability component into the BERT model and deployed it as a service through Amazon's cloud API hosting (AWS). We have developed a browser extension that interfaces with the API, allowing users to select and transmit data from web pages and receive an intelligible classification in return. This paper presents a practical end-to-end solution, highlighting the feasibility of constructing a holistic system for fake news detection, which can significantly benefit society. (Comment: 7 pages, 4 figures)
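The extension-to-service round trip described in this abstract can be sketched in a few lines. The endpoint, field names (`fake_probability`, `important_tokens`), and reply format below are illustrative assumptions, not the authors' actual API:

```python
import json

# Hypothetical sketch of the browser-extension-to-API round trip: the user
# selects text on a page, the extension posts it to a hosted classifier, and
# the reply is rendered as a human-readable verdict with an explanation.
API_URL = "https://example.execute-api.amazonaws.com/prod/classify"  # assumed

def build_request(selected_text: str) -> str:
    """Package text selected on a web page as a JSON request body."""
    return json.dumps({"text": selected_text})

def parse_response(body: str) -> str:
    """Turn the service's JSON reply into a readable verdict string."""
    reply = json.loads(body)
    label = "FAKE" if reply["fake_probability"] >= 0.5 else "REAL"
    top = ", ".join(reply["important_tokens"])
    return f"{label} ({reply['fake_probability']:.0%}) - influential words: {top}"

# A mocked reply standing in for the deployed BERT + explainability service:
mock_reply = json.dumps({"fake_probability": 0.93,
                         "important_tokens": ["cure", "miracle", "5G"]})
print(parse_response(mock_reply))
```

In a real deployment the `important_tokens` field would carry whatever the explainability component (e.g., token attributions) surfaces alongside the prediction.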

    Uncovering Semantic Inconsistencies and Deceptive Language in False News Using Deep Learning and NLP Techniques for Effective Management

    In today's information age, false news and deceptive language have become pervasive, leading to significant challenges for individuals, organizations, and society as a whole. This study focuses on the application of deep learning and natural language processing (NLP) techniques to uncover semantic inconsistencies and deceptive language in false news, with the aim of facilitating effective management strategies. The research employs advanced deep learning models and NLP algorithms to analyze large volumes of textual data and identify patterns indicative of deceptive language and semantic inconsistencies. By leveraging the power of machine learning, the study aims to enhance the detection and classification of false news articles, enabling proactive management measures. The proposed approach not only examines the superficial aspects of false news but also delves deeper into the linguistic nuances and contextual inconsistencies that are characteristic of deceptive language. By employing advanced NLP techniques, such as sentiment analysis, topic modeling, and named entity recognition, the study strives to identify the underlying manipulative strategies employed by false news purveyors. The findings from this research have far-reaching implications for effective management. By accurately detecting semantic inconsistencies and deceptive language in false news, organizations can develop targeted strategies to mitigate the spread and impact of misinformation. Additionally, individuals can make informed decisions, enhancing their ability to critically evaluate news sources and protect themselves from falling victim to deceptive practices. In this research study, we suggest a hybrid system for detecting fake news that incorporates source analysis and machine learning techniques. Our system analyzes the language used in news articles to identify indicators of fake news and evaluates the credibility of the sources cited in the articles. 
We trained our system on a large dataset of news articles manually annotated as real or fake and evaluated its performance using common metrics such as F1-score, recall, and precision. In comparison with other advanced fake news detection systems, our results show that our hybrid method achieves a high level of accuracy in detecting false news.
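The hybrid idea above, combining linguistic cues with source credibility, can be sketched as a toy scoring function. The cue list, credibility table, and blend weights are illustrative assumptions, not the paper's trained system:

```python
# Toy sketch of a hybrid fake-news score: blend a crude deceptive-language
# signal with a source-credibility lookup. All values here are assumptions.
DECEPTIVE_CUES = {"shocking", "secret", "exposed", "miracle", "they"}
SOURCE_CREDIBILITY = {"reuters.com": 0.9, "example-blog.net": 0.2}

def language_score(text: str) -> float:
    """Fraction of words matching the deceptive-language cue list."""
    words = text.lower().split()
    hits = sum(1 for w in words if w.strip(".,!?") in DECEPTIVE_CUES)
    return hits / max(len(words), 1)

def hybrid_fake_score(text: str, source: str) -> float:
    """Weighted blend: deceptive language plus low credibility -> more fake."""
    cred = SOURCE_CREDIBILITY.get(source, 0.5)  # unknown sources sit mid-scale
    return 0.6 * language_score(text) + 0.4 * (1.0 - cred)
```

A real system would replace `language_score` with a trained classifier over the semantic and stylistic features the abstract describes; the point here is only the combination of two independent evidence channels.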

    Ensemble machine learning approaches for fake news classification

    In today’s interconnected digital landscape, the proliferation of fake news has become a significant challenge, with far-reaching implications for individuals, institutions, and societies. The rapid spread of misleading information undermines the credibility of genuine news outlets and threatens informed decision-making, public trust, and democratic processes. Recognizing the profound relevance and urgency of addressing this issue, this research embarked on a mission to harness the power of machine learning to combat the fake news menace. This study develops an ensemble machine learning model for fake news classification. The object of the research is the spread of fake news; the subject is machine learning methods for misinformation classification. Methods: we employed three state-of-the-art algorithms: LightGBM, XGBoost, and Balanced Random Forest (BRF). Each model was meticulously trained on a comprehensive dataset curated to encompass a diverse range of news articles, ensuring a broad representation of linguistic patterns and styles. A distinctive feature of the proposed approach is the emphasis on token importance. By leveraging specific tokens that exhibited a high degree of influence on classification outcomes, we enhanced the precision and reliability of the developed models. The empirical results were both promising and illuminating. The LightGBM model emerged as the top performer among the three, registering an impressive F1-score of 97.74% and an accuracy rate of 97.64%. Notably, all three of the proposed models consistently outperformed several existing models previously documented in the academic literature. This comparative analysis underscores the efficacy and superiority of the proposed ensemble approach. In conclusion, this study contributes a robust, innovative, and scalable solution to the pressing challenge of fake news detection.
    By harnessing the capabilities of advanced machine learning techniques, the research findings pave the way for enhancing the integrity and veracity of information in an increasingly digitalized world, thereby safeguarding public trust and promoting informed discourse.
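The combination step an ensemble like this relies on can be shown with a minimal soft-voting sketch. The probabilities below are illustrative stand-ins for the per-model P(fake) that LightGBM, XGBoost, and Balanced Random Forest would emit:

```python
# Minimal sketch of soft voting: average the three models' predicted
# probabilities of "fake" and apply a decision threshold. The inputs are
# illustrative, not outputs of the paper's trained models.
def soft_vote(probabilities: list[float], threshold: float = 0.5) -> str:
    """Average per-model P(fake) and threshold the ensemble score."""
    mean_p = sum(probabilities) / len(probabilities)
    return "fake" if mean_p >= threshold else "real"

print(soft_vote([0.92, 0.88, 0.61]))  # all three models lean fake
```

Averaging probabilities rather than hard labels lets a confident model outvote two uncertain ones, which is one reason soft voting often edges out majority voting for heterogeneous ensembles.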

    False News On Social Media: A Data-Driven Survey

    In the past few years, the research community has dedicated growing interest to the issue of false news circulating on social networks. The widespread attention on detecting and characterizing false news has been motivated by considerable backlashes of this threat against the real world. As a matter of fact, social media platforms exhibit peculiar characteristics, with respect to traditional news outlets, which have been particularly favorable to the proliferation of deceptive information. They also present unique challenges for all kinds of potential interventions on the subject. As this issue becomes of global concern, it is also gaining more attention in academia. The aim of this survey is to offer a comprehensive study of the recent advances in detection, characterization, and mitigation of false news propagating on social media, as well as the challenges and open questions that await future research in the field. We use a data-driven approach, focusing on a classification of the features that each study uses to characterize false information and on the datasets used for training classification methods. At the end of the survey, we highlight emerging approaches that look most promising for addressing false news.

    Fake News Detection in Social Networks via Crowd Signals

    Our work considers leveraging crowd signals for detecting fake news and is motivated by tools recently introduced by Facebook that enable users to flag fake news. By aggregating users' flags, our goal is to select a small subset of news every day, send them to an expert (e.g., via a third-party fact-checking organization), and stop the spread of news identified as fake by an expert. The main objective of our work is to minimize the spread of misinformation by stopping the propagation of fake news in the network. It is especially challenging to achieve this objective as it requires detecting fake news with high confidence as quickly as possible. We show that in order to leverage users' flags efficiently, it is crucial to learn about users' flagging accuracy. We develop a novel algorithm, DETECTIVE, that performs Bayesian inference for detecting fake news and jointly learns about users' flagging accuracy over time. Our algorithm employs posterior sampling to actively trade off exploitation (selecting news that maximize the objective value at a given epoch) and exploration (selecting news that maximize the value of information towards learning about users' flagging accuracy). We demonstrate the effectiveness of our approach via extensive experiments and show the power of leveraging community signals for fake news detection.
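The posterior-sampling idea the abstract describes can be sketched with a Beta posterior over each user's flagging accuracy. The class, prior values, and update rule below are assumptions for illustration, not the DETECTIVE algorithm itself:

```python
import random

# Toy sketch of Thompson-style posterior sampling over a user's flagging
# accuracy: a Beta(alpha, beta) posterior sharpened by expert feedback.
# Priors and parameterization are illustrative assumptions.
class FlaggerModel:
    def __init__(self):
        self.alpha = 1.0  # prior pseudo-count of correct flags
        self.beta = 1.0   # prior pseudo-count of incorrect flags

    def update(self, flag_was_correct: bool) -> None:
        """Expert verdicts sharpen the posterior over this user's accuracy."""
        if flag_was_correct:
            self.alpha += 1
        else:
            self.beta += 1

    def sample_accuracy(self) -> float:
        """Random draw from the posterior: uncertain users get explored,
        reliably accurate users get exploited."""
        return random.betavariate(self.alpha, self.beta)
```

Sampling from the posterior (rather than using its mean) is what produces the exploration/exploitation trade-off: a user with few observed flags has a wide posterior, so their sampled accuracy is occasionally high enough to influence which news gets sent to the expert.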

    Identifying Misinformation on YouTube through Transcript Contextual Analysis with Transformer Models

    Misinformation on YouTube is a significant concern, necessitating robust detection strategies. In this paper, we introduce a novel methodology for video classification, focusing on the veracity of the content. We convert the conventional video classification task into a text classification task by leveraging the textual content derived from the video transcripts. We employ advanced machine learning techniques like transfer learning to solve the classification challenge. Our approach incorporates two forms of transfer learning: (a) fine-tuning base transformer models such as BERT, RoBERTa, and ELECTRA, and (b) few-shot learning using the sentence-transformers MPNet and RoBERTa-large. We apply the trained models to three datasets: (a) YouTube vaccine-misinformation videos, (b) YouTube pseudoscience videos, and (c) a Fake-News dataset (a collection of articles). Including the Fake-News dataset extended the evaluation of our approach beyond YouTube videos. Using these datasets, we evaluated the models on distinguishing valid information from misinformation. The fine-tuned models yielded a Matthews Correlation Coefficient > 0.81, accuracy > 0.90, and F1 score > 0.90 on two of the three datasets. Interestingly, the few-shot models outperformed the fine-tuned ones by 20% in both accuracy and F1 score on the YouTube pseudoscience dataset, highlighting the potential utility of this approach -- especially in the context of limited training data.
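The core reduction in this paper, from video classification to text classification over transcripts, is simple to sketch. The keyword stub below stands in for a fine-tuned transformer and is purely an assumption for illustration:

```python
# Sketch of the video -> transcript -> text-classifier reduction.
# classify_text is a keyword stub standing in for inference with a
# fine-tuned model such as BERT or RoBERTa; the cue set is invented.
def classify_text(transcript: str) -> str:
    suspicious = {"microchip", "hoax", "plandemic"}  # illustrative only
    words = set(transcript.lower().split())
    return "misinformation" if words & suspicious else "valid"

def classify_video(transcript_lines: list[str]) -> str:
    """Join the caption lines into one document and classify the text."""
    return classify_text(" ".join(transcript_lines))
```

Once the task is text classification, the whole transformer toolbox (fine-tuning, few-shot sentence encoders) applies unchanged, which is the methodological point of the paper.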

    Automated Crowdturfing Attacks and Defenses in Online Review Systems

    Malicious crowdsourcing forums are gaining traction as sources for spreading misinformation online, but are limited by the costs of hiring and managing human workers. In this paper, we identify a new class of attacks that leverage deep learning language models (Recurrent Neural Networks, or RNNs) to automate the generation of fake online reviews for products and services. Not only are these attacks cheap and therefore more scalable, but they can control the rate of content output to eliminate the signature burstiness that makes crowdsourced campaigns easy to detect. Using Yelp reviews as an example platform, we show how a two-phased review generation and customization attack can produce reviews that are indistinguishable by state-of-the-art statistical detectors. We conduct a survey-based user study to show these reviews not only evade human detection, but also score high on "usefulness" metrics by users. Finally, we develop novel automated defenses against these attacks by leveraging the lossy transformation introduced by the RNN training and generation cycle. We consider countermeasures against our mechanisms, show that they produce unattractive cost-benefit tradeoffs for attackers, and that they can be further curtailed by simple constraints imposed by online service providers.
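The defense this abstract hints at, exploiting the lossy RNN generation cycle, amounts to measuring how far a review's character-level distribution drifts from that of genuine text. The sketch below uses KL divergence for that comparison; the smoothing constant and any decision threshold are assumptions, not the paper's calibrated defense:

```python
import math
from collections import Counter

# Rough sketch of a character-distribution defense: generated text drifts
# from the character frequencies of genuine text, and KL divergence
# quantifies that drift. Smoothing value is an illustrative assumption.
def char_distribution(text: str) -> dict[str, float]:
    """Normalized character frequencies of a text."""
    counts = Counter(text.lower())
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}

def kl_divergence(p: dict, q: dict, eps: float = 1e-6) -> float:
    """D(p || q) over the union of characters, smoothed to avoid log(0)."""
    chars = set(p) | set(q)
    return sum(p.get(c, eps) * math.log(p.get(c, eps) / q.get(c, eps))
               for c in chars)
```

In practice one would estimate the reference distribution from a large corpus of verified-genuine reviews and flag reviews whose divergence exceeds a tuned threshold.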