COVIDFakeExplainer: An Explainable Machine Learning based Web Application for Detecting COVID-19 Fake News
Fake news has emerged as a critical global issue, magnified by the COVID-19
pandemic, underscoring the need for effective preventive tools. Leveraging
machine learning, including deep learning techniques, offers promise in
combatting fake news. This paper goes further by establishing BERT as the
superior model for fake news detection and demonstrates its utility as a tool
to empower the general populace. We have implemented a browser extension,
enhanced with explainability features, enabling real-time identification of
fake news and delivering easily interpretable explanations. To achieve this, we
have employed two publicly available datasets and created seven distinct data
configurations to evaluate three prominent machine learning architectures. Our
comprehensive experiments affirm BERT's exceptional accuracy in detecting
COVID-19-related fake news. Furthermore, we have integrated an explainability
component into the BERT model and deployed it as a service through Amazon's
cloud API hosting (AWS). We have developed a browser extension that interfaces
with the API, allowing users to select and transmit data from web pages,
receiving an intelligible classification in return. This paper presents a
practical end-to-end solution, highlighting the feasibility of constructing a
holistic system for fake news detection, which can significantly benefit
society.
Comment: 7 pages, 4 figures
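As a loose sketch of the extension-to-API round trip described above, the following shows a JSON request/response contract with a stubbed classifier. The field names, cue words, and scoring are invented for illustration; the deployed service would run the fine-tuned BERT model with its explainability component instead.

```python
import json

def classify_request(payload_json):
    """Parse the extension's JSON payload and return a label plus the
    tokens that drove the decision (the 'explanation')."""
    payload = json.loads(payload_json)
    text = payload["text"]
    cues = {"miracle", "cure", "hoax", "secret"}      # stand-in features
    hits = [w for w in text.lower().split() if w.strip(".,!?") in cues]
    label = "fake" if hits else "real"
    return {"label": label, "explanation": hits}

# The extension would POST the user's selected text and render the result.
response = classify_request(json.dumps({"text": "Secret miracle cure revealed!"}))
```

Returning the influential tokens alongside the label is what makes the classification "intelligible" to a non-expert user.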
Uncovering Semantic Inconsistencies and Deceptive Language in False News Using Deep Learning and NLP Techniques for Effective Management
In today's information age, false news and deceptive language have become pervasive, leading to significant challenges for individuals, organizations, and society as a whole. This study focuses on the application of deep learning and natural language processing (NLP) techniques to uncover semantic inconsistencies and deceptive language in false news, with the aim of facilitating effective management strategies.
The research employs advanced deep learning models and NLP algorithms to analyze large volumes of textual data and identify patterns indicative of deceptive language and semantic inconsistencies. By leveraging the power of machine learning, the study aims to enhance the detection and classification of false news articles, enabling proactive management measures. The proposed approach not only examines the superficial aspects of false news but also delves deeper into the linguistic nuances and contextual inconsistencies that are characteristic of deceptive language. By employing advanced NLP techniques, such as sentiment analysis, topic modeling, and named entity recognition, the study strives to identify the underlying manipulative strategies employed by false news purveyors.
The findings from this research have far-reaching implications for effective management. By accurately detecting semantic inconsistencies and deceptive language in false news, organizations can develop targeted strategies to mitigate the spread and impact of misinformation. Additionally, individuals can make informed decisions, enhancing their ability to critically evaluate news sources and protect themselves from falling victim to deceptive practices.
In this research study, we suggest a hybrid system for detecting fake news that incorporates source analysis and machine learning techniques. Our system analyzes the language used in news articles to identify indicators of fake news and evaluates the credibility of the sources cited in the articles. We trained our system on a large dataset of news articles manually annotated as real or fake and evaluated its performance using common metrics such as F1-score, recall, and precision. Compared with other advanced fake news detection systems, our hybrid method achieves a high level of accuracy in detecting false news.
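A minimal sketch of the hybrid idea, combining a linguistic deception score with a source-credibility score. The cue words, credibility table, and weights below are invented for demonstration and are not the system's actual features.

```python
# Illustrative stand-ins; a real system would learn these from data.
DECEPTION_CUES = {"shocking", "unbelievable", "exposed", "they", "hide"}
SOURCE_CREDIBILITY = {"reuters.com": 0.9, "example-blog.net": 0.2}

def hybrid_score(text, source, w_text=0.6, w_source=0.4):
    """Blend a language-based score with a source-based score;
    higher means more likely genuine."""
    words = text.lower().split()
    cue_rate = sum(w in DECEPTION_CUES for w in words) / max(len(words), 1)
    text_score = 1.0 - cue_rate                  # fewer cues = more genuine
    source_score = SOURCE_CREDIBILITY.get(source, 0.5)  # unknown source = neutral
    return w_text * text_score + w_source * source_score

score = hybrid_score("shocking truth they hide", "example-blog.net")
```

Weighting the two signals separately is what lets source analysis override purely linguistic evidence, and vice versa.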
Ensemble machine learning approaches for fake news classification
In today’s interconnected digital landscape, the proliferation of fake news has become a significant challenge, with far-reaching implications for individuals, institutions, and societies. The rapid spread of misleading information undermines the credibility of genuine news outlets and threatens informed decision-making, public trust, and democratic processes. Recognizing the profound relevance and urgency of addressing this issue, this research embarked on a mission to harness the power of machine learning to combat the fake news menace. This study develops an ensemble machine learning model for fake news classification. The object of the research is the spread of fake news; the research subject is machine learning methods for misinformation classification. Methods: we employed three state-of-the-art algorithms: LightGBM, XGBoost, and Balanced Random Forest (BRF). Each model was meticulously trained on a comprehensive dataset curated to encompass a diverse range of news articles, ensuring a broad representation of linguistic patterns and styles. A distinctive feature of the proposed approach is the emphasis on token importance. By leveraging specific tokens that exhibited a high degree of influence on classification outcomes, we enhanced the precision and reliability of the developed models. The empirical results were both promising and illuminating. The LightGBM model emerged as the top performer among the three, registering an impressive F1-score of 97.74% and an accuracy rate of 97.64%. Notably, all three of the proposed models consistently outperformed several existing models previously documented in academic literature. This comparative analysis underscores the efficacy and superiority of the proposed ensemble approach. In conclusion, this study contributes a robust, innovative, and scalable solution to the pressing challenge of fake news detection.
By harnessing the capabilities of advanced machine learning techniques, the research findings pave the way for enhancing the integrity and veracity of information in an increasingly digitalized world, thereby safeguarding public trust and promoting informed discourse.
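The combination step behind an ensemble of members such as LightGBM, XGBoost, and BRF can be sketched as soft voting over per-model probabilities. The probabilities below stand in for real trained-model outputs, and the function is an illustrative simplification, not the study's actual pipeline.

```python
def soft_vote(probas, weights=None):
    """Average per-model fake-probabilities (optionally weighted) and
    threshold the mean at 0.5."""
    weights = weights or [1.0] * len(probas)
    avg = sum(p * w for p, w in zip(probas, weights)) / sum(weights)
    return ("fake" if avg >= 0.5 else "real"), avg

# Three stand-in member outputs for one article.
label, avg = soft_vote([0.9, 0.7, 0.4])
```

Weighted averaging lets a stronger member (here, the first) outvote a weaker one while still smoothing over any single model's errors.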
False News On Social Media: A Data-Driven Survey
In the past few years, the research community has dedicated growing interest
to the issue of false news circulating on social networks. The widespread
attention on detecting and characterizing false news has been motivated by
considerable backlashes of this threat against the real world. As a matter of
fact, social media platforms exhibit peculiar characteristics, with respect to
traditional news outlets, which have been particularly favorable to the
proliferation of deceptive information. They also present unique challenges for
all kind of potential interventions on the subject. As this issue becomes of
global concern, it is also gaining more attention in academia. The aim of this
survey is to offer a comprehensive study on the recent advances in terms of
detection, characterization and mitigation of false news that propagate on
social media, as well as the challenges and the open questions that await
future research on the field. We use a data-driven approach, focusing on a
classification of the features that are used in each study to characterize
false information and on the datasets used for instructing classification
methods. At the end of the survey, we highlight emerging approaches that look
most promising for addressing false news.
Fake News Detection in Social Networks via Crowd Signals
Our work considers leveraging crowd signals for detecting fake news and is
motivated by tools recently introduced by Facebook that enable users to flag
fake news. By aggregating users' flags, our goal is to select a small subset of
news every day, send them to an expert (e.g., via a third-party fact-checking
organization), and stop the spread of news identified as fake by an expert. The
main objective of our work is to minimize the spread of misinformation by
stopping the propagation of fake news in the network. It is especially
challenging to achieve this objective as it requires detecting fake news with
high-confidence as quickly as possible. We show that in order to leverage
users' flags efficiently, it is crucial to learn about users' flagging
accuracy. We develop a novel algorithm, DETECTIVE, that performs Bayesian
inference for detecting fake news and jointly learns about users' flagging
accuracy over time. Our algorithm employs posterior sampling to actively trade
off exploitation (selecting news that maximize the objective value at a given
epoch) and exploration (selecting news that maximize the value of information
towards learning about users' flagging accuracy). We demonstrate the
effectiveness of our approach via extensive experiments and show the power of
leveraging community signals for fake news detection.
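The joint learning of users' flagging accuracy described above can be sketched with a Beta-Bernoulli model and posterior (Thompson) sampling: each user's accuracy gets a Beta posterior that is updated as expert verdicts arrive, and sampled accuracies score the news. This is an illustrative simplification, not the DETECTIVE algorithm itself.

```python
import random

class FlaggerModel:
    """Beta posterior over one user's flagging accuracy."""
    def __init__(self):
        self.alpha, self.beta = 1.0, 1.0   # uniform Beta(1, 1) prior

    def sample_accuracy(self):
        # Posterior sampling trades off exploitation and exploration:
        # uncertain users get optimistic or pessimistic draws.
        return random.betavariate(self.alpha, self.beta)

    def update(self, flag_was_correct):
        # Expert feedback sharpens the posterior over time.
        if flag_was_correct:
            self.alpha += 1
        else:
            self.beta += 1

random.seed(0)
u = FlaggerModel()
for _ in range(20):                 # expert confirms 20 correct flags
    u.update(True)
posterior_mean = u.alpha / (u.alpha + u.beta)
```

After 20 confirmed flags the posterior mean is 21/22, so this user's flags would weigh heavily when selecting the day's candidates for expert review.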
Identifying Misinformation on YouTube through Transcript Contextual Analysis with Transformer Models
Misinformation on YouTube is a significant concern, necessitating robust
detection strategies. In this paper, we introduce a novel methodology for video
classification, focusing on the veracity of the content. We convert the
conventional video classification task into a text classification task by
leveraging the textual content derived from the video transcripts. We employ
advanced machine learning techniques like transfer learning to solve the
classification challenge. Our approach incorporates two forms of transfer
learning: (a) fine-tuning base transformer models such as BERT, RoBERTa, and
ELECTRA, and (b) few-shot learning using sentence-transformers MPNet and
RoBERTa-large. We apply the trained models to three datasets: (a) YouTube
Vaccine-misinformation related videos, (b) YouTube Pseudoscience videos, and
(c) Fake-News dataset (a collection of articles). Including the Fake-News
dataset extended the evaluation of our approach beyond YouTube videos. Using
these datasets, we evaluated the models distinguishing valid information from
misinformation. The fine-tuned models yielded Matthews Correlation
Coefficient > 0.81, accuracy > 0.90, and F1 score > 0.90 in two of the three datasets.
Interestingly, the few-shot models outperformed the fine-tuned ones by 20% in
both Accuracy and F1 score for the YouTube Pseudoscience dataset, highlighting
the potential utility of this approach -- especially in the context of limited
training data.
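The Matthews Correlation Coefficient reported above is a standard binary-classification metric computable directly from confusion-matrix counts; the counts in the example are made up for illustration.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient: +1 is perfect, 0 is chance,
    -1 is total disagreement."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

score = mcc(tp=90, tn=85, fp=15, fn=10)
```

Unlike raw accuracy, MCC stays informative on imbalanced data, which is why it is a sensible headline metric for misinformation datasets.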
Automated Crowdturfing Attacks and Defenses in Online Review Systems
Malicious crowdsourcing forums are gaining traction as sources of spreading
misinformation online, but are limited by the costs of hiring and managing
human workers. In this paper, we identify a new class of attacks that leverage
deep learning language models (Recurrent Neural Networks or RNNs) to automate
the generation of fake online reviews for products and services. Not only are
these attacks cheap and therefore more scalable, but they can control the rate of
content output to eliminate the signature burstiness that makes crowdsourced
campaigns easy to detect.
Using Yelp reviews as an example platform, we show how a two-phase review
generation and customization attack can produce reviews that are
indistinguishable by state-of-the-art statistical detectors. We conduct a
survey-based user study to show these reviews not only evade human detection,
but also score high on "usefulness" metrics by users. Finally, we develop novel
automated defenses against these attacks, by leveraging the lossy
transformation introduced by the RNN training and generation cycle. We consider
countermeasures against our mechanisms, show that they produce unattractive
cost-benefit tradeoffs for attackers, and that they can be further curtailed by
simple constraints imposed by online service providers.
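As a very loose illustration of spotting generation artifacts in text distributions (the paper's actual defense exploits the lossy transformation of the RNN training and generation cycle and is considerably more sophisticated), one can compare character-frequency distributions against a human reference with KL divergence.

```python
import math
from collections import Counter

def char_dist(text, alphabet):
    """Laplace-smoothed character-frequency distribution over an alphabet."""
    counts = Counter(c for c in text.lower() if c in alphabet)
    total = sum(counts.values())
    return {c: (counts[c] + 1) / (total + len(alphabet)) for c in alphabet}

def kl_divergence(p, q):
    """How surprising q's text is under reference distribution p."""
    return sum(p[c] * math.log(p[c] / q[c]) for c in p)

alphabet = "abcdefghijklmnopqrstuvwxyz "
ref = "the food was great and the service friendly"
human = char_dist(ref, alphabet)
same = kl_divergence(human, char_dist(ref, alphabet))
diff = kl_divergence(human, char_dist("zzzz qqqq xxxx zzzz qqqq xxxx", alphabet))
```

A detector would threshold this divergence: machine-generated text whose character distribution drifts from the human reference scores high, while genuine reviews score near zero.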