11 research outputs found
Bengali Fake Review Detection using Semi-supervised Generative Adversarial Networks
This paper investigates the potential of semi-supervised Generative
Adversarial Networks (GANs) to fine-tune pretrained language models for
distinguishing Bengali fake reviews from real ones using only a small amount
of annotated data. With the rise of social media and e-commerce, detecting
fake or deceptive reviews is increasingly important to protect consumers from
being misled by false information. Identifying a fake review is difficult for
any machine learning model, especially in a low-resource language like
Bengali. We demonstrate that the proposed semi-supervised GAN-LM architecture
(a generative adversarial network on top of a pretrained language model) is a
viable solution for classifying Bengali fake reviews: even with only 1024
annotated samples, BanglaBERT with a semi-supervised GAN (SSGAN) achieved an
accuracy of 83.59% and an F1-score of 84.89%, outperforming the other
pretrained language models tested, namely BanglaBERT generator, Bangla BERT
Base, and Bangla-Electra, by almost 3%, 4%, and 10% respectively in terms of
accuracy. The experiments were conducted on a manually labeled food review
dataset consisting of a total of 6014 real and fake reviews collected from
various social media groups. Researchers struggling not only with fake review
detection but with other classification problems hampered by a lack of
labeled data may find a solution in our proposed methodology.
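The core idea of SSGAN-style training is a discriminator with k real classes plus one extra "generated" class, trained with a supervised loss on labeled data and unsupervised losses on unlabeled and generator-produced data. A minimal NumPy sketch of that combined objective follows; it is illustrative only, not the paper's implementation, and the embeddings are random stand-ins for frozen BanglaBERT sentence features (all names and dimensions are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

K = 2        # real classes: genuine review vs. fake review
HIDDEN = 8   # embedding size of the (stand-in) language model

# Random stand-ins for frozen pretrained-LM sentence embeddings.
labeled = rng.normal(size=(4, HIDDEN))
labels = np.array([0, 1, 0, 1])
unlabeled = rng.normal(size=(4, HIDDEN))
generated = rng.normal(size=(4, HIDDEN))  # produced by the generator

# Discriminator head: K real classes + 1 extra "generated" class.
W = rng.normal(size=(HIDDEN, K + 1)) * 0.1

def discriminator_loss(W):
    # Supervised term: cross-entropy over the K real classes on labeled data.
    p = softmax(labeled @ W)
    sup = -np.log(p[np.arange(len(labels)), labels] + 1e-8).mean()
    # Unsupervised term: unlabeled samples should NOT fall in class K...
    p_u = softmax(unlabeled @ W)
    unsup_real = -np.log(1.0 - p_u[:, K] + 1e-8).mean()
    # ...while generator outputs SHOULD.
    p_g = softmax(generated @ W)
    unsup_fake = -np.log(p_g[:, K] + 1e-8).mean()
    return sup + unsup_real + unsup_fake

loss = discriminator_loss(W)
```

In a full system the three terms would be minimized by gradient descent over both the head and (optionally) the language model, while the generator is trained to make `unsup_fake` hard to satisfy.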
Social Media and Fake News Detection using Adversarial Collaboration
The diffusion of fake information on social media networks obscures public perception of events, news, and relevant content. Intentionally misleading news may promote negative online experiences and influence societal behavioral changes such as increased anxiety, loneliness, and inadequacy. Adversarial attacks aim to create misinformation in online information systems; this behavior can be viewed as an instrument for manipulating online social media networks for cultural, social, economic, and political gains. We present a method to test a deep learning model, a long short-term memory (LSTM) network, using adversarial examples generated from a transformer model. The paper examines the features in machine learning algorithms that propagate fake news. Another goal is to evaluate and compare the usefulness of generative adversarial networks against LSTM recurrent neural network algorithms in identifying fake news. A closer look at how adversarial attacks are implemented in social media systems helps build robust intelligent systems that can withstand future vulnerabilities.
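To make the idea of adversarial examples for text concrete, here is a toy sketch: a naive keyword-based fake-news scorer and a word-substitution attack that preserves meaning for a human reader while evading the detector. The lexicon and scorer are hypothetical illustrations, not the paper's transformer-based generation method:

```python
# Hypothetical trigger lexicon mapping sensational words to neutral synonyms.
TRIGGERS = {"shocking": "surprising", "miracle": "notable", "exposed": "reported"}

def naive_score(text):
    # Fraction of words that are sensational trigger words.
    words = text.lower().split()
    hits = sum(1 for w in words if w in TRIGGERS)
    return hits / max(len(words), 1)

def adversarial_rewrite(text):
    # Swap each trigger word for its neutral synonym to evade the detector.
    return " ".join(TRIGGERS.get(w.lower(), w) for w in text.split())

headline = "shocking miracle cure exposed by insiders"
print(naive_score(headline))                       # 0.5
print(naive_score(adversarial_rewrite(headline)))  # 0.0
```

Robustness testing in the paper's sense asks whether a trained model (the LSTM) keeps its prediction under such meaning-preserving perturbations; a model that flips, as this toy scorer does, is vulnerable.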
Misinformation Containment Using NLP and Machine Learning: Why the Problem Is Still Unsolved
Despite increased attention and substantial research claiming outstanding successes, the problem of misinformation containment has only been growing in recent years, with few signs of respite. Misinformation is rapidly changing its latent characteristics and spreading vigorously in a multi-modal fashion, sometimes in a more damaging manner than viruses and other malicious programs on the internet. This chapter examines existing research in natural language processing and machine learning aimed at stopping the spread of misinformation, analyzes why that research has not been practical enough to be incorporated into social media platforms, and provides future research directions. The state-of-the-art feature engineering, approaches, and algorithms used for the problem are expounded in the process.
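As one small illustration of the hand-crafted feature engineering this line of work relies on, a detector might compute stylistic signals like the ones below. The exact feature set is hypothetical, not taken from any specific system surveyed in the chapter:

```python
def stylistic_features(text):
    # Simple surface-level cues often associated with sensational writing.
    words = text.split()
    n = max(len(words), 1)
    return {
        "exclamation_rate": text.count("!") / n,
        "all_caps_rate": sum(w.isupper() and len(w) > 1 for w in words) / n,
        "avg_word_len": sum(len(w) for w in words) / n,
        "question_marks": text.count("?"),
    }

feats = stylistic_features("BREAKING!!! You WON'T believe this?")
```

Such features are cheap and interpretable, but, as the chapter argues, misinformation shifts its surface characteristics quickly, which is one reason purely feature-engineered detectors age poorly in deployment.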
Exploring responsible applications of Synthetic Data to advance Online Safety Research and Development
The use of synthetic data provides an opportunity to accelerate online safety
research and development efforts while showing potential for bias mitigation,
facilitating data storage and sharing, preserving privacy and reducing exposure
to harmful content. However, the responsible use of synthetic data requires
caution regarding anticipated risks and challenges. This short report explores
the potential applications of synthetic data to the domain of online safety,
and addresses the ethical challenges that effective use of the technology may
present.
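One common pattern the report's theme suggests is template-based generation of synthetic labeled examples, which lets researchers build and share training data without storing or exposing annotators to real harmful content. The templates, slots, and labels below are hypothetical placeholders, not drawn from the report:

```python
import random

# Hypothetical templates paired with labels; each template has one slot.
TEMPLATES = [
    ("you are such a {insult}", "abusive"),
    ("have a great {time_of_day}", "benign"),
]
SLOTS = {
    "insult": ["fool", "loser"],
    "time_of_day": ["morning", "evening"],
}

def generate(seed=0):
    rng = random.Random(seed)
    data = []
    for template, label in TEMPLATES:
        # Fill the template's slot with every value it allows.
        slot = template.split("{")[1].split("}")[0]
        for value in SLOTS[slot]:
            data.append((template.format(**{slot: value}), label))
    rng.shuffle(data)  # deterministic shuffle for reproducible splits
    return data

data = generate()
```

The responsible-use caveats from the report apply directly here: synthetic examples encode whatever biases the template authors hold, so any such dataset needs the same scrutiny as collected data.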
Reinforcement Learning for Generative AI: A Survey
Deep generative AI has long been an essential topic in the machine
learning community, with impact on application areas such as text
generation and computer vision. The major paradigm for training a
generative model is maximum likelihood estimation, which pushes the
learner to capture and approximate the target data distribution by
decreasing the divergence between the model distribution and the target
distribution. This formulation successfully establishes the objective of
generative tasks, but it cannot satisfy all the requirements a user might
expect from a generative model. Reinforcement learning, which injects new
training signals by creating new objectives, has demonstrated the power
and flexibility to incorporate human inductive bias from multiple angles,
such as adversarial learning, hand-designed rules, and learned reward
models, to build performant models. As a result, reinforcement learning
has become a trending research field and has stretched the limits of
generative AI in both model design and application. It is therefore timely
to summarize recent advances in a comprehensive review. Although surveys
of individual application areas have appeared recently, this survey aims
to provide a high-level review that spans a range of application areas. We
provide a rigorous taxonomy of the area and broad coverage of various
models and applications. Notably, we also survey the fast-developing large
language model area. We conclude by outlining potential directions that
might address the limitations of current models and expand the frontiers
of generative AI.
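The divergence-minimization view of maximum likelihood mentioned in the abstract is a standard identity, stated here for clarity: maximizing expected log-likelihood under the data distribution is the same as minimizing the forward KL divergence, because the data entropy term does not depend on the model parameters:

```latex
\[
\arg\max_{\theta}\; \mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log p_{\theta}(x)\right]
  \;=\; \arg\min_{\theta}\; \mathrm{KL}\!\left(p_{\text{data}} \,\|\, p_{\theta}\right),
\]
\[
\text{since}\qquad
\mathrm{KL}\!\left(p_{\text{data}} \,\|\, p_{\theta}\right)
  \;=\; -\,\mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log p_{\theta}(x)\right]
        \;-\; H\!\left(p_{\text{data}}\right).
\]
```

Reinforcement learning sidesteps this fixed objective by replacing (or augmenting) the log-likelihood with an expected reward, which is where learned reward models and adversarial signals enter.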