3 research outputs found
A Multi-type Classifier Ensemble for Detecting Fake Reviews Through Textual-based Feature Extraction
The financial impact of online reviews has prompted some fraudulent sellers to generate fake consumer reviews for either promoting their products or discrediting competing products. In this study, we propose a novel ensemble model - the Multitype Classifier Ensemble (MtCE) - combined with a textual-based featuring method, which is relatively independent of the system, to detect fake online consumer reviews. Unlike other ensemble models that utilise only the same type of single classifier, our proposed ensemble utilises several customised machine learning classifiers (including deep learning models) as
its base classifiers. The results of our experiments show that the MtCE can adequately detect fake reviews, and that it outperforms other single and ensemble methods in terms of accuracy and other measurements in all the relevant public datasets used in this study. Moreover, if set correctly, the parameters of MtCE, such as base-classifier types, the total number of base classifiers, bootstrap and the method to vote on output (e.g., majority or priority), further improve the performance of the proposed ensemble.
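As a rough illustration of the output-voting parameter mentioned in this abstract, the Python sketch below combines the labels predicted by several base classifiers for a single review. It is not the authors' implementation: the function names, the tie-breaking rule, and the reading of "priority" voting as a weighted vote are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def majority_vote(predictions):
    """Return the label chosen by the most base classifiers.

    Ties go to the tied label that appears first in `predictions`
    (an arbitrary rule chosen for this sketch).
    """
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


def priority_vote(predictions, priorities):
    """Weighted vote: each classifier's label is counted with its priority.

    Hypothetical reading of "priority" voting; the abstract does not
    define the exact rule.
    """
    scores = defaultdict(float)
    for label, weight in zip(predictions, priorities):
        scores[label] += weight
    return max(scores, key=scores.get)


# Labels from three hypothetical base classifiers for one review.
labels = ["fake", "genuine", "genuine"]
print(majority_vote(labels))             # plain majority rule
print(priority_vote(labels, [5, 1, 1]))  # first classifier trusted most
```

With these inputs the two rules disagree: the majority rule returns "genuine", while the weighted rule returns "fake" because the first classifier's priority outweighs the other two combined. This is one way the choice of voting method could change the ensemble's output.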
A Multilingual Spam Reviews Detection Based on Pre-Trained Word Embedding and Weighted Swarm Support Vector Machines
Online reviews are important information that customers seek when deciding to buy products or
services. Also, organizations benefit from these reviews as essential feedback for their products or services.
Such information requires reliability, especially during the COVID-19 pandemic, which showed a massive
increase in online reviews due to quarantine and staying at home. Not only did the number of reviews grow,
but the context and preferences of reviews also changed during the pandemic. Spam reviewers therefore respond to these changes
and improve their deception techniques. Spam reviews usually consist of misleading, fake, or fraudulent
reviews that tend to deceive customers for the purpose of making money or causing harm to other competitors.
Hence, this work presents a Weighted Support Vector Machine (WSVM) and Harris Hawks Optimization
(HHO) for spam review detection. The HHO works as an algorithm for optimizing hyperparameters and
feature weighting. Three different language corpora have been used as datasets, namely English, Spanish, and
Arabic in order to solve the multilingual problem in spam reviews. Moreover, pre-trained word embedding
(BERT) has been applied alongside three word representation methods (NGram-3, TF-IDF, and One-hot
encoding). Four experiments have been conducted, each focused on solving and demonstrating different
aspects. In all experiments, the proposed approach showed excellent results compared with other state-of-the-art
algorithms. Specifically, the WSVM-HHO achieved an accuracy of 88.163%, 71.913%, 89.565%,
and 84.270%, for English, Spanish, Arabic, and Multilingual datasets, respectively. Further, a deep analysis
has been conducted to investigate the context of reviews before and after the COVID-19 situation. In addition,
a new dataset has been created that merges statistical features with the previous textual features
to improve detection performance.
Projects TED2021-129938B-I0, PID2020-113462RB-I00, PDC2022-133900-I00, and PID2020-115570GB-C22, granted by Ministerio Español de Ciencia e Innovación (MCIN/AEI/10.13039/501100011033; MCIN/AEI Next GenerationEU/PRT
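The three word representation methods named in this abstract can be sketched in plain Python as follows. This is only an illustrative sketch, not the paper's pipeline. It assumes that "NGram-3" means word trigrams, that one-hot encoding means a binary presence vector over a fixed vocabulary, and that TF-IDF uses the common tf × log(N/df) variant; all function names are hypothetical.

```python
import math
from collections import Counter


def word_ngrams(tokens, n=3):
    """Return all contiguous word n-grams (trigrams by default)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def one_hot(tokens, vocabulary):
    """Binary vector: 1 if the vocabulary term occurs in the review, else 0."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]


def tfidf(docs):
    """Per-document TF-IDF weights using tf * log(N / df).

    `docs` is a list of token lists; returns one {term: weight} dict per doc.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


# Two toy reviews, already tokenised (assumed whitespace tokenisation).
reviews = [["great", "product", "fast", "delivery"], ["bad", "product"]]
print(word_ngrams(reviews[0]))
print(one_hot(reviews[1], ["bad", "great", "product"]))
print(tfidf(reviews))
```

In this variant, a term that occurs in every review (here "product") receives a weight of zero, since log(N/df) = log(1) = 0, while terms specific to one review keep a positive weight.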
Spam Reviews Detection in the Time of COVID-19 Pandemic: Background, Definitions, Methods and Literature Analysis
This work has been partially funded by projects PID2020-113462RB-I00 (ANIMALICOS), granted by Ministerio Espanol de Economia y Competitividad; projects P18-RT-4830 and A-TIC-608-UGR20 granted by Junta de Andalucia, and project B-TIC-402-UGR18 (FEDER and Junta de Andalucia).
During the recent COVID-19 pandemic, people were forced to stay at home to protect
their own and others’ lives. As a result, remote technology is being considered more in all aspects
of life. One important example of this is online reviews, where the number of reviews increased
rapidly in the last two years according to Statista and Rize reports. People started to depend more
on these reviews as a result of the mandatory physical distancing employed in all countries. With no
one to speak to about product and service feedback, reading and posting online reviews became
an important part of discussion and decision-making, especially for individuals and organizations.
However, the growth in online review usage also provoked an increase in spam reviews. Spam
reviews can be identified as fraudulent, malicious, and fake reviews written for the purpose of profit
or publicity. A number of spam detection methods have been proposed to solve this problem. As
part of this study, we outline the concepts and detection methods of spam reviews, along with
their implications in the environment of online reviews. The study addresses all the spam review
detection studies for the years 2020 and 2021. In other words, we analyze and examine all works
presented during the COVID-19 situation. We then highlight the differences between the works before
and after the pandemic in terms of review behavior and research findings. Furthermore, nine
different detection approaches have been classified in order to investigate their specific advantages,
limitations, and ways to improve their performance. Additionally, a literature analysis, discussion,
and future directions are presented.