Detecting Fake Reviews: Just a Matter of Data

Abstract

As the range of products sold online keeps growing, so does the incentive for market participants to write fake reviews to gain a competitive edge. This article demonstrates the effectiveness of detecting fake reviews with combinations of spam detection features that go beyond the review-based features typically used. Using a spectrum of feature sets offers greater accuracy in identifying fake reviews than using review-based features only, and classifying with a machine learning algorithm on different numbers of feature sets further elucidates the difference in performance. Benchmark comparisons show that a technique that prioritizes features by importance benefits from drawing on features from multiple feature sets, and that feature sets built from review, reviewer, and product data can achieve the greatest accuracy.
