56 research outputs found

    Identification of Opinion Spammers using Reviewer Reputation and Clustering Analysis

    Online reviews have become an important resource for consumers making purchasing decisions. Unfortunately, malicious sellers try to game the system by hiring a person or team (called spammers) to fabricate fake reviews that improve their reputation. Existing methods mainly treat the problem as general binary classification or rely on heuristic rules. However, supervised learning methods rely heavily on a large number of examples of deceptive and truthful opinions labeled by domain experts, and most of the features used in heuristic strategies ignore the group organization among spammers. In this paper, an effective method of identifying opinion spammers is proposed. First, suspected spammers are detected by unsupervised learning based on reviewer reputation. We believe that a reviewer’s reputation is directly related to the quality of their reviews: in general, a review written by a user with lower reputation is of lower quality and more likely to be fake. The model therefore assigns a reputation score to each reviewer, using content-based factors and reviewer activeness. Then, on the set of suspected spammers, a k-center clustering algorithm is applied to further identify spammers based on observed bursts in review release times. Experimental results on an Amazon dataset are encouraging and indicate that our approach achieves high accuracy and recall.
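The clustering step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which k-center variant the paper uses, so the code below assumes the standard greedy farthest-first heuristic applied to one-dimensional review timestamps. The function name `k_center`, the use of days as the time unit, and the sample data are all hypothetical.

```python
from typing import List, Tuple


def k_center(points: List[float], k: int) -> Tuple[List[float], List[int]]:
    """Greedy farthest-first k-center on 1-D points (e.g. review times in days).

    Assumption: this is the textbook 2-approximation heuristic, not
    necessarily the exact procedure used in the paper.
    """
    if not points or k <= 0:
        raise ValueError("need at least one point and k >= 1")

    # Start from the first point; dist[i] tracks the distance from
    # points[i] to its nearest center chosen so far.
    centers = [points[0]]
    dist = [abs(p - centers[0]) for p in points]

    while len(centers) < min(k, len(points)):
        # Pick the point that is farthest from every current center.
        idx = max(range(len(points)), key=lambda i: dist[i])
        if dist[idx] == 0:
            break  # all remaining points coincide with a center
        centers.append(points[idx])
        dist = [min(d, abs(p - points[idx])) for d, p in zip(dist, points)]

    # Assign each point to the index of its nearest center.
    labels = [
        min(range(len(centers)), key=lambda c: abs(p - centers[c]))
        for p in points
    ]
    return centers, labels


# Hypothetical release times (days since the product listing went live).
times = [1.0, 1.5, 2.0, 30.0, 30.5, 90.0]
centers, labels = k_center(times, k=3)
```

Reviews that share a label fall into the same time window, so a cluster that packs many suspected reviewers into a short span is a candidate burst. How clusters are scored after that is not described in the abstract and is left out here.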

    Survey of review spam detection using machine learning techniques

    Opinion spam detection: using multi-iterative graph-based model

    The demand to detect opinion spam, using opinion mining applications to prevent its damaging effects on e-commerce reputations, is on the rise in many business sectors globally. Existing spam detection techniques consider only one or two types of spam entities, such as review, reviewer, group of reviewers, and product. They also use a limited number of features related to behaviour, content and the relations between entities, which reduces detection accuracy. In addition, these techniques mostly rely on synthetic datasets to analyse their models and cannot readily be applied in a real-world environment. To address this, a novel graph-based model called “Multi-iterative Graph-based opinion Spam Detection” (MGSD) is proposed, in which all types of entities are considered simultaneously within a unified structure. Using this approach, the model reveals both implicit (i.e., similar entities’) and explicit (i.e., different entities’) relationships. The MGSD model is able to evaluate the ‘spamicity’ effects of entities more efficiently because it applies a novel multi-iterative algorithm which considers different sets of factors to update the spamicity score of entities. To enhance the accuracy of the MGSD detection model, a larger number of existing weighted features, along with novel proposed features from different categories, were selected using a combination of feature fusion techniques and machine learning (ML) algorithms. Because it employs domain-independent features, the MGSD model can also be generalised and applied to various opinionated documents. The output of the MGSD model showed that our feature selection and feature fusion techniques yielded a remarkable improvement in detecting spam. The findings of this study showed that MGSD could improve the accuracy of state-of-the-art ML and graph-based techniques by around 5.6% and 4.8%, respectively, while achieving an accuracy of 93% for spam detection on our synthetic crowdsourced dataset and 95.3% on Ott's crowdsourced dataset.
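The abstract does not spell out the MGSD update rule, so the following is only a generic sketch of iterative score propagation on a reviewer-review graph, not the MGSD algorithm itself. Everything in it is an assumption made for illustration: the function name `propagate_spamicity`, the blending weight `alpha`, the fixed iteration count, and the prior scores.

```python
from typing import Dict, Tuple


def propagate_spamicity(
    review_prior: Dict[str, float],
    review_author: Dict[str, str],
    alpha: float = 0.5,
    iters: int = 20,
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Iteratively exchange scores between reviews and their authors.

    Assumed (hypothetical) rule, NOT taken from the paper:
      reviewer score = mean score of the reviewer's reviews
      review score   = alpha * prior + (1 - alpha) * author's score
    """
    review_score = dict(review_prior)
    reviewer_score: Dict[str, float] = {}

    for _ in range(iters):
        # Aggregate review scores per author.
        sums: Dict[str, float] = {}
        counts: Dict[str, int] = {}
        for review, author in review_author.items():
            sums[author] = sums.get(author, 0.0) + review_score[review]
            counts[author] = counts.get(author, 0) + 1
        reviewer_score = {a: sums[a] / counts[a] for a in sums}

        # Push the author's score back onto each review, anchored by its prior.
        review_score = {
            r: alpha * review_prior[r]
            + (1 - alpha) * reviewer_score[review_author[r]]
            for r in review_prior
        }

    return review_score, reviewer_score


# Hypothetical priors in [0, 1], e.g. from content-based features.
priors = {"r1": 0.9, "r2": 0.7, "r3": 0.1}
authors = {"r1": "u1", "r2": "u1", "r3": "u2"}
review_score, reviewer_score = propagate_spamicity(priors, authors)
```

A model covering more entity types, as the abstract describes, would add products and reviewer groups as further node types with their own update rules. Those rules are not given in the abstract and are not guessed at here.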

    Man vs machine – Detecting deception in online reviews

    This study focused on three main research objectives: analyzing the methods used to identify deceptive online consumer reviews, evaluating insights provided by multi-method automated approaches based on individual and aggregated review data, and formulating a review interpretation framework for identifying deception. The theoretical framework is based on two critical deception-related models, information manipulation theory and self-presentation theory. The findings confirm the interchangeable characteristics of the various automated text analysis methods in drawing insights about review characteristics and underline their significant complementary aspects. An integrative multi-method model that approaches the data at the individual and aggregate level provides more complex insights regarding the quantity and quality of review information, sentiment, cues about its relevance and contextual information, perceptual aspects, and cognitive material

    A survey on opinion spam detection methods

    Over the past decade, fake reviews, also known as opinion spam, have plagued the e-commerce sector around the world. Opinion spam is considered extremely harmful because it can be used to manipulate the sentiment around a product or service, which in turn can damage the sales and reputation of a company. Over the years, extensive research has used natural language processing to extract textual features and combined them with various machine learning algorithms for opinion spam detection. The majority of the reviewed literature has focused on supervised learning techniques using artificially crafted datasets. The purpose of this paper is twofold: to analyze the machine learning techniques proposed in the extant literature for detecting opinion spam and compare their accuracies, and to provide further insights for future researchers in the field. This survey concludes that semi-supervised techniques using multi-aspect features of reviews, reviewers, and products can provide better results in spam detection. Furthermore, the lack of accurately labeled datasets presents a major challenge in the field of fake review detection.

    Improved Techniques for Online Review Spam Detection

    The rapid growth in the number of e-commerce websites has made the internet an extensive source of product reviews. Since there is no scrutiny of the quality of the reviews written, anyone can write essentially anything, which leads to review spam. There has been an increase in deceptive review spam: fictitious reviews that have been deliberately fabricated to seem genuine. In this work, we explore both supervised and unsupervised methodologies to identify review spam. Improved techniques are proposed to assemble the most effective feature set for model building. Sentiment analysis and its results are also integrated into spam review detection. Several well-known classifiers are applied to the tagged dataset to obtain the best performance. We also apply a clustering approach to an unlabelled Amazon reviews dataset. From our results, we identify the most decisive attributes for detecting spam and spammers. We also suggest practices that websites could adopt to detect review spam.