Fake review detection using time series
Today’s e-commerce depends heavily on customer reviews posted on opinion-sharing websites, which are growing rapidly. These reviews matter not only because they affect potential customers’ purchase decisions but also because manufacturers and business owners use them to reshape and customize their products and to manage competition with rivals in the marketplace. Moreover, opinion mining techniques that analyze customer reviews from opinion-sharing websites cannot produce accurate results when a dataset mixes spam reviews with truthful ones. Employing review spam detection techniques on review websites is therefore essential to provide reliable resources for customers, manufacturers and researchers. This study aims to detect spam reviews using time series. To achieve this, the proposed method first detects suspicious time intervals that contain a high number of reviews. Then a combination of three features, i.e. the rating of the review, the similarity percentage of review contents, and the number of other reviews written by the same reviewer, is used to score each review. Finally, a threshold on the total score assigned to each review serves as the border line between spam and genuine reviews. Evaluation of the results shows that the proposed method is highly effective in distinguishing spam from non-spam reviews. Furthermore, the combination of all features used in this research produced the best results. This fact represents the effectiveness of each feature.
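The pipeline described in this abstract (find bursty time intervals, score each review in them on three features, apply a threshold) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the window length, the burst rule (count above `factor` times the mean), the Jaccard word overlap used as the similarity measure, the individual feature scores and the threshold are all assumptions, since the abstract does not specify them. The reviewer's review count is taken from the same list for simplicity.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Review:
    reviewer: str
    day: int      # days since the product's first review
    rating: int   # star rating, 1..5
    text: str

def suspicious_windows(reviews, window=7, factor=2.0):
    """Indices of time windows whose review count exceeds factor * mean count."""
    if not reviews:
        return set()
    counts = Counter(r.day // window for r in reviews)
    mean = len(reviews) / (max(counts) + 1)
    return {w for w, c in counts.items() if c > factor * mean}

def jaccard(a, b):
    """Word-set overlap between two texts (assumed stand-in for 'similarity percentage')."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def score(review, burst, reviews_per_author):
    # Assumption: extreme ratings count as suspicious.
    rating_score = 1.0 if review.rating in (1, 5) else 0.0
    # Highest similarity to any other review in the same burst.
    sim_score = max((jaccard(review.text, o.text) for o in burst if o is not review),
                    default=0.0)
    # Assumption: a reviewer with no other reviews counts as suspicious.
    activity_score = 1.0 if reviews_per_author[review.reviewer] <= 1 else 0.0
    return rating_score + sim_score + activity_score

def flag_spam(reviews, window=7, factor=2.0, threshold=2.0):
    """Return reviews inside bursty windows whose total score reaches the threshold."""
    per_author = Counter(r.reviewer for r in reviews)
    flagged = []
    for w in sorted(suspicious_windows(reviews, window, factor)):
        burst = [r for r in reviews if r.day // window == w]
        flagged += [r for r in burst if score(r, burst, per_author) >= threshold]
    return flagged

# Toy data: one regular reviewer plus five one-off accounts posting the same text.
spam_text = "Best product ever, buy it now!"
reviews = [
    Review("alice", 0, 4, "Arrived on time and works as described"),
    Review("alice", 8, 3, "Decent value for the price"),
    Review("alice", 16, 3, "Solid blender, a bit noisy but works fine"),
    Review("alice", 25, 4, "Still going strong after a month"),
] + [Review(f"s{i}", 14 + i, 5, spam_text) for i in range(1, 6)]

flagged = flag_spam(reviews)
print(sorted(r.reviewer for r in flagged))  # ['s1', 's2', 's3', 's4', 's5']
```

In the toy data only the third week is bursty, and within it the genuine review scores 0 while each duplicated five-star review scores 3, so the threshold of 2 separates them.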
Detecting Singleton Review Spammers Using Semantic Similarity
Online reviews have increasingly become a very important resource for
consumers when making purchases. However, it is becoming more and more difficult
for people to make well-informed buying decisions without being deceived by
fake reviews. Prior works on the opinion spam problem mostly considered
classifying fake reviews using behavioral user patterns. They focused on
prolific users who write more than a couple of reviews, discarding one-time
reviewers. The number of singleton reviewers, however, is expected to be high for
many review websites. While behavioral patterns are effective when dealing with
elite users, for one-time reviewers, the review text needs to be exploited. In
this paper we tackle the problem of detecting fake reviews written by the same
person using multiple names, posting each review under a different name. We
propose two methods to detect similar reviews and show the results generally
outperform the vectorial similarity measures used in prior works. The first
method extends the semantic similarity between words to the review level. The
second method is based on topic modeling and exploits the similarity of the
reviews' topic distributions using two models: bag-of-words and
bag-of-opinion-phrases. The experiments were conducted on reviews from three
different datasets: Yelp (57K reviews), Trustpilot (9K reviews) and the Ott
dataset (800 reviews).
Comment: 6 pages, WWW 201
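To illustrate the idea of lifting word-level semantic similarity to the review level, and why it can catch paraphrased reviews that a vectorial measure misses, here is a small sketch. It is not the paper's method: the abstract does not say which word similarity measure or aggregation is used, so the code assumes a best-match average in both directions, and `WORD_SIM` is a hypothetical hand-made table standing in for a real lexical resource.

```python
import math
from collections import Counter

# Hypothetical word-pair similarities, for illustration only.
WORD_SIM = {
    frozenset(("great", "excellent")): 0.9,
    frozenset(("hotel", "inn")): 0.8,
    frozenset(("staff", "personnel")): 0.85,
}

def word_sim(a, b):
    """Similarity of two words: 1.0 if identical, else the table value or 0.0."""
    if a == b:
        return 1.0
    return WORD_SIM.get(frozenset((a, b)), 0.0)

def directed_sim(src, dst):
    """Average, over the words of src, of the best-matching word in dst."""
    return sum(max(word_sim(w, v) for v in dst) for w in src) / len(src)

def review_sim(r1, r2):
    """Symmetric review-level similarity built from word-level similarity."""
    t1, t2 = r1.lower().split(), r2.lower().split()
    if not t1 or not t2:
        return 0.0
    return 0.5 * (directed_sim(t1, t2) + directed_sim(t2, t1))

def cosine_bow(r1, r2):
    """Baseline vectorial measure: cosine similarity of bag-of-words counts."""
    c1, c2 = Counter(r1.lower().split()), Counter(r2.lower().split())
    dot = sum(c1[w] * c2[w] for w in c1)
    norm = (math.sqrt(sum(v * v for v in c1.values()))
            * math.sqrt(sum(v * v for v in c2.values())))
    return dot / norm if norm else 0.0

a = "great hotel friendly staff"
b = "excellent inn friendly personnel"
print(cosine_bow(a, b))  # only "friendly" is shared
print(review_sim(a, b))  # synonyms also contribute
```

The two toy reviews share a single word, so the bag-of-words cosine is low, while the word-similarity-based score is high because each word finds a close match in the other review.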
SRC Model to Identify Beguiling Reviews
Today, e-commerce sites give customers a large number of platforms on which they can express their views and opinions and post reviews of products online. Such user-contributed content is available to other customers and to manufacturers as a valuable source of information, and it is useful for making important business decisions. Although this information influences a customer's buying decision, quality control of this user-generated data is not guaranteed, because the review section is an open platform available to all. Anybody can write anything on the web, which may include reviews that are not genuine. As the popularity of e-commerce sites increases enormously, the quality of reviews is deteriorating day by day, which in turn affects customers’ buying decisions. This has become a serious social issue. For many years, email spam and web spam were the two main highlighted social issues. Nowadays, however, because of customers’ growing interest in online shopping and their dependence on online reviews, reviews have become a major target for review spammers, who mislead customers by writing fake reviews for target products. To the best of our knowledge, very little research has been reported on the reliability of online reviews. The first paper on review spam detection was published in 2007 by Nitin Jindal and Bing Liu. In the past few years, a variety of techniques has been proposed by researchers to deal with this problem. This paper introduces the Suspicious Review Classifier (SRC) model for identifying suspicious reviews, review spammers and their groups
Search Rank Fraud De-Anonymization in Online Systems
We introduce the fraud de-anonymization problem, which goes beyond fraud
detection, to unmask the human masterminds responsible for posting search rank
fraud in online systems. We collect and study search rank fraud data from
Upwork, and survey the capabilities and behaviors of 58 search rank fraudsters
recruited from 6 crowdsourcing sites. We propose Dolos, a fraud
de-anonymization system that leverages traits and behaviors extracted from
these studies, to attribute detected fraud to crowdsourcing site fraudsters,
thus to real identities and bank accounts. We introduce MCDense, a min-cut
dense component detection algorithm to uncover groups of user accounts
controlled by different fraudsters, and leverage stylometry and deep learning
to attribute them to crowdsourcing site profiles. Dolos correctly identified
the owners of 95% of fraudster-controlled communities, and uncovered fraudsters
who promoted as many as 97.5% of fraud apps we collected from Google Play. When
evaluated on 13,087 apps (820,760 reviews), which we monitored over more than 6
months, Dolos identified 1,056 apps with suspicious reviewer groups. We report
orthogonal evidence of their fraud, including fraud duplicates and fraud
re-posts.
Comment: The 29th ACM Conference on Hypertext and Social Media, July 201
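The abstract names MCDense as a min-cut dense component detection algorithm but does not describe it. One plausible reading, sketched below, splits a co-review graph at its global minimum cut until every remaining component is dense. Everything here is an assumption for illustration: the graph construction (accounts as nodes, edge weight as the number of apps both accounts reviewed), the density threshold `tau`, the `min_size` cutoff, and the use of the Stoer-Wagner algorithm for the minimum cut. It should not be taken as the authors' algorithm.

```python
def build_graph(edges):
    """Undirected weighted graph as {node: {neighbor: weight}}."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj

def min_cut(adj):
    """Stoer-Wagner global minimum cut. Returns (weight, nodes on one side)."""
    g = {u: dict(nb) for u, nb in adj.items()}   # working copy
    merged = {u: {u} for u in g}                 # original nodes behind each super-node
    best_w, best_side = float("inf"), set()
    while len(g) > 1:
        start = next(iter(g))
        order = [start]
        weights = {v: g[start].get(v, 0) for v in g if v != start}
        while weights:                           # add the most tightly connected node
            nxt = max(weights, key=weights.get)
            cut_of_phase = weights.pop(nxt)
            order.append(nxt)
            for v, w in g[nxt].items():
                if v in weights:
                    weights[v] += w
        s, t = order[-2], order[-1]
        if cut_of_phase < best_w:
            best_w, best_side = cut_of_phase, set(merged[t])
        merged[s] |= merged[t]                   # contract t into s
        for v, w in g[t].items():
            if v != s:
                g[s][v] = g[s].get(v, 0) + w
                g[v][s] = g[s][v]
            del g[v][t]
        del g[t]
    return best_w, best_side

def density(adj):
    """Fraction of possible edges that are present (edge weights ignored)."""
    n = len(adj)
    if n < 2:
        return 0.0
    edges = sum(len(nb) for nb in adj.values()) / 2
    return edges / (n * (n - 1) / 2)

def dense_components(adj, tau=0.8, min_size=3):
    """Recursively split at the minimum cut until each component has density >= tau."""
    if len(adj) < min_size:
        return []
    if density(adj) >= tau:
        return [set(adj)]
    _, side = min_cut(adj)
    result = []
    for part in (side, set(adj) - side):
        sub = {u: {v: w for v, w in adj[u].items() if v in part} for u in part}
        result += dense_components(sub, tau, min_size)
    return result

# Toy co-review graph: two 4-account cliques joined by a single weak edge.
a, b = ["a1", "a2", "a3", "a4"], ["b1", "b2", "b3", "b4"]
edges = [(u, v, 1) for grp in (a, b) for i, u in enumerate(grp) for v in grp[i + 1:]]
edges.append(("a1", "b1", 1))
graph = build_graph(edges)
groups = dense_components(graph)
print(sorted(sorted(grp) for grp in groups))
```

On the toy graph the whole component is too sparse, the minimum cut is the single bridging edge, and the two cliques are returned as separate candidate groups.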