1,158 research outputs found
Fake Review Detection using Data Mining
Online spam reviews are deceptive evaluations of products and services. They are often written as a deliberate manipulation strategy to deceive readers. Recognizing such reviews is an important but challenging problem. In this work, I try to solve this problem using different data mining techniques, and I explore the strengths and weaknesses of those techniques in detecting fake reviews. I start with supervised techniques such as Support Vector Machine (SVM), Multinomial Naive Bayes (MNB), and Multilayer Perceptron. The results show that all of these supervised techniques can detect fake reviews with more than 86% accuracy. Then, I work on a semi-supervised technique that reduces the dimensionality of the input feature vector but offers performance similar to existing approaches. I implement the semi-supervised technique with a combination of topic modeling and SVM. I also compare the results with other approaches that consider all the words of a dataset as input features. I find that topic words alone are enough as input features to reach accuracy similar to approaches that use all the words. Finally, I propose an unsupervised learning approach named Words Basket Analysis for fake review detection. I use five Amazon product review datasets for the experiments and report the performance of the proposed approach on these datasets.
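As an illustration of one of the supervised techniques named in the abstract above, the following is a minimal sketch of a Multinomial Naive Bayes text classifier written with the Python standard library only. It is not the author's implementation: the function names `train_mnb` and `predict_mnb`, the whitespace tokenization, the Laplace smoothing constant, and the four toy reviews are all assumptions made for the example, and the toy data is hand-written rather than taken from the Amazon datasets.

```python
import math
from collections import Counter, defaultdict


def train_mnb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model with Laplace smoothing (alpha)."""
    vocab = {w for d in docs for w in d.lower().split()}
    word_counts = defaultdict(Counter)  # label -> word -> count
    class_counts = Counter(labels)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.lower().split())
    model = {}
    for label in class_counts:
        total = sum(word_counts[label].values())
        denom = total + alpha * len(vocab)
        model[label] = {
            "prior": math.log(class_counts[label] / len(docs)),
            "loglik": {
                w: math.log((word_counts[label][w] + alpha) / denom)
                for w in vocab
            },
        }
    return model


def predict_mnb(model, doc):
    """Return the label with the highest posterior log-probability."""
    scores = {}
    for label, params in model.items():
        score = params["prior"]
        for w in doc.lower().split():
            if w in params["loglik"]:  # ignore out-of-vocabulary words
                score += params["loglik"][w]
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy reviews, written for this example only.
train_docs = [
    "best product ever amazing amazing buy now",
    "amazing deal best best buy now",
    "battery lasts two days screen scratches easily",
    "works fine but the strap broke after a month",
]
train_labels = ["fake", "fake", "genuine", "genuine"]

model = train_mnb(train_docs, train_labels)
print(predict_mnb(model, "amazing best buy"))
print(predict_mnb(model, "the strap broke"))
```

A real experiment would need a labeled corpus, a held-out test split, and a proper tokenizer; the sketch only shows how word counts per class turn into a prediction.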
The Social World of Content Abusers in Community Question Answering
Community-based question answering platforms can be rich sources of
information on a variety of specialized topics, from finance to cooking. The
usefulness of such platforms depends heavily on user contributions (questions
and answers), but also on respecting the community rules. As a crowd-sourced
service, such platforms rely on their users for monitoring and flagging content
that violates community rules.
Common wisdom is to eliminate the users who receive many flags. Our analysis
of a year of traces from a mature Q&A site shows that the number of flags does
not tell the full story: on one hand, users with many flags may still
contribute positively to the community. On the other hand, users who never get
flagged are found to violate community rules and get their accounts suspended.
This analysis, however, also shows that abusive users are betrayed by their
network properties: we find strong evidence of homophilous behavior and use
this finding to detect abusive users who go under the community radar. Based on
our empirical observations, we build a classifier that is able to detect
abusive users with an accuracy as high as 83%.
Comment: Published in the proceedings of the 24th International World Wide Web
Conference (WWW 2015)
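The homophily observation in the abstract above (abusive users tend to be connected to other abusive users) can be illustrated with a small sketch. Everything here is an assumption made for the example: the toy undirected graph, the set of users labeled abusive, and the helper names `abusive_neighbor_fraction` and `mean_fraction`. The paper's actual features and classifier are not described in the abstract and are not reproduced here.

```python
def abusive_neighbor_fraction(graph, abusive, user):
    """Fraction of a user's neighbors that are labeled abusive."""
    neighbors = graph.get(user, set())
    if not neighbors:
        return 0.0
    return len(neighbors & abusive) / len(neighbors)


def mean_fraction(graph, abusive, users):
    """Average abusive-neighbor fraction over a group of users."""
    users = list(users)
    total = sum(abusive_neighbor_fraction(graph, abusive, u) for u in users)
    return total / len(users)


# Hypothetical toy network: two triangles joined by the edge c-d.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}
abusive = {"a", "b", "c"}      # hypothetical labels
fair = set(graph) - abusive

abusive_mean = mean_fraction(graph, abusive, abusive)
fair_mean = mean_fraction(graph, abusive, fair)
print(abusive_mean, fair_mean)
```

In a network with homophily, the first average is clearly larger than the second, which is why a per-user feature of this kind could help flag users who have never been reported.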
A Method for Identifying Fake Follower Buyers Using Geographical Distance Information
Doctoral thesis, Seoul National University Graduate School, College of Engineering, Department of Computer Science and Engineering, February 2019. 김종권.
Reputation on social media such as Twitter, Facebook, and Instagram is now regarded as a person's power in the real world. A person who has more friends or followers can influence more individuals, so a user's influence is associated with the number of friends or followers. In response to the growing demand for social power, an underground market has emerged where a customer can buy fake followers. Those who purchase fake followers act vigorously in online social networks, so it is hard to distinguish a customer from a celebrity or cyberstar. Nevertheless, legitimate users have unique characteristics that customers and fake followers cannot manipulate, such as the small-world property. The small-world property is mainly characterized by the shortest path length and the clustering coefficient: in a small-world network, most people are linked by short chains. Existing work has largely focused on extracting relationship features such as indegree, outdegree, status, hub, or authority. Although this research explored relationship features to classify abnormal users of fake follower markets, the use of the small-world property to detect abnormal users has not been studied.
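The two quantities that the paragraph above associates with the small-world property, the clustering coefficient and the shortest path length, can be computed on a small graph as follows. This is a minimal sketch on a hypothetical undirected toy graph; the function names are assumptions for the example and the thesis's own measurement procedure is not given in the abstract.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(graph, node):
    """Share of a node's neighbor pairs that are themselves connected."""
    nbrs = graph[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in graph[u])
    return 2 * links / (k * (k - 1))


def average_shortest_path(graph):
    """Mean hop count over all ordered pairs of mutually reachable nodes."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Hypothetical toy graph: two triangles joined by the edge c-d.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}

avg_path = average_shortest_path(graph)
cc_a = clustering_coefficient(graph, "a")
cc_c = clustering_coefficient(graph, "c")
print(avg_path, cc_a, cc_c)
```

A small-world network combines a short average path with high clustering; on real social graphs these values would be compared against a random graph of the same size and density.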
In this work, we propose a model that adapts the small-world property. Specifically, we study the geographical distance of 1-hop directional links, using nodes' geographical locations, to verify whether a social graph has the small-world property. Motivated by the difference in distance ratio for 1-hop directional links, we propose a method that generates the 1-hop link distance ratio and classifies a node as a customer or not. Experimental results on a real-world Twitter dataset demonstrate that the proposed method achieves higher performance than existing models.
Chapter 1 Introduction
1.1 Motivations
1.2 Fake Follower Markets
1.3 Research Objectives
1.4 Contributions
1.5 Thesis Organization
Chapter 2 Related Work
2.1 Small World Phenomenon
2.2 Online Social Abusing Attack Detection
2.2.1 Contents-based Detection
2.2.2 Social Network-based Detection
2.2.3 Behavior-based Detection
Chapter 3 Characteristic of Customers and Fake Followers
3.1 Data Preparation
3.2 Fake Follower Properties
3.3 Customer Properties
Chapter 4 Social Relationship and Geographical Distance
4.1 Geographical Distance in OSNs
4.2 Follower Ratio
Chapter 5 Detecting Customers
5.1 Key Features for Customer Detection
5.2 Performance matrices
5.3 Experiments
5.4 Comparison with Baseline Method
5.5 Comparison with Feature-based Method
5.6 Impact of Balanced Dataset
5.7 Fake Follower Detection
Chapter 6 Future Work
6.1 The Absence of Location Information
6.2 Hybrid Detection Method with Link Ratio and Profile Information
Chapter 7 Conclusion
Bibliography
Abstract in Korean
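The thesis abstract above describes measuring the geographical distance of 1-hop directional links and deriving a distance ratio per node. The abstract does not define that ratio, so the sketch below uses a stand-in chosen for illustration: the share of a node's located followers that lie within a distance threshold. The function names, the 500 km threshold, the toy accounts, and the approximate city coordinates are all assumptions, not the thesis's definitions or data. Distances use the standard haversine great-circle formula.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def short_link_ratio(node, followers, location, threshold_km=500.0):
    """Hypothetical ratio: share of located 1-hop followers within threshold_km."""
    dists = [
        haversine_km(location[node], location[f])
        for f in followers[node]
        if f in location  # followers without a known location are skipped
    ]
    if not dists:
        return None
    return sum(d <= threshold_km for d in dists) / len(dists)


# Hypothetical accounts with approximate city coordinates (lat, lon).
location = {
    "user_seoul": (37.5665, 126.9780),
    "f_busan": (35.1796, 129.0756),
    "f_tokyo": (35.6762, 139.6503),
    "f_london": (51.5074, -0.1278),
}
followers = {"user_seoul": ["f_busan", "f_tokyo", "f_london", "f_unknown"]}

ratio = short_link_ratio("user_seoul", followers, location)
print(ratio)
```

Under this stand-in definition, an account whose followers are mostly far away would get a low ratio, which is the kind of signal the abstract suggests could separate customers of fake follower markets from legitimate users.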
Predictive Non-equilibrium Social Science
Non-Equilibrium Social Science (NESS) emphasizes dynamical phenomena, for
instance the way political movements emerge or competing organizations
interact. This paper argues that predictive analysis is an essential element of
NESS, occupying a central role in its scientific inquiry and representing a key
activity of practitioners in domains such as economics, public policy, and
national security. We begin by clarifying the distinction between models which
are useful for prediction and the much more common explanatory models studied
in the social sciences. We then investigate a challenging real-world predictive
analysis case study, and find evidence that the poor performance of standard
prediction methods does not indicate an absence of human predictability but
instead reflects (1.) incorrect assumptions concerning the predictive utility
of explanatory models, (2.) misunderstanding regarding which features of social
dynamics actually possess predictive power, and (3.) practical difficulties
exploiting predictive representations.
Comment: arXiv admin note: substantial text overlap with arXiv:1212.680