1,158 research outputs found

    Fake Review Detection using Data Mining

    Online spam reviews are deceptive evaluations of products and services. They are often written as a deliberate manipulation strategy to deceive readers. Recognizing such reviews is an important but challenging problem. In this work, I address this problem using different data mining techniques and explore their strengths and weaknesses in detecting fake reviews. I start with supervised techniques such as Support Vector Machine (SVM), Multinomial Naive Bayes (MNB), and Multilayer Perceptron. The results show that all of these supervised techniques can detect fake reviews with more than 86% accuracy. Then, I work on a semi-supervised technique that reduces the dimensionality of the input feature vector but offers performance similar to existing approaches. I implement the semi-supervised technique by combining topic modeling with SVM. I also compare the results with other approaches that use all the words of a dataset as input features. I find that topic words alone are enough as input features to reach accuracy similar to those approaches. Finally, I propose an unsupervised learning approach named Words Basket Analysis for fake review detection. I use five Amazon product review datasets for the experiments and report the performance of the proposed approach on these datasets.
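    As a rough illustration of one of the supervised baselines named in the abstract above, the following is a minimal sketch of a Multinomial Naive Bayes text classifier with Laplace smoothing. It is not the author's implementation: the whitespace tokenizer, the function names `train_mnb` and `predict`, the smoothing value, and the four toy reviews are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real pipelines do more.
    return text.lower().split()


def train_mnb(docs, labels, alpha=1.0):
    """Fit a Multinomial Naive Bayes model with Laplace smoothing (alpha)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)

    model = {"vocab": vocab, "log_prior": {}, "log_lik": {}}
    for c, n_c in class_counts.items():
        total = sum(word_counts[c].values())
        denom = total + alpha * len(vocab)
        model["log_prior"][c] = math.log(n_c / len(docs))
        model["log_lik"][c] = {
            w: math.log((word_counts[c][w] + alpha) / denom) for w in vocab
        }
    return model


def predict(model, text):
    """Return the class with the highest log-posterior score."""
    scores = {}
    for c, prior in model["log_prior"].items():
        score = prior
        for w in tokenize(text):
            if w in model["vocab"]:  # out-of-vocabulary words are ignored
                score += model["log_lik"][c][w]
        scores[c] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, invented purely for illustration.
docs = [
    "best product ever buy now amazing amazing",
    "amazing deal buy now best best",
    "battery lasts two days screen is dim",
    "screen cracked after a week battery fine",
]
labels = ["fake", "fake", "genuine", "genuine"]
model = train_mnb(docs, labels)
print(predict(model, "amazing best buy"))
print(predict(model, "battery screen dim"))
```

    Working in log space avoids numeric underflow when many word probabilities are multiplied, which is why the sketch sums log-likelihoods rather than multiplying raw probabilities.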

    The Social World of Content Abusers in Community Question Answering

    Community-based question answering platforms can be rich sources of information on a variety of specialized topics, from finance to cooking. The usefulness of such platforms depends heavily on user contributions (questions and answers), but also on respect for the community rules. As crowd-sourced services, such platforms rely on their users to monitor and flag content that violates community rules. Common wisdom is to eliminate the users who receive many flags. Our analysis of a year of traces from a mature Q&A site shows that the number of flags does not tell the full story: on the one hand, users with many flags may still contribute positively to the community. On the other hand, users who never get flagged are found to violate community rules and get their accounts suspended. This analysis, however, also shows that abusive users are betrayed by their network properties: we find strong evidence of homophilous behavior and use this finding to detect abusive users who go under the community radar. Based on our empirical observations, we build a classifier that is able to detect abusive users with an accuracy as high as 83%. Comment: Published in the proceedings of the 24th International World Wide Web Conference (WWW 2015)

    A Method for Identifying Fake Follower Buyers Using Geographical Distance Information

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2019. 2. 김종권.
    Reputation on social media such as Twitter, Facebook, and Instagram is now regarded as a person's power in the real world. A person who has more friends or followers can influence more individuals, so a user's influence is associated with the number of friends or followers. In response to the demand for greater social power, an underground market has emerged where a customer can buy fake followers. Those who purchase fake followers are highly active in online social networks, so it is hard to distinguish a customer from a celebrity or cyberstar. Nevertheless, legitimate users have unique characteristics that customers or fake followers cannot manipulate, such as the small-world property. The small-world property is mainly characterized by the shortest path and the clustering coefficient. In a small-world network, most people are linked by short chains. Existing work has largely focused on extracting relationship features such as indegree, outdegree, status, hub, or authority. Although this research explored relationship features to classify abnormal users of fake follower markets, the use of the small-world property to detect abnormal users has not been studied. In this work, we propose a model that adapts the small-world property. Specifically, we study the geographical distance of 1-hop directional links, using the nodes' geographical locations, to verify whether a social graph has the small-world property. Motivated by the difference in distance ratio for 1-hop directional links, we propose a method that generates the 1-hop link distance ratio and classifies a node as a customer or not. Experimental results on a real-world Twitter dataset demonstrate that the proposed method achieves higher performance than existing models.
    Contents:
    Chapter 1 Introduction: 1.1 Motivations; 1.2 Fake Follower Markets; 1.3 Research Objectives; 1.4 Contributions; 1.5 Thesis Organization
    Chapter 2 Related Work: 2.1 Small World Phenomenon; 2.2 Online Social Abusing Attack Detection (2.2.1 Contents-based Detection; 2.2.2 Social Network-based Detection; 2.2.3 Behavior-based Detection)
    Chapter 3 Characteristic of Customers and Fake Followers: 3.1 Data Preparation; 3.2 Fake Follower Properties; 3.3 Customer Properties
    Chapter 4 Social Relationship and Geographical Distance: 4.1 Geographical Distance in OSNs; 4.2 Follower Ratio
    Chapter 5 Detecting Customers: 5.1 Key Features for Customer Detection; 5.2 Performance matrices; 5.3 Experiments; 5.4 Comparison with Baseline Method; 5.5 Comparison with Feature-based Method; 5.6 Impact of Balanced Dataset; 5.7 Fake Follower Detection
    Chapter 6 Future Work: 6.1 The Absence of Location Information; 6.2 Hybrid Detection Method with Link Ratio and Profile Information
    Chapter 7 Conclusion
    Bibliography
    Korean Abstract
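    The thesis abstract above characterizes the small-world property by shortest paths and the clustering coefficient. As a minimal sketch of those two quantities (not the thesis's method, which works with geographical distances on directed Twitter links), the code below computes the average local clustering coefficient and the average shortest-path length. It assumes an undirected, connected graph stored as a dict of neighbor sets; the function names and the four-node toy graph are hypothetical.

```python
from collections import deque
from itertools import combinations


def average_clustering(adj):
    """Mean local clustering coefficient over all nodes.

    Nodes with fewer than two neighbors contribute 0 by convention.
    """
    total = 0.0
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges that exist between pairs of this node's neighbors.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_shortest_path(adj):
    """Mean hop distance over all reachable ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(average_clustering(adj))
print(average_shortest_path(adj))
```

    A network is usually called small-world when its clustering is high relative to a random graph of the same size while its average path length stays short, so both numbers are typically compared against a random baseline rather than read in isolation.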

    Predictive Non-equilibrium Social Science

    Non-Equilibrium Social Science (NESS) emphasizes dynamical phenomena, for instance the way political movements emerge or competing organizations interact. This paper argues that predictive analysis is an essential element of NESS, occupying a central role in its scientific inquiry and representing a key activity of practitioners in domains such as economics, public policy, and national security. We begin by clarifying the distinction between models that are useful for prediction and the much more common explanatory models studied in the social sciences. We then investigate a challenging real-world predictive analysis case study, and find evidence that the poor performance of standard prediction methods does not indicate an absence of human predictability but instead reflects (1) incorrect assumptions concerning the predictive utility of explanatory models, (2) misunderstanding regarding which features of social dynamics actually possess predictive power, and (3) practical difficulties exploiting predictive representations. Comment: arXiv admin note: substantial text overlap with arXiv:1212.680