114 research outputs found

    Leveraging Multi-level Dependency of Relational Sequences for Social Spammer Detection

    Much recent research has shed light on relation-dependent but content-independent frameworks for social spammer detection, largely because the relations among users are difficult to alter when spammers attempt to conceal their malicious intent. Our study investigates spammer detection in multi-relation social networks and attempts to fully exploit sequences of heterogeneous relations to improve detection accuracy. Specifically, we present the Multi-level Dependency Model (MDM). The MDM exploits the long-term dependency hidden in users' relational sequences along with the short-term dependency. Moreover, because short-term sequences are of multiple types, the MDM considers them at both the individual level and the union level. Experimental results on a real-world multi-relational social network demonstrate the effectiveness of the proposed MDM for multi-relational social spammer detection.

    Online Social Attacker Detection Based on Ego-Network Analysis

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2020. 2. 김종권.

    In the last decade we have witnessed the explosive growth of online social networking services (SNSs) such as Facebook, Twitter, Weibo and LinkedIn. While SNSs provide diverse benefits, for example fostering interpersonal relationships, community formation and news propagation, they have also attracted an uninvited nuisance. Spammers abuse SNSs as vehicles to spread spam rapidly and widely. Spam, meaning unsolicited or inappropriate messages, significantly impairs the credibility and reliability of services. Therefore, detecting spammers has become an urgent and critical issue in SNSs. This paper deals with spamming on Twitter and Weibo. Instead of spreading annoying messages to the public, a spammer follows (subscribes to) many normal users and aims to be followed back by them. Sometimes a spammer builds a link farm to raise a target account's follower count and explicit influence. Based on the assumption that the online relationships of spammers differ from those of normal users, I propose classification schemes that detect online social attackers, including spammers. I first focused on social relations within ego-networks and devised two kinds of features: structural features based on the Triad Significance Profile (TSP) and relational semantic features based on hierarchical homophily in an ego-network. Experiments on real Twitter and Weibo datasets demonstrated that the proposed approach is very practical. The proposed features are scalable because, instead of analyzing the whole network, they inspect user-centered ego-networks. My performance study showed that the proposed methods yield significantly better performance than the prior scheme in terms of true positives and false positives.

    Contents:
    1 Introduction
    2 Related Work
      2.1 OSN Spammer Detection Approaches
        2.1.1 Contents-based Approach
        2.1.2 Social Network-based Approach
        2.1.3 Subnetwork-based Approach
        2.1.4 Behavior-based Approach
      2.2 Link Spam Detection
      2.3 Data Mining Schemes for Spammer Detection
      2.4 Sybil Detection
    3 Triad Significance Profile Analysis
      3.1 Motivation
      3.2 Twitter Dataset
      3.3 Indegree and Outdegree of Dataset
      3.4 Twitter Spammer Detection with TSP
      3.5 TSP-Filtering
      3.6 Performance Evaluation of TSP-Filtering
    4 Hierarchical Homophily Analysis
      4.1 Motivation
      4.2 Hierarchical Homophily in OSN
        4.2.1 Basic Analysis of Datasets
        4.2.2 Status Gap Distribution and Assortativity
        4.2.3 Hierarchical Gap Distribution
      4.3 Performance Evaluation of HH-Filtering
    5 Overall Performance Evaluation
    6 Conclusion
    Bibliography

    Collective Multi-relational Network Mining

    Our world is becoming increasingly interconnected, and the study of networks and graphs is becoming more important than ever. Domains such as biological and pharmaceutical networks, online social networks, the World Wide Web, recommender systems, and scholarly networks are just a few examples that include explicit or implicit network structures. Most networks are formed between different types of nodes and contain different types of links. Leveraging these multi-relational and heterogeneous structures is an important factor in developing better models for these real-world networks. Another important aspect of developing models that make predictions about entities in network data, such as nodes or links, is the connections between those entities. These connections invalidate the i.i.d. assumptions that most traditional machine learning methods make about the data. Hence, unlike models for non-network data, where predictions about entities are made independently of each other, the inter-connectivity of entities in networks should cause information inferred about one entity to change the model's belief about other related entities. In this dissertation, I present models that can effectively leverage the multi-relational nature of networks and collectively make predictions on links and nodes. In both tasks, I empirically show the importance of considering the multi-relational characteristics and collective predictions. In the first part, I present models that make predictions on nodes by leveraging the graph structure and the link generation sequence, and by making collective predictions. I apply the node classification methods to detect social spammers in evolving multi-relational social networks and show their effectiveness in identifying spammers without using the textual content. In the second part, I present a generalized augmented multi-relational bi-typed network. I then propose a template for link inference models on these networks and show their application in pharmaceutical discoveries and recommender systems. In the third part, I show that my proposed collective link prediction model is an instance of a general graph-based prediction model that relies on a neighborhood graph for predictions. I then propose a framework that can dynamically adapt the neighborhood graph, based on the state of variables from intermediate inference results as well as on structural properties of the relations connecting them, to improve the predictive performance of the model.

    Follow spam detection based on Cascaded Social Information

    In the last decade we have witnessed the explosive growth of online social networking services (SNSs) such as Facebook, Twitter, RenRen and LinkedIn. While SNSs provide diverse benefits, for example fostering interpersonal relationships, community formation and news propagation, they have also attracted an uninvited nuisance. Spammers abuse SNSs as vehicles to spread spam rapidly and widely. Spam, meaning unsolicited or inappropriate messages, significantly impairs the credibility and reliability of services. Therefore, detecting spammers has become an urgent and critical issue in SNSs. This paper deals with follow spam on Twitter. Instead of spreading annoying messages to the public, a spammer follows (subscribes to) legitimate users in the hope of being followed back by them. Based on the assumption that the online relationships of spammers differ from those of legitimate users, we propose classification schemes that detect follow spammers. In particular, we focus on cascaded social relations and devise two schemes, TSP-Filtering and SS-Filtering, which utilize the Triad Significance Profile (TSP) and social status (SS), respectively, in a two-hop subnetwork centered at each user. We also propose an ensemble technique, Cascaded-Filtering, that combines both TSP and SS properties. Our experiments on real Twitter datasets demonstrated that the three proposed approaches are very practical. The proposed schemes are scalable because, instead of analyzing the whole network, they inspect user-centered two-hop social networks. Our performance study showed that the proposed methods yield significantly better performance than the prior scheme in terms of true positives and false positives.
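
    To make the structural feature named in the abstract above more concrete, here is a minimal sketch of computing a Triad Significance Profile for a user-centered two-hop subnetwork. It assumes the commonly used definition of a significance profile: z-scores of triad counts measured against degree-preserving randomized graphs, normalized to unit length. The function names, the simple edge-swap randomization, and the toy follow graph are assumptions made for illustration, not the authors' implementation.

```python
import itertools
import math
import random
from collections import Counter


def ego_edges(edges, center, hops=2):
    """Directed edges among nodes within `hops` steps of `center` (direction ignored for reach)."""
    edges = list(edges)
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    seen, frontier = {center}, {center}
    for _ in range(hops):
        frontier = {n for f in frontier for n in neighbours.get(f, ())} - seen
        seen |= frontier
    return {(u, v) for u, v in edges if u in seen and v in seen}


def triad_class(edge_set, a, b, c):
    """Canonical label of the directed subgraph on {a, b, c}: smallest edge-bit string over all orderings."""
    return min(
        "".join("1" if (x, y) in edge_set else "0" for x, y in itertools.permutations(p, 2))
        for p in itertools.permutations((a, b, c))
    )


def triad_census(edge_set):
    """Count weakly connected three-node subgraphs by isomorphism class (brute force, small graphs only)."""
    nodes = sorted({n for e in edge_set for n in e})
    census = Counter()
    for a, b, c in itertools.combinations(nodes, 3):
        linked_pairs = sum(
            (x, y) in edge_set or (y, x) in edge_set for x, y in ((a, b), (a, c), (b, c))
        )
        if linked_pairs >= 2:  # three nodes are weakly connected iff at least two pairs are linked
            census[triad_class(edge_set, a, b, c)] += 1
    return census


def rewire(edge_set, rng, swaps_per_edge=10):
    """Randomize by edge swaps (a->b, c->d) => (a->d, c->b); keeps every in- and out-degree."""
    edges = list(edge_set)
    present = set(edges)
    for _ in range(swaps_per_edge * len(edges)):
        i, j = rng.randrange(len(edges)), rng.randrange(len(edges))
        (a, b), (c, d) = edges[i], edges[j]
        if a == d or c == b or (a, d) in present or (c, b) in present:
            continue  # skip swaps that would create a self-loop or a duplicate edge
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
    return present


def triad_significance_profile(edge_set, n_random=50, seed=0):
    """Unit-length vector of z-scores: real triad counts versus counts in randomized graphs."""
    rng = random.Random(seed)
    real = triad_census(edge_set)
    randomized = [triad_census(rewire(edge_set, rng)) for _ in range(n_random)]
    z = {}
    for cls in sorted(set(real).union(*randomized)):
        counts = [r[cls] for r in randomized]
        mean = sum(counts) / n_random
        std = math.sqrt(sum((x - mean) ** 2 for x in counts) / n_random)
        z[cls] = (real[cls] - mean) / std if std > 0 else 0.0
    norm = math.sqrt(sum(v * v for v in z.values()))
    return {cls: (v / norm if norm > 0 else 0.0) for cls, v in z.items()}


if __name__ == "__main__":
    # Hypothetical follow edges (follower, followee); "u1" is the account under inspection.
    follows = [("u1", "u2"), ("u2", "u1"), ("u1", "u3"), ("u3", "u4"),
               ("u4", "u1"), ("u2", "u4"), ("u5", "u1"), ("u4", "u6")]
    profile = triad_significance_profile(ego_edges(follows, "u1"), n_random=30)
    for cls, score in profile.items():
        print(cls, round(score, 3))
```

    The resulting vector could serve as the input to an ordinary classifier. Because only the two-hop neighborhood is examined, the cost depends on the size of that neighborhood rather than on the whole network, which is the scalability argument the abstract makes. The brute-force census here is cubic in the number of neighborhood nodes and is meant only for small examples.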