    The paradigm-shift of social spambots: Evidence, theories, and tools for the arms race

    Recent studies of social media spam and automation offer anecdotal arguments for the rise of a new generation of spambots, so-called social spambots. Here, for the first time, we extensively study this novel phenomenon on Twitter and provide quantitative evidence that a paradigm shift exists in spambot design. First, we measure Twitter's current capability to detect the new social spambots. Next, we assess human performance in discriminating between genuine accounts, social spambots, and traditional spambots. Then, we benchmark several state-of-the-art techniques proposed in the academic literature. Results show that neither Twitter, nor humans, nor cutting-edge applications are currently capable of accurately detecting the new social spambots. Our results call for new approaches capable of turning the tide in the fight against this rising phenomenon. We conclude by reviewing the latest literature on spambot detection and highlight an emerging common research trend based on the analysis of collective behaviors. Insights derived from both our extensive experimental campaign and our survey shed light on the most promising directions of research and lay the foundations for the arms race against the novel social spambots. Finally, to foster research on this novel phenomenon, we make all the datasets used in this study publicly available to the scientific community.
    Comment: To appear in Proc. 26th WWW, 2017, Companion Volume (Web Science Track, Perth, Australia, 3-7 April, 2017)

    ๊ฐœ์ธ ์‚ฌํšŒ๋ง ๋„คํŠธ์›Œํฌ ๋ถ„์„ ๊ธฐ๋ฐ˜ ์˜จ๋ผ์ธ ์‚ฌํšŒ ๊ณต๊ฒฉ์ž ํƒ์ง€

    Doctoral thesis, Seoul National University Graduate School, College of Engineering, Department of Computer Science and Engineering, February 2020. 김종권.
    In the last decade we have witnessed the explosive growth of online social networking services (SNSs) such as Facebook, Twitter, Weibo and LinkedIn. While SNSs provide diverse benefits, for example fostering interpersonal relationships, community formation, and news propagation, they have also attracted an unwelcome nuisance. Spammers abuse SNSs as vehicles to spread spam rapidly and widely. Spam, meaning unsolicited or inappropriate messages, significantly impairs the credibility and reliability of these services. Detecting spammers has therefore become an urgent and critical issue for SNSs. This thesis deals with spamming on Twitter and Weibo. Instead of spreading annoying messages to the public, a spammer of this kind follows (subscribes to) many normal users in order to be followed back by them. Sometimes a spammer builds a link farm to raise a target account's follower count and increase its explicit influence. Based on the assumption that the online relationships of spammers differ from those of normal users, I propose classification schemes that detect online social attackers, including spammers. I first focus on social relations within the ego-network and devise two kinds of features: structural features based on the Triad Significance Profile (TSP), and relational semantic features based on hierarchical homophily in an ego-network. Experiments on real Twitter and Weibo datasets demonstrate that the proposed approach is practical. The proposed features are scalable because they inspect user-centered ego-networks instead of analyzing the whole network. My performance study shows that the proposed methods perform significantly better than prior schemes in terms of true positives and false positives.
    Contents:
    1 Introduction
    2 Related Work
      2.1 OSN Spammer Detection Approaches
        2.1.1 Contents-based Approach
        2.1.2 Social Network-based Approach
        2.1.3 Subnetwork-based Approach
        2.1.4 Behavior-based Approach
      2.2 Link Spam Detection
      2.3 Data Mining Schemes for Spammer Detection
      2.4 Sybil Detection
    3 Triad Significance Profile Analysis
      3.1 Motivation
      3.2 Twitter Dataset
      3.3 Indegree and Outdegree of Dataset
      3.4 Twitter Spammer Detection with TSP
      3.5 TSP-Filtering
      3.6 Performance Evaluation of TSP-Filtering
    4 Hierarchical Homophily Analysis
      4.1 Motivation
      4.2 Hierarchical Homophily in OSN
        4.2.1 Basic Analysis of Datasets
        4.2.2 Status Gap Distribution and Assortativity
        4.2.3 Hierarchical Gap Distribution
      4.3 Performance Evaluation of HH-Filtering
    5 Overall Performance Evaluation
    6 Conclusion
    Bibliography
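The abstract above names the Triad Significance Profile (TSP) of an ego-network as a structural feature but does not spell out how it is computed. The following is a minimal sketch of a TSP in the general network-motif sense (z-scores of connected three-node subgraph counts against randomized graphs, normalized to unit length), not the thesis's actual implementation. All function names, the toy `follows` graph, and the randomization settings are assumptions made for illustration. The null model here preserves only in- and out-degrees, which is a simplification.

```python
import itertools
import math
import random

def ego_network(edges, ego):
    """Nodes adjacent to `ego` (either direction) plus `ego`, with induced edges."""
    nodes = {ego} | {v for u, v in edges if u == ego} | {u for u, v in edges if v == ego}
    return nodes, {(u, v) for u, v in edges if u in nodes and v in nodes}

def triad_code(edges, a, b, c):
    """Canonical id of the directed subgraph on (a, b, c): the smallest 6-bit
    adjacency code over all orderings, so isomorphic triads share one id."""
    best = None
    for x, y, z in itertools.permutations((a, b, c)):
        pairs = ((x, y), (y, x), (x, z), (z, x), (y, z), (z, y))
        code = sum(1 << i for i, p in enumerate(pairs) if p in edges)
        best = code if best is None else min(best, code)
    return best

def triad_counts(nodes, edges):
    """Count connected three-node subgraphs by isomorphism class."""
    def linked(u, v):
        return (u, v) in edges or (v, u) in edges
    counts = {}
    for a, b, c in itertools.combinations(sorted(nodes), 3):
        if linked(a, b) + linked(a, c) + linked(b, c) >= 2:  # weakly connected
            code = triad_code(edges, a, b, c)
            counts[code] = counts.get(code, 0) + 1
    return counts

def randomize(edges, rng, swaps_per_edge=10):
    """Degree-preserving null model: repeatedly swap the targets of two edges.
    Assumption: unlike stricter null models, mutual-edge counts are not preserved."""
    edge_list, edge_set = list(edges), set(edges)
    for _ in range(swaps_per_edge * len(edge_list)):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if a == d or c == b or (a, d) in edge_set or (c, b) in edge_set:
            continue  # would create a self-loop or a duplicate edge
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_set

def significance_profile(nodes, edges, n_random=50, seed=0):
    """Z-score of each triad class against the null model, scaled to unit length.
    Classes whose count never varies in the null model get a z-score of 0."""
    rng = random.Random(seed)
    real = triad_counts(nodes, edges)
    samples = [triad_counts(nodes, randomize(edges, rng)) for _ in range(n_random)]
    z = {}
    for code in sorted(set(real).union(*samples)):
        values = [s.get(code, 0) for s in samples]
        mean = sum(values) / len(values)
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        z[code] = (real.get(code, 0) - mean) / std if std > 0 else 0.0
    norm = math.sqrt(sum(v * v for v in z.values()))
    return {code: (v / norm if norm > 0 else 0.0) for code, v in z.items()}

# Hypothetical follow graph: (u, v) means "u follows v".
follows = {(0, 1), (1, 0), (0, 2), (2, 3), (3, 0), (1, 2), (4, 0), (0, 4), (4, 1), (5, 6)}
ego_nodes, ego_edges = ego_network(follows, ego=0)
profile = significance_profile(ego_nodes, ego_edges)
```

The resulting `profile` maps each triad class to a normalized score and could serve as a feature vector for a classifier. Because only the ego-network is inspected, the cost depends on the size of a user's neighborhood rather than on the whole graph, which is the scalability argument the abstract makes.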

    Probabilistic Matching: Causal Inference under Measurement Errors

    The abundance of data produced daily by a large variety of sources has increased the need for novel approaches to causal inference from observational data. Observational data often contain noisy or missing entries. Moreover, causal inference studies may require unobserved high-level information that must be inferred from other observed attributes. In such cases, inaccuracies in the applied inference methods result in noisy outputs. In this study, we propose a novel approach to causal inference when one or more key variables are noisy. Our method uses knowledge about the uncertainty of the real values of key variables to reduce the bias induced by noisy measurements. We evaluate our approach against existing methods on both simulated and real scenarios, and we demonstrate that it reduces the bias and avoids false causal conclusions in most cases.
    Comment: In Proceedings of International Joint Conference on Neural Networks (IJCNN) 201
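The bias that the abstract above attributes to noisy key variables can be illustrated with a small simulation. This sketch does not implement the paper's probabilistic matching method. It only shows the problem that method targets: exact matching on a mislabeled confounder leaves residual confounding. The data-generating process, the 20% label-flip rate, and all names are assumptions chosen for illustration.

```python
import random

def simulate(n, flip_prob, rng):
    """Rows of (z, z_obs, t, y): binary confounder z, its noisy measurement
    z_obs, binary treatment t, and outcome y with a true treatment effect of 1."""
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5
        t = rng.random() < (0.8 if z else 0.2)       # z drives treatment uptake
        y = 2.0 * z + 1.0 * t + rng.gauss(0.0, 1.0)  # z also drives the outcome
        z_obs = (not z) if rng.random() < flip_prob else z
        rows.append((z, z_obs, t, y))
    return rows

def stratified_effect(rows, covariate_index):
    """Exact matching on one binary covariate: treated-minus-control difference
    in mean outcome per stratum, averaged with stratum-size weights."""
    effect, used = 0.0, 0
    for level in (False, True):
        stratum = [r for r in rows if r[covariate_index] == level]
        treated = [r[3] for r in stratum if r[2]]
        control = [r[3] for r in stratum if not r[2]]
        if treated and control:
            diff = sum(treated) / len(treated) - sum(control) / len(control)
            effect += diff * len(stratum)
            used += len(stratum)
    return effect / used

rng = random.Random(0)
rows = simulate(20000, flip_prob=0.2, rng=rng)
clean = stratified_effect(rows, covariate_index=0)  # matching on the true confounder
noisy = stratified_effect(rows, covariate_index=1)  # matching on the noisy measurement
print(f"true effect 1.0, matched on true z: {clean:.2f}, matched on noisy z: {noisy:.2f}")
```

In this setup the estimate matched on the true confounder should land near 1, while the estimate matched on the noisy measurement is pushed upward, because each noisy stratum still mixes units with different true confounder values.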