1,429 research outputs found

    Random Walk-Based Large-Scale Graph Mining Exploiting Real-World Graph Characteristics

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2020. 2. Kang U.

    Numerous real-world relationships are represented as graphs, such as social networks, hyperlink networks, and protein interaction networks. Analyzing those networks is important for understanding real-life phenomena. Among various graph analysis techniques, random walk has been widely used in many applications with satisfactory results. However, many real-world graphs are large and complicated, with diverse labels. Traditional random walk based methods incur heavy computational cost and disregard those labels when performing random walks; thus, their use has been limited in such large and complicated graphs. In this thesis, I handle the technical challenges of mining large real-world graphs based on random walk. Real-world graphs have distinct structural properties, which become a basis for improving the performance of random walk in terms of speed and quality. Based upon this idea, I develop fast, scalable, and exact methods for node ranking using random walk in large-scale plain networks. I also design accurate models using random walks for node ranking and relational reasoning in labeled graphs such as signed networks and knowledge bases. Through extensive experiments on various real-world graphs, I demonstrate the effectiveness of the methods and models proposed in this thesis. The proposed methods process 100 times larger graphs and require up to 130 times less memory with up to 9 times faster speed compared to other existing methods, successfully scaling to billion-scale graphs. Also, the proposed models substantially improve the predictive performance of a variety of tasks in labeled graphs such as signed networks and knowledge bases.

    Contents:
    Chapter 1 Overview: 1.1 Motivation; 1.2 Research Statement (1.2.1 Research Goals and Importance; 1.2.2 Technical Challenges; 1.2.3 Main Approaches; 1.2.4 Contributions; 1.2.5 Overall Impact); 1.3 Thesis Organization
    Chapter 2 Background: 2.1 Definitions (2.1.1 Notations on Graphs; 2.1.2 Random Walk with Restart); 2.2 Related Works (2.2.1 Previous Methods for RWR in Plain Graphs; 2.2.2 Ranking Models in Signed Networks; 2.2.3 Relational Reasoning Models in Edge-labeled Graphs)
    Chapter 3 Fast and Scalable Ranking in Large-scale Plain Graphs: 3.1 Introduction; 3.2 Preliminaries (3.2.1 Iterative Methods for RWR; 3.2.2 Preprocessing Methods for RWR); 3.3 Proposed Method (3.3.1 Overview; 3.3.2 BePI-B: Exploiting Graph Characteristics for Node Reordering and Block Elimination; 3.3.3 BePI-B: Incorporating an Iterative Method into Block Elimination; 3.3.4 BePI-S: Sparsifying the Schur Complement; 3.3.5 BePI: Preconditioning a Linear System for the Iterative Method); 3.4 Theoretical Results (3.4.1 Time Complexity; 3.4.2 Space Complexity; 3.4.3 Accuracy Bound; 3.4.4 Lemmas and Proofs); 3.5 Experiments (3.5.1 Experimental Settings; 3.5.2 Preprocessing Cost; 3.5.3 Query Cost; 3.5.4 Scalability; 3.5.5 Effects of Sparse Schur Complement and Preconditioning; 3.5.6 Effects of the Hub Selection Ratio; 3.5.7 Accuracy; 3.5.8 Comparison with the State-of-the-Art Method); 3.6 Summary
    Chapter 4 Personalized Ranking in Signed Graphs: 4.1 Introduction; 4.2 Problem Definition; 4.3 Proposed Method (4.3.1 Signed Random Walk with Restart Model; 4.3.2 SRWR-Iter: Iterative Algorithm for Signed Random Walk with Restart; 4.3.3 SRWR-Pre: Preprocessing Algorithm for Signed Random Walk with Restart); 4.4 Experiments (4.4.1 Experimental Settings; 4.4.2 Link Prediction Task; 4.4.3 User Preference Preservation Task; 4.4.4 Troll Identification Task; 4.4.5 Sign Prediction Task; 4.4.6 Effectiveness of Balance Attenuation Factors; 4.4.7 Performance of SRWR-Pre); 4.5 Summary
    Chapter 5 Relational Reasoning in Edge-labeled Graphs: 5.1 Introduction; 5.2 Preliminary; 5.3 Proposed Method (5.3.1 Label Transition Observation; 5.3.2 Learning Label Transition Probabilities; 5.3.3 Multi-Labeled Random Walk with Restart; 5.3.4 Formulation for MuRWR; 5.3.5 Algorithm for MuRWR); 5.4 Theoretical Results (5.4.1 Lemma for Solution of Label Transition Probabilities and Convexity; 5.4.2 Lemma for Recursive Equation of MuRWR Score Matrix; 5.4.3 Lemma for Spectral Radius in Convergence Theorem; 5.4.4 Lemma for Complexity Analysis); 5.5 Experiment (5.5.1 Experimental Settings; 5.5.2 Relation Inference Task; 5.5.3 Effects of Label Weights in MuRWR; 5.5.4 Effects of Restart Probability in MuRWR; 5.5.5 Convergence of MuRWR); 5.6 Summary
    Chapter 6 Future Works: 6.1 Fast and Accurate Pseudoinverse Computation; 6.2 Fast and Scalable Signed Network Generation; 6.3 Disk-based Algorithms for Random Walk
    Chapter 7 Conclusion
    References
    Appendix: A.1 Hub-and-Spoke Reordering Method; A.2 Time Complexity of Sparse Matrix Multiplication; A.3 Details of Preconditioned GMRES; A.4 Detailed Description of Evaluation Metrics (A.4.1 Link Prediction; A.4.2 Troll Identification); A.5 Discussion on Relative Trustworthiness of SRWR
    Abstract in Korean
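
    For readers unfamiliar with the random walk with restart (RWR) ranking that this thesis builds on, the following is a minimal sketch of the plain power-iteration form. It is not the thesis's BePI, SRWR, or MuRWR method; the function name `rwr`, the adjacency-dict input, the default restart probability of 0.15, and the choice to send the mass of dangling nodes back to the seed are all assumptions made for illustration.

```python
def rwr(adj, seed, restart=0.15, tol=1e-10, max_iter=1000):
    """Random walk with restart by power iteration (illustrative sketch).

    adj: dict mapping each node to a list of its out-neighbors; every
         neighbor must also appear as a key of adj.
    seed: the node the walker restarts from.
    Returns a dict of scores that sum to 1.
    """
    nodes = list(adj)
    score = {v: 0.0 for v in nodes}
    score[seed] = 1.0  # the walker starts at the seed

    for _ in range(max_iter):
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            out = adj[u]
            if not out:
                # Assumption: a node with no out-edges returns its mass to the seed.
                nxt[seed] += (1.0 - restart) * score[u]
                continue
            # With probability (1 - restart) the walker moves to a uniformly
            # chosen out-neighbor of u.
            share = (1.0 - restart) * score[u] / len(out)
            for v in out:
                nxt[v] += share
        # With probability `restart` the walker jumps back to the seed.
        nxt[seed] += restart

        diff = sum(abs(nxt[v] - score[v]) for v in nodes)
        score = nxt
        if diff < tol:
            break
    return score


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["a"], "c": ["a"], "d": ["a"]}
    for node, value in sorted(rwr(graph, "a").items()):
        print(node, round(value, 4))
```

    Each iteration touches every edge once, which is the per-query cost that the abstract describes as heavy for traditional random walk based methods on large graphs.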

    SybilBelief: A Semi-supervised Learning Approach for Structure-based Sybil Detection

    Sybil attacks are a fundamental threat to the security of distributed systems. Recently, there has been growing interest in leveraging social networks to mitigate Sybil attacks. However, the existing approaches suffer from one or more drawbacks, including bootstrapping from only known benign or only known Sybil nodes, failing to tolerate noise in their prior knowledge about known benign or Sybil nodes, and not being scalable. In this work, we aim to overcome these drawbacks. Towards this goal, we introduce SybilBelief, a semi-supervised learning framework, to detect Sybil nodes. SybilBelief takes as input a social network of the nodes in the system, a small set of known benign nodes, and, optionally, a small set of known Sybils. SybilBelief then propagates the label information from the known benign and/or Sybil nodes to the remaining nodes in the system. We evaluate SybilBelief using both synthetic and real-world social network topologies. We show that SybilBelief is able to accurately identify Sybil nodes with low false positive rates and low false negative rates. SybilBelief is resilient to noise in our prior knowledge about known benign and Sybil nodes. Moreover, SybilBelief performs orders of magnitude better than existing Sybil classification mechanisms and significantly better than existing Sybil ranking mechanisms. Comment: 12 page
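
    The input and output described in this abstract (a social graph, a few known benign nodes, optionally a few known Sybils, and labels propagated to the rest) can be illustrated with a deliberately simple sketch. The abstract does not state the propagation rule, so the code below substitutes plain neighbor averaging with clamped seed labels; it should not be read as SybilBelief's actual algorithm. The function name `propagate_labels`, the +1/-1 score convention, and the fixed iteration count are assumptions.

```python
def propagate_labels(edges, benign, sybil=(), iters=50):
    """Spread seed labels over an undirected graph by neighbor averaging.

    edges: iterable of (u, v) pairs of an undirected social graph.
    benign: nodes known to be benign (score clamped to +1).
    sybil: optional nodes known to be Sybil (score clamped to -1).
    Returns a dict of scores; positive leans benign, negative leans Sybil.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    # Seed labels stay fixed for the whole run.
    fixed = {v: 1.0 for v in benign}
    fixed.update({v: -1.0 for v in sybil})

    score = {v: fixed.get(v, 0.0) for v in neighbors}
    for _ in range(iters):
        nxt = {}
        for v, nbrs in neighbors.items():
            if v in fixed:
                nxt[v] = fixed[v]
            else:
                # An unlabeled node takes the mean score of its neighbors.
                nxt[v] = sum(score[n] for n in nbrs) / len(nbrs)
        score = nxt
    return score


if __name__ == "__main__":
    # Two triangles joined by a single edge between c and z.
    edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("x", "y"), ("y", "z"), ("x", "z"),
             ("c", "z")]
    for node, value in sorted(propagate_labels(edges, ["a"], ["x"]).items()):
        print(node, round(value, 3))
```

    In this toy graph the unlabeled nodes in the benign triangle end with positive scores and those in the Sybil triangle with negative scores, because only one edge crosses between the two regions.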

    Recommender Systems

    The ongoing rapid expansion of the Internet greatly increases the need for effective recommender systems to filter the abundant information. Extensive research on recommender systems is conducted by a broad range of communities, including social and computer scientists, physicists, and interdisciplinary researchers. Despite substantial theoretical and practical achievements, unification and comparison of different approaches are lacking, which impedes further advances. In this article, we review recent developments in recommender systems and discuss the major challenges. We compare and evaluate available algorithms and examine their roles in future developments. In addition to algorithms, physical aspects are described to illustrate the macroscopic behavior of recommender systems. Potential impacts and future directions are discussed. We emphasize that recommendation has great scientific depth and combines diverse research fields, which makes it of interest to physicists as well as interdisciplinary researchers. Comment: 97 pages, 20 figures (To appear in Physics Reports

    Trust-based recommendation systems: an axiomatic approach

    High-quality, personalized recommendations are a key feature in many online systems. Since these systems often have explicit knowledge of social network structures, the recommendations may incorporate this information. This paper focuses on networks which represent trust and on recommendations which incorporate trust relationships. The goal of a trust-based recommendation system is to generate personalized recommendations from known opinions and trust relationships. In analogy to prior work on voting and ranking systems, we use the axiomatic approach from the theory of social choice. We develop a natural set of five axioms that we would like any recommendation system to exhibit. Then we show that no system can simultaneously satisfy all these axioms. We also exhibit systems which satisfy any four of the five axioms. Next we consider ways of weakening the axioms, which can lead to a unique recommendation system based on random walks. We consider other recommendation systems (personal page rank, majority of majorities, and min cut) and search for alternative axiomatizations which uniquely characterize these systems. Finally, we determine which of these systems are incentive compatible. This is an important property for systems deployed in a monetized environment: groups of agents interested in manipulating recommendations to make others share their opinion have nothing to gain from lying about their votes or their trust links.