248 research outputs found

    A Fast Computation Technique for Personalized PageRank on Large Graphs

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2020. 8. 이상구.

    Computation of Personalized PageRank (PPR) in graphs is an important function that is widely used in application domains such as search, recommendation, and knowledge discovery. Because computing PPR is expensive, a good number of innovative and efficient algorithms for it have been developed. However, efficient computation of PPR on very large graphs with millions of nodes or more is still an open problem. Moreover, previously proposed algorithms cannot handle updates efficiently, which severely limits their ability to handle dynamic graphs. In this paper, we present a fast converging algorithm that guarantees high and controlled precision. We improve the convergence rate of the traditional Power Iteration method by adopting successive over-relaxation and initial guess revision, a vector reuse strategy. The proposed method vastly improves on the traditional Power Iteration in terms of convergence rate and computation time, while retaining its simplicity and strictness. Since it can reuse previously computed vectors when refreshing PPR vectors, its update performance is also greatly enhanced. Also, since the algorithm halts as soon as it reaches a given error threshold, we can flexibly control the trade-off between accuracy and time, a feature lacking in both sampling-based approximation methods and fully exact methods. Experiments show that the proposed algorithm is at least 20 times faster than the Power Iteration and outperforms other state-of-the-art algorithms.

    Contents:
    1 Introduction
    2 Preliminaries: Personalized PageRank
      2.1 Random Walk, PageRank, and Personalized PageRank
        2.1.1 Basics on Random Walk
        2.1.2 PageRank
        2.1.3 Personalized PageRank
      2.2 Characteristics of Personalized PageRank
      2.3 Applications of Personalized PageRank
      2.4 Previous Work on Personalized PageRank Computation
        2.4.1 Basic Algorithms
        2.4.2 Enhanced Power Iteration
        2.4.3 Bookmark Coloring Algorithm
        2.4.4 Dynamic Programming
        2.4.5 Monte-Carlo Sampling
        2.4.6 Enhanced Direct Solving
      2.5 Summary
    3 Personalized PageRank Computation with Initial Guess Revision
      3.1 Initial Guess Revision and Relaxation
      3.2 Finding Optimal Weight of Successive Over Relaxation for PPR
      3.3 Initial Guess Construction Algorithm for Personalized PageRank
    4 Fully Personalized PageRank Algorithm with Initial Guess Revision
      4.1 FPPR with IGR
      4.2 Optimization
      4.3 Experiments
    5 Personalized PageRank Query Processing with Initial Guess Revision
      5.1 PPR Query Processing with IGR
      5.2 Optimization
      5.3 Experiments
    6 Conclusion
    Bibliography
    Appendix
    Abstract (In Korean)
    Docto
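    The thesis abstract above describes a power iteration that halts at a given error threshold and reuses previously computed vectors as the initial guess. The following is a minimal sketch of that general idea only, not the thesis's algorithm: it omits successive over-relaxation and the initial guess revision step, and the function name `ppr_power_iteration`, the restart probability `alpha`, the dangling-node rule, and the toy graph are all assumptions made for illustration.

```python
def ppr_power_iteration(out_links, seed, alpha=0.15, tol=1e-8, init=None, max_iter=1000):
    """Personalized PageRank by plain power iteration (illustrative sketch).

    out_links: dict mapping every node to a list of its successors.
    seed:      the node the walk restarts at.
    init:      optional starting vector, e.g. a previously computed PPR vector.
    Returns (ppr_vector, iterations_used).
    """
    nodes = list(out_links)
    x = dict(init) if init is not None else {v: (1.0 if v == seed else 0.0) for v in nodes}
    for iteration in range(1, max_iter + 1):
        # Restart mass goes to the seed.
        nxt = {v: 0.0 for v in nodes}
        nxt[seed] = alpha
        for u in nodes:
            successors = out_links[u]
            if successors:
                share = (1 - alpha) * x[u] / len(successors)
                for v in successors:
                    nxt[v] += share
            else:
                # Assumption: a dangling node returns its mass to the seed.
                nxt[seed] += (1 - alpha) * x[u]
        # L1 change between successive iterates is the stopping criterion.
        change = sum(abs(nxt[v] - x[v]) for v in nodes)
        x = nxt
        if change < tol:
            return x, iteration
    return x, max_iter


# Hypothetical toy graph; node "d" has no incoming edges.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["a"]}
cold, cold_iters = ppr_power_iteration(graph, "a")

# Small update: add the edge b -> a, then refresh from the old vector.
updated = {"a": ["b", "c"], "b": ["c", "a"], "c": ["a"], "d": ["a"]}
fresh, fresh_iters = ppr_power_iteration(updated, "a")
warm, warm_iters = ppr_power_iteration(updated, "a", init=cold)
print(cold_iters, fresh_iters, warm_iters)
```

    Loosening `tol` trades accuracy for fewer iterations, which is the accuracy/time control the abstract refers to. Passing the old vector as `init` gives the iteration a starting point that may already be close to the new solution; how much that helps depends on how large the graph update is.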

    Efficient Node Proximity and Node Significance Computations in Graphs

    abstract: Node proximity measures are commonly used to quantify how near, or otherwise related, two or more nodes in a graph are. Node significance measures are mainly used to determine how important nodes are in a graph. Both kinds of measure have been highly effective in many prediction tasks and applications. Despite their effectiveness, however, they have several shortcomings. One is scalability: their computation cost is high on large graphs. Another is low accuracy when a node's significance is unrelated to its degree in the graph. A third is reduced effectiveness when information about the graph is uncertain: for an uncertain graph, computing ranking scores over all possible worlds requires exponential computation cost. In this thesis, I first introduce Locality-sensitive, Re-use promoting, approximate Personalized PageRank (LR-PPR), an approximate personalized PageRank method that computes node rankings from the locality information of the seeds, without computing over the entire graph, and reuses precomputed locality information across different locality combinations. To identify locality information, I present Impact Neighborhood Indexing (INI), which finds impact neighborhoods by propagating nodes' fingerprints over the network. For the accuracy challenge, I introduce the Degree Decoupled PageRank (D2PR) technique to improve the effectiveness of PageRank-based knowledge discovery, especially considering the significance of neighbors and the degree of a given node. To tackle the uncertainty challenge, I introduce Uncertain Personalized PageRank (UPPR), which approximately computes personalized PageRank values under uncertainty about edge existence, and Interval Personalized PageRank with Integration (IPPR-I) and Interval Personalized PageRank with Mean (IPPR-M), which compute ranking scores when edge weights are uncertain and given as interval values.
    Dissertation/Thesis. Doctoral Dissertation, Computer Science, 201

    A Bipartite Graph-Based Recommender for Crowdfunding with Sparse Data

    Dealing with sparse data is a common problem for recommender systems, especially in crowdfunding recommendation. Collaborative filtering (CF) tends to recommend to a user only those items directly connected to similar users, and fails to recommend items that are only indirectly connected to them. CF therefore performs poorly on sparse data such as Kickstarter's. We propose a method that enables indirect crowdfunding campaign recommendation based on a bipartite graph. PersonalRank computes global similarity, as opposed to local similarity; for any node of the network, we apply PersonalRank iteratively to produce a recommendation list where CF fails. Furthermore, we propose a bipartite graph-based CF model that combines CF and PersonalRank. The new model classifies each node as one of two types: user nodes and campaign nodes. For nodes of either type, the global similarity between them is calculated by PersonalRank. Finally, a recommendation list is generated for any node through the CF algorithm. Experimental results show that the bipartite graph-based CF achieves better recommendation performance on the extremely sparse data from crowdfunding campaigns.
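    To make the idea of indirect recommendation on a bipartite graph concrete, here is a small sketch that treats PersonalRank as a random walk with restart over user and campaign nodes. It is an illustration under stated assumptions, not the paper's model: the helper names (`build_graph`, `personal_rank`, `recommend`), the restart probability of 0.15, and the toy backing data are all hypothetical, and the combination with CF described in the abstract is not reproduced.

```python
def build_graph(edges):
    """Build an undirected adjacency map from (user, campaign) pairs."""
    neighbors = {}
    for user, campaign in edges:
        neighbors.setdefault(user, set()).add(campaign)
        neighbors.setdefault(campaign, set()).add(user)
    return neighbors


def personal_rank(neighbors, source, restart=0.15, tol=1e-10, max_iter=10000):
    """Random walk with restart from `source`; returns a score per node."""
    score = {node: 0.0 for node in neighbors}
    score[source] = 1.0
    for _ in range(max_iter):
        nxt = {node: 0.0 for node in neighbors}
        nxt[source] = restart  # restart mass always returns to the source
        for node, adjacent in neighbors.items():
            share = (1 - restart) * score[node] / len(adjacent)
            for other in adjacent:
                nxt[other] += share
        change = sum(abs(nxt[n] - score[n]) for n in neighbors)
        score = nxt
        if change < tol:
            break
    return score


def recommend(edges, user, top_k=5):
    """Rank campaigns the user has not backed by their PersonalRank score."""
    neighbors = build_graph(edges)
    scores = personal_rank(neighbors, user)
    campaigns = {campaign for _, campaign in edges}
    candidates = [c for c in campaigns if c not in neighbors[user]]
    return sorted(candidates, key=lambda c: scores[c], reverse=True)[:top_k]


# Hypothetical sparse data: u1 shares a campaign only with u2,
# so c3 is reachable from u1 only through u2 -> c2 -> u3.
backings = [("u1", "c1"), ("u2", "c1"), ("u2", "c2"), ("u3", "c2"), ("u3", "c3")]
print(recommend(backings, "u1"))
```

    In this toy data, campaign `c3` has no backer in common with `u1`, so a neighbor-based CF step that only looks at users sharing a campaign with `u1` would not reach it, while the walk still assigns it a positive score through the two-hop path.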