Off-diagonal low-rank preconditioner for difficult PageRank problems
The PageRank problem is the cornerstone of the Google search engine and is usually stated as solving a huge linear system. When the damping factor approaches 1, the spectral properties of this system deteriorate rapidly and it becomes difficult to solve. In this paper, we demonstrate that the coefficient matrix of this system can be transformed into a block form by partitioning its rows into special sets. In particular, the off-diagonal part of the block coefficient matrix can be compressed by a simple low-rank factorization, which is beneficial for solving the PageRank problem. Hence, a matrix partition method is proposed to discover the special sets of rows that support the low-rank factorization. A preconditioner based on this factorization is then proposed for solving difficult PageRank problems. Numerical experiments are presented to support the discussion and to illustrate the effectiveness of the proposed methods.
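For orientation, the linear-system formulation referred to above can be sketched as follows; the 4-node graph, damping factor, and teleportation vector are illustrative assumptions, not the paper's test problems, and no preconditioner is shown.

```python
import numpy as np

# Sketch: PageRank as the linear system (I - alpha * P^T) x = (1 - alpha) * v.
alpha = 0.99  # damping factor close to 1: the regime the paper calls difficult

# Adjacency matrix of a small hypothetical graph.
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
P = A / A.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
v = np.full(4, 0.25)                  # uniform teleportation vector

# Direct solve of the linear-system formulation.
x = np.linalg.solve(np.eye(4) - alpha * P.T, (1 - alpha) * v)

# Plain power iteration converges to the same vector, but slowly,
# since its convergence rate is governed by alpha.
y = v.copy()
for _ in range(5000):
    y = alpha * P.T @ y + (1 - alpha) * v
```

With alpha this close to 1 the iteration matrix has spectral radius near 1, which is exactly why preconditioning such systems pays off.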
Fast computation techniques for personalized PageRank on large graphs
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, August 2020. Sang-goo Lee.
Computation of Personalized PageRank (PPR) in graphs is an important function that is widely utilized in myriad application domains such as search, recommendation, and knowledge discovery. Because the computation of PPR is an expensive process, a good number of innovative and efficient algorithms for computing PPR have been developed. However, efficient computation of PPR in very large graphs with millions of nodes is still an open problem. Moreover, previously proposed algorithms cannot handle updates efficiently, severely limiting their ability to handle dynamic graphs. In this paper, we present a fast-converging algorithm that guarantees high and controlled precision. We improve the convergence rate of the traditional Power Iteration method by adopting successive over-relaxation and initial guess revision, a vector reuse strategy. The proposed method vastly improves on the traditional Power Iteration in terms of convergence rate and computation time, while retaining its simplicity and strictness. Since it can reuse previously computed vectors for refreshing PPR vectors, its update performance is also greatly enhanced. Also, since the algorithm halts as soon as it reaches a given error threshold, we can flexibly control the trade-off between accuracy and time, a feature lacking in both sampling-based approximation methods and fully exact methods. Experiments show that the proposed algorithm is at least 20 times faster than Power Iteration and outperforms other state-of-the-art algorithms.
1 Introduction
2 Preliminaries: Personalized PageRank
2.1 Random Walk, PageRank, and Personalized PageRank
2.1.1 Basics on Random Walk
2.1.2 PageRank
2.1.3 Personalized PageRank
2.2 Characteristics of Personalized PageRank
2.3 Applications of Personalized PageRank
2.4 Previous Work on Personalized PageRank Computation
2.4.1 Basic Algorithms
2.4.2 Enhanced Power Iteration
2.4.3 Bookmark Coloring Algorithm
2.4.4 Dynamic Programming
2.4.5 Monte-Carlo Sampling
2.4.6 Enhanced Direct Solving
2.5 Summary
3 Personalized PageRank Computation with Initial Guess Revision
3.1 Initial Guess Revision and Relaxation
3.2 Finding Optimal Weight of Successive Over Relaxation for PPR
3.3 Initial Guess Construction Algorithm for Personalized PageRank
4 Fully Personalized PageRank Algorithm with Initial Guess Revision
4.1 FPPR with IGR
4.2 Optimization
4.3 Experiments
5 Personalized PageRank Query Processing with Initial Guess Revision
5.1 PPR Query Processing with IGR
5.2 Optimization
5.3 Experiments
6 Conclusion
Bibliography
Appendix
Abstract (In Korean)
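The two ideas at the core of the abstract above, an SOR-style weighted update and reuse of a previously computed vector as the initial guess, can be sketched as follows; the graph, relaxation weight, and tolerances are illustrative assumptions, not the thesis's exact algorithm.

```python
import numpy as np

def ppr(P, v, alpha=0.85, w=1.1, x0=None, tol=1e-10, max_iter=10000):
    """Personalized PageRank via over-relaxed power iteration.

    w is an assumed SOR-style relaxation weight; x0 is an optional
    initial guess (e.g. a previously converged PPR vector).
    """
    x = v.copy() if x0 is None else x0.copy()
    for it in range(max_iter):
        x_new = alpha * P.T @ x + (1 - alpha) * v
        x_new = (1 - w) * x + w * x_new    # SOR-style mixing with weight w
        if np.abs(x_new - x).sum() < tol:  # halt at the error threshold
            return x_new, it + 1
        x = x_new
    return x, max_iter

# Small hypothetical undirected triangle graph.
A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
P = A / A.sum(axis=1, keepdims=True)
v = np.array([1.0, 0.0, 0.0])              # personalization on node 0

x_cold, iters_cold = ppr(P, v)
# Warm-starting from the converged vector needs far fewer iterations.
x_warm, iters_warm = ppr(P, v, x0=x_cold)
```

The halting test on the residual is what lets accuracy be traded against time: a looser tol returns earlier with a coarser vector.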
LINVIEW: Incremental View Maintenance for Complex Analytical Queries
Many analytics tasks and machine learning problems can be naturally expressed
by iterative linear algebra programs. In this paper, we study the incremental
view maintenance problem for such complex analytical queries. We develop a
framework, called LINVIEW, for capturing deltas of linear algebra programs and
understanding their computational cost. Linear algebra operations tend to cause
an avalanche effect where even very local changes to the input matrices spread
out and infect all of the intermediate results and the final view, causing
incremental view maintenance to lose its performance benefit over
re-evaluation. We develop techniques based on matrix factorizations to contain
such epidemics of change. As a consequence, our techniques make incremental
view maintenance of linear algebra practical and usually substantially cheaper
than re-evaluation. We show, both analytically and experimentally, the
usefulness of these techniques when applied to standard analytics tasks. Our
evaluation demonstrates the efficiency of LINVIEW in generating parallel
incremental programs that outperform re-evaluation techniques by more than an
order of magnitude.
Perron-based algorithms for the multilinear pagerank
We consider the multilinear pagerank problem studied in [Gleich, Lim and Yu,
Multilinear Pagerank, 2015], which is a system of quadratic equations with
stochasticity and nonnegativity constraints. We use the theory of quadratic
vector equations to prove several properties of its solutions and suggest new
numerical algorithms. In particular, we prove the existence of a certain
minimal solution, which does not always coincide with the stochastic one that
is required by the problem. We use an interpretation of the solution as a
Perron eigenvector to devise new fixed-point algorithms for its computation,
and pair them with a homotopy continuation strategy. The resulting numerical
method is more reliable than the existing alternatives, being able to solve a
larger number of problems.
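The basic fixed-point iteration underlying the multilinear PageRank problem, x = αR(x ⊗ x) + (1 − α)v, can be sketched as follows; the tensor R and the choice α = 0.45 (inside the α < 1/2 regime where the plain iteration is known to converge) are illustrative assumptions, and the paper's Perron-based and homotopy accelerations are not shown.

```python
import numpy as np

# Multilinear PageRank: solve x = alpha * R @ kron(x, x) + (1 - alpha) * v,
# where R is an n x n^2 column-stochastic flattening of a hypothetical
# third-order transition tensor.
rng = np.random.default_rng(1)
n = 3
R = rng.random((n, n * n))
R /= R.sum(axis=0, keepdims=True)   # make each column stochastic
v = np.full(n, 1.0 / n)             # stochastic teleportation vector
alpha = 0.45                        # alpha < 1/2: plain iteration contracts

x = v.copy()
for _ in range(1000):
    x = alpha * R @ np.kron(x, x) + (1 - alpha) * v
```

For α ≥ 1/2 this simple iteration can stall or converge to the wrong (non-stochastic) solution, which is the difficult regime the abstract's homotopy strategy targets.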
JGraphT -- A Java library for graph data structures and algorithms
Mathematical software and graph-theoretical algorithmic packages to
efficiently model, analyze and query graphs are crucial in an era where
large-scale spatial, societal and economic network data are abundantly
available. One such package is JGraphT, a programming library which contains
very efficient and generic graph data structures along with a large collection
of state-of-the-art algorithms. The library is written in Java with stability,
interoperability and performance in mind. A distinctive feature of this library
is the ability to model vertices and edges as arbitrary objects, thereby
permitting natural representations of many common networks including
transportation, social and biological networks. Besides classic graph
algorithms such as shortest-paths and spanning-tree algorithms, the library
contains numerous advanced algorithms: graph and subgraph isomorphism; matching
and flow problems; approximation algorithms for NP-hard problems such as
independent set and TSP; and several more exotic algorithms such as Berge graph
detection. Due to its versatility and generic design, JGraphT is currently used
in large-scale commercial, non-commercial and academic research projects. In
this work we describe in detail the design and underlying structure of the
library, and discuss its most important features and algorithms. A
computational study is conducted to evaluate the performance of JGraphT versus
a number of similar libraries. Experiments on a large number of graphs over a
variety of popular algorithms show that JGraphT is highly competitive with
other established libraries such as NetworkX or the BGL.
CLUDE: An Efficient Algorithm for LU Decomposition Over a Sequence of Evolving Graphs
In many applications, entities and their relationships are
represented by graphs. Examples include the WWW (web
pages and hyperlinks) and bibliographic networks (authors
and co-authorship). A graph can be conveniently modeled
by a matrix from which various quantitative measures are
derived. Some example measures include PageRank and
SALSA (which measure nodes' importance), and Personalized
PageRank and Random Walk with Restart (which measure
proximities between nodes). To compute these measures,
linear systems of the form Ax = b, where A is a matrix
that captures a graph's structure, need to be solved. To
facilitate solving the linear system, the matrix A is often decomposed
into two triangular matrices (L and U). In a dynamic
world, the graph that models it changes with time, and so
does the matrix A that represents the graph. We consider
a sequence of evolving graphs and its associated sequence of
evolving matrices. We study how LU decomposition should
be done over the sequence so that (1) the decomposition
is efficient and (2) the resulting LU matrices best preserve
the sparsity of the matrices A (i.e., the number of extra
non-zero entries introduced in L and U is minimized). We
propose a cluster-based algorithm CLUDE for solving the
problem. Through an experimental study, we show that
CLUDE is about an order of magnitude faster than the
traditional incremental update algorithm. The number of
extra non-zero entries introduced by CLUDE is also about
an order of magnitude fewer than that of the traditional
algorithm. CLUDE is thus an efficient algorithm for LU decomposition
that produces high-quality LU matrices over an
evolving matrix sequence.
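As background for the abstract above, a minimal LU factorization and its reuse across right-hand sides can be sketched as follows; this is plain Doolittle elimination without pivoting on an assumed diagonally dominant matrix, not the CLUDE algorithm, and no fill-in minimization is attempted.

```python
import numpy as np

def lu(A):
    """Doolittle LU without pivoting: returns L (unit lower triangular)
    and U (upper triangular) with A = L @ U."""
    n = A.shape[0]
    L, U = np.eye(n), A.astype(float).copy()
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]
            U[i, k:] -= L[i, k] * U[k, k:]
    return L, U

def solve_lu(L, U, b):
    """Solve Ax = b using a precomputed factorization: forward
    substitution with L, then back substitution with U."""
    n = L.shape[0]
    y = np.zeros(n)
    for i in range(n):
        y[i] = b[i] - L[i, :i] @ y[:i]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# Diagonally dominant example matrix; factor once, solve cheaply per b.
A = np.array([[4.0, 1, 0], [1, 5, 2], [0, 2, 6]])
L, U = lu(A)
x = solve_lu(L, U, np.array([1.0, 2.0, 3.0]))
```

The factorization is the expensive step; each additional right-hand side (or, in the evolving-graph setting, each refreshed measure vector) costs only the two triangular solves.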
Asynchronous iterative solution for dominant eigenvectors with applications in performance modelling and PageRank