Search CORE

192 research outputs found

Fast matrix computations for pair-wise and column-wise commute times and Katz scores

Author: Andersen [Andersen et al. 06] Reid
Boldi [Boldi et al. 11] Paolo
Chung [Chung et al. 03] Fan
Davis [Davis and Rabinowitz 84] P. J.
Golub [Golub and Meurant 10] Gene H.
Lanczos [Lanczos 50] Cornelius
Lanczos [Lanczos 53] Cornelius
Liben-Nowell [Liben-Nowell and Kleinberg 03] David
McSherry [McSherry 05] Frank
Mihail [Mihail and Papadimitriou 02] Milena
Varga [Varga 62] R. S.
Publication venue: 'Informa UK Limited'
Publication date: 19/04/2011
Field of study

We first explore methods for approximating the commute time and Katz score between a pair of nodes. These methods are based on the approach of matrices, moments, and quadrature developed in the numerical linear algebra community. They rely on the Lanczos process and provide upper and lower bounds on an estimate of the pair-wise scores. We also explore methods to approximate the commute times and Katz scores from a node to all other nodes in the graph. Here, our approach for the commute times is based on a variation of the conjugate gradient algorithm, and it provides an estimate of all the diagonals of the inverse of a matrix. Our technique for the Katz scores is based on exploiting an empirical localization property of the Katz matrix. We adopt algorithms used for personalized PageRank computing to these Katz scores and theoretically show that this approach is convergent. We evaluate these methods on 17 real world graphs ranging in size from 1000 to 1,000,000 nodes. Our results show that our pair-wise commute time method and column-wise Katz algorithm both have attractive theoretical properties and empirical performance.Comment: 35 pages, journal version of http://dx.doi.org/10.1007/978-3-642-18009-5_13 which has been submitted for publication. Please see http://www.cs.purdue.edu/homes/dgleich/publications/2011/codes/fast-katz/ for supplemental code

arXiv.org e-Print Archive

Crossref

From random walks to distances on unweighted graphs

Author: Hashimoto Tatsunori B.
Jaakkola Tommi S.
Sun Yi
Publication venue
Publication date: 02/11/2015
Field of study

Large unweighted directed graphs are commonly used to capture relations between entities. A fundamental problem in the analysis of such networks is to properly define the similarity or dissimilarity between any two vertices. Despite the significance of this problem, statistical characterization of the proposed metrics has been limited. We introduce and develop a class of techniques for analyzing random walks on graphs using stochastic calculus. Using these techniques we generalize results on the degeneracy of hitting times and analyze a metric based on the Laplace transformed hitting time (LTHT). The metric serves as a natural, provably well-behaved alternative to the expected hitting time. We establish a general correspondence between hitting times of the Brownian motion and analogous hitting times on the graph. We show that the LTHT is consistent with respect to the underlying metric of a geometric graph, preserves clustering tendency, and remains robust against random addition of non-geometric edges. Tests on simulated and real-world data show that the LTHT matches theoretical predictions and outperforms alternatives.Comment: To appear in NIPS 201

arXiv.org e-Print Archive

DSpace@MIT

Random Walks: A Review of Algorithms and Applications

Author: Fu Yonghao
Kong Xiangjie
Liu Jiaying
Nie Hansong
Wan Liangtian
Xia Feng
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020
Field of study

A random walk is known as a random process which describes a path including a succession of random steps in the mathematical space. It has increasingly been popular in various disciplines such as mathematics and computer science. Furthermore, in quantum mechanics, quantum walks can be regarded as quantum analogues of classical random walks. Classical random walks and quantum walks can be used to calculate the proximity between nodes and extract the topology in the network. Various random walk related models can be applied in different fields, which is of great significance to downstream tasks such as link prediction, recommendation, computer vision, semi-supervised learning, and network embedding. In this paper, we aim to provide a comprehensive review of classical random walks and quantum walks. We first review the knowledge of classical random walks and quantum walks, including basic concepts and some typical algorithms. We also compare the algorithms based on quantum walks and classical random walks from the perspective of time complexity. Then we introduce their applications in the field of computer science. Finally we discuss the open issues from the perspectives of efficiency, main-memory volume, and computing time of existing algorithms. This study aims to contribute to this growing area of research by exploring random walks and quantum walks together.Comment: 13 pages, 4 figure

arXiv.org e-Print Archive

Federation ResearchOnline

Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Author: Popescu-Belis Andrei
Yazdani Majid
Publication venue: 'Elsevier BV'
Publication date: 19/12/2013
Field of study

We propose a method for computing semantic relatedness between words or texts by using knowledge from hypertext encyclopedias such as Wikipedia. A network of concepts is built by filtering the encyclopedia's articles, each concept corresponding to an article. Two types of weighted links between concepts are considered: one based on hyperlinks between the texts of the articles, and another one based on the lexical similarity between them. We propose and implement an efficient random walk algorithm that computes the distance between nodes, and then between sets of nodes, using the visiting probability from one (set of) node(s) to another. Moreover, to make the algorithm tractable, we propose and validate empirically two truncation methods, and then use an embedding space to learn an approximation of visiting probability. To evaluate the proposed distance, we apply our method to four important tasks in natural language processing: word similarity, document similarity, document clustering and classification, and ranking in information retrieval. The performance of the method is state-of-the-art or close to it for each task, thus demonstrating the generality of the knowledge resource. Moreover, using both hyperlinks and lexical similarity links improves the scores with respect to a method using only one of them, because hyperlinks bring additional real-world knowledge not captured by lexical similarity. (C) 2012 Elsevier B.V. All rights reserved

Infoscience - École polytechnique fédérale de Lausanne

Elsevier - Publisher Connector