871 research outputs found

Improving Random Walk Estimation Accuracy with Uniform Restarts

Author: A. Brauer
A. Sinclair
A.L. Barabási
C. Gkantsidis
H. Baumgartel
L. Lovász
M. Bressan
N. Bisnik
N. Litvak
S. Brin
Publication venue: HAL CCSD
Publication date: 01/01/2010

This work proposes and studies the properties of a hybrid sampling scheme that mixes independent uniform node sampling and random walk (RW)-based crawling. We show that our sampling method combines the strengths of both uniform and RW sampling while minimizing their drawbacks. In particular, our method increases the spectral gap of the random walk and hence accelerates convergence to the stationary distribution. The proposed method resembles PageRank but, unlike PageRank, preserves time-reversibility. Applying our hybrid RW to the problem of estimating degree distributions of graphs shows promising results.
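
The hybrid scheme can be illustrated with a minimal Python sketch. It assumes one specific construction that the abstract does not spell out: from a node of degree d, the walk jumps to a uniformly chosen node with probability alpha / (d + alpha) and otherwise moves to a uniformly chosen neighbor. Under that assumption the walk is reversible with stationary weight proportional to d + alpha, so each sample is reweighted by 1 / (d + alpha). The function names, the parameter alpha, and the toy graph are illustrative and do not come from the paper.

```python
import random

def hybrid_walk_step(graph, node, alpha, rng):
    """Assumed rule: uniform jump w.p. alpha/(deg+alpha), else a random neighbor."""
    deg = len(graph[node])
    if rng.random() < alpha / (deg + alpha):
        return rng.choice(list(graph))     # uniform restart over all nodes
    return rng.choice(graph[node])         # ordinary random-walk move

def estimate_degree_distribution(graph, alpha, steps, seed=0):
    """Estimate the fraction of nodes with each degree from one hybrid walk."""
    rng = random.Random(seed)
    node = rng.choice(list(graph))
    mass, total = {}, 0.0
    for _ in range(steps):
        node = hybrid_walk_step(graph, node, alpha, rng)
        deg = len(graph[node])
        w = 1.0 / (deg + alpha)            # undo the (deg + alpha) sampling bias
        mass[deg] = mass.get(deg, 0.0) + w
        total += w
    return {deg: m / total for deg, m in mass.items()}

# Hypothetical toy graph: a star with center 0 and leaves 1, 2, 3.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
estimate = estimate_degree_distribution(star, alpha=1.0, steps=100_000)
```

On the star graph the true distribution puts 3/4 of the nodes at degree 1 and 1/4 at degree 3, so the reweighted estimate should land close to those values for a long enough walk.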

Crossref

INRIA a CCSD electronic archive server

Improving Random Walk Estimation Accuracy with Uniform Restarts

Author: A. Brauer
A. Sinclair
A.L. Barabási
C. Gkantsidis
H. Baumgartel
L. Lovász
M. Bressan
N. Bisnik
N. Litvak
S. Brin
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

Crossref

Personalized PageRank with Node-dependent Restart

Author: F Fouss
K Avrachenkov
P Chen
PG Constantine
X Liu
Publication date: 01/01/2014

Personalized PageRank is an algorithm to classify the importance of web pages on a user-dependent basis. We introduce two generalizations of Personalized PageRank with node-dependent restart. The first generalization is based on the proportion of visits to nodes before the restart, whereas the second generalization is based on the probability of the node visited just before the restart. In the original case of constant restart probability, the two measures coincide. We discuss interesting particular cases of restart probabilities and restart distributions. We show that both generalizations of Personalized PageRank have an elegant expression connecting the so-called direct and reverse Personalized PageRanks that yields a symmetry property of these Personalized PageRanks.
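
The two measures can be made concrete with a small Python sketch under one reading of the abstract, which is an assumption here: at node i the walk restarts to the distribution v with probability restart[i] and otherwise follows the transition matrix P. The occupation-time measure is then taken as the stationary distribution of that chain, and the location measure as the long-run share of restarts triggered at each node. All names and the toy chain are hypothetical.

```python
def stationary(Q, iters=2000):
    """Power iteration for the stationary distribution of a row-stochastic matrix."""
    n = len(Q)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * Q[i][j] for i in range(n)) for j in range(n)]
    return pi

def ppr_with_node_restart(P, restart, v):
    """Return (occupation-time, location-before-restart) vectors."""
    n = len(P)
    # At node i: restart to v with prob restart[i], else follow P.
    Q = [[restart[i] * v[j] + (1 - restart[i]) * P[i][j] for j in range(n)]
         for i in range(n)]
    occupation = stationary(Q)
    # Share of restarts that are triggered while the walk sits at node i.
    norm = sum(occupation[i] * restart[i] for i in range(n))
    location = [occupation[i] * restart[i] / norm for i in range(n)]
    return occupation, location

# Hypothetical toy chain: random walk on the path 0 - 1 - 2, restarting at node 0.
P = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
v = [1.0, 0.0, 0.0]
occ_const, loc_const = ppr_with_node_restart(P, [0.15, 0.15, 0.15], v)
occ_var, loc_var = ppr_with_node_restart(P, [0.1, 0.5, 0.2], v)
```

With a constant restart probability the two vectors agree, matching the statement in the abstract; with node-dependent values they generally differ.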

arXiv.org e-Print Archive

Repository TU/e

Crossref

Pure OAI Repository

INRIA a CCSD electronic archive server

Sampling Online Social Networks via Heterogeneous Statistics

Author: Li Zhipeng
Ma Richard T. B.
Wang Xin
Xu Yinlong
Publication date: 18/12/2015

Most sampling techniques for online social networks (OSNs) are based on a particular sampling method on a single graph, which is referred to as a statistic. However, various sampling methods on different graphs could be used in the same OSN, and they may lead to different sampling efficiencies, i.e., asymptotic variances. To utilize multiple statistics for accurate measurements, we formulate a mixture sampling problem, through which we construct an unbiased mixture estimator that minimizes asymptotic variance. Given fixed sampling budgets for different statistics, we derive the optimal weights to combine the individual estimators; given a fixed total budget, we show that a greedy allocation towards the most efficient statistic is optimal. In practice, the sampling efficiencies of statistics can be quite different for various targets and are unknown before sampling. To solve this problem, we design a two-stage framework which adaptively spends a partial budget to test different statistics and allocates the remaining budget to the inferred best statistic. We show that our two-stage framework is a generalization of 1) randomly choosing a statistic and 2) evenly allocating the total budget among all available statistics, and our adaptive algorithm achieves higher efficiency than these benchmark strategies in theory and experiment.
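
As a rough illustration of combining individual estimators, the Python sketch below uses textbook inverse-variance weighting. It assumes the estimators are independent, unbiased, and have known variances; the paper's own derivation works with asymptotic variances and sampling budgets and may differ. The function names and numbers are hypothetical.

```python
def inverse_variance_weights(variances):
    """Weights proportional to 1/variance, normalized to sum to one."""
    inverse = [1.0 / s for s in variances]
    total = sum(inverse)
    return [x / total for x in inverse]

def combine(estimates, variances):
    """Return the weighted estimate and its variance (independence assumed)."""
    weights = inverse_variance_weights(variances)
    value = sum(w * e for w, e in zip(weights, estimates))
    variance = sum(w * w * s for w, s in zip(weights, variances))
    return value, variance

# Two hypothetical estimators of the same quantity, with variances 1 and 4.
weights = inverse_variance_weights([1.0, 4.0])
value, variance = combine([10.0, 20.0], [1.0, 4.0])
```

In this example the more precise estimator receives weight 0.8, and the combined variance (0.8) is lower than that of either estimator alone.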

arXiv.org e-Print Archive

Crossref

On sampling social networking services

Author: Wang Baiyang
Publication date: 20/02/2013

This article summarizes the existing methods for sampling social networking services and proposes a faster confidence interval for related sampling methods. It also includes comparisons of common network sampling techniques.

arXiv.org e-Print Archive

CiteSeerX

Bayesian Inference of Online Social Network Statistics via Lightweight Random Walk Crawls

Author: Avrachenkov Konstantin
Ribeiro Bruno
Sreedharan Jithin K.
Publication date: 01/10/2015

Online social networks (OSNs) contain an extensive amount of information about the underlying society that is yet to be explored. One of the most feasible techniques for fetching information from OSNs, crawling through Application Programming Interface (API) requests, raises serious concerns about the guarantees of the estimates. In this work, we focus on making reliable statistical inference with limited API crawls. Based on regenerative properties of random walks, we propose an unbiased estimator for the aggregated sum of functions over edges and prove the connection between the variance of the estimator and the spectral gap. In order to facilitate Bayesian inference on the true value of the estimator, we derive the approximate posterior distribution of the estimate. The proposed ideas are then validated with numerical experiments on inference problems in real-world networks.
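
The regenerative idea can be sketched in Python as follows. This is a simplified illustration and not the paper's estimator: it assumes a connected undirected graph, tours that start and end at a single fixed node, and a function f defined on directed edges. In a simple random walk each directed edge is traversed 1 / deg(start) times per tour in expectation, so scaling the average tour sum by deg(start) gives an unbiased estimate of the sum of f over all directed edges. Names and the toy graph are hypothetical.

```python
import random

def tour_estimate(graph, start, f, tours, seed=0):
    """Estimate the sum of f(u, v) over all directed edges from random-walk tours."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(tours):
        node = start
        while True:                        # one tour: walk until back at start
            nxt = rng.choice(graph[node])
            total += f(node, nxt)
            node = nxt
            if node == start:
                break
    return len(graph[start]) * total / tours

# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
toy = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
# With f = 1 the directed-edge sum is twice the number of undirected edges.
edge_count = tour_estimate(toy, start=2, f=lambda u, v: 1.0, tours=20_000) / 2
```

The toy graph has 4 undirected edges, so the estimate should be close to 4. Because tours are independent, their sample variance can be used directly to build confidence statements.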

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server