Search CORE

3,279 research outputs found

Estimating and Sampling Graphs with Multidimensional Random Walks

Author: Ribeiro Bruno
Towsley Don
Publication venue
Publication date: 01/01/2010
Field of study

Estimating characteristics of large graphs via sampling is a vital part of the study of complex networks. Current sampling methods such as (independent) random vertex and random walks are useful but have drawbacks. Random vertex sampling may require too many resources (time, bandwidth, or money). Random walks, which normally require fewer resources per sample, can suffer from large estimation errors in the presence of disconnected or loosely connected graphs. In this work we propose a new

m

-dimensional random walk that uses

m

dependent random walkers. We show that the proposed sampling method, which we call Frontier sampling, exhibits all of the nice sampling properties of a regular random walk. At the same time, our simulations over large real world graphs show that, in the presence of disconnected or loosely connected components, Frontier sampling exhibits lower estimation errors than regular random walks. We also show that Frontier sampling is more suitable than random vertex sampling to sample the tail of the degree distribution of the graph

arXiv.org e-Print Archive

CiteSeerX

On sampling nodes in a network

Author: CHIERICHETTI FLAVIO
Dasgupta Anirban
Kumar Ravi
Lattanzi Silvio
Sarlós Tamás
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016
Field of study

Random walk is an important tool in many graph mining applications including estimating graph parameters, sampling portions of the graph, and extracting dense communities. In this paper we consider the problem of sampling nodes from a large graph according to a prescribed distribution by using random walk as the basic primitive. Our goal is to obtain algorithms that make a small number of queries to the graph but output a node that is sampled according to the prescribed distribution. Focusing on the uniform distribution case, we study the query complexity of three algorithms and show a near-tight bound expressed in terms of the parameters of the graph such as average degree and the mixing time. Both theoretically and empirically, we show that some algorithms are preferable in practice than the others. We also extend our study to the problem of sampling nodes according to some polynomial function of their degrees; this has implications for designing efficient algorithms for applications such as triangle counting

IIT Gandhinagar

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Archivio della ricerca- Università di Roma La Sapienza

2.5K-Graphs: from Sampling to Generation

Author: Gjoka Minas
Kurant Maciej
Markopoulou Athina
Publication venue
Publication date: 17/08/2012
Field of study

Understanding network structure and having access to realistic graphs plays a central role in computer and social networks research. In this paper, we propose a complete, and practical methodology for generating graphs that resemble a real graph of interest. The metrics of the original topology we target to match are the joint degree distribution (JDD) and the degree-dependent average clustering coefficient (

\bar{c}(k)

). We start by developing efficient estimators for these two metrics based on a node sample collected via either independence sampling or random walks. Then, we process the output of the estimators to ensure that the target properties are realizable. Finally, we propose an efficient algorithm for generating topologies that have the exact target JDD and a

\bar{c}(k)

close to the target. Extensive simulations using real-life graphs show that the graphs generated by our methodology are similar to the original graph with respect to, not only the two target metrics, but also a wide range of other topological metrics; furthermore, our generator is order of magnitudes faster than state-of-the-art techniques

arXiv.org e-Print Archive

CiteSeerX

Crossref

eScholarship - University of California