Beyond Triangles: A Distributed Framework for Estimating 3-profiles of Large Graphs
We study the problem of approximating the 3-profile of a large graph.
3-profiles are generalizations of triangle counts that specify the number of
times a small graph appears as an induced subgraph of a large graph. Our
algorithm uses the novel concept of 3-profile sparsifiers: sparse graphs that
can be used to approximate the full 3-profile counts for a given large graph.
Further, we study the problem of estimating local and ego 3-profiles, two
graph quantities that characterize the local neighborhood of each vertex of a
graph.
Our algorithm is distributed and operates as a vertex program over the
GraphLab PowerGraph framework. We introduce the concept of edge pivoting, which
allows us to collect 2-hop information without maintaining an explicit
2-hop neighborhood list at each vertex. This enables the computation of all
the local 3-profiles in parallel with minimal communication.
We test our implementation in several experiments scaling up to cores
on Amazon EC2. We find that our algorithm can estimate the 3-profile of a
graph in approximately the same time as triangle counting. For the harder
problem of ego 3-profiles, we introduce an algorithm that can estimate
profiles of hundreds of thousands of vertices in parallel, in the timescale of
minutes.
Comment: To appear in part at KDD'1
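To make the definition of a 3-profile concrete, here is a minimal brute-force sketch. It is not the distributed algorithm described in the abstract, and the function name `three_profile` and the edge-list input format are assumptions made for illustration. It classifies every vertex triple by the number of edges it induces (0, 1, 2, or 3), which only scales to small graphs.

```python
from itertools import combinations


def three_profile(n, edges):
    """Return [n0, n1, n2, n3], where nk is the number of vertex triples
    whose induced subgraph has exactly k edges (n3 is the triangle count).

    Hypothetical helper for illustration: vertices are 0..n-1 and `edges`
    is an iterable of undirected (u, v) pairs.
    """
    edge_set = {frozenset(e) for e in edges}
    counts = [0, 0, 0, 0]
    # Exhaustive O(n^3) enumeration of all triples.
    for a, b, c in combinations(range(n), 3):
        k = (
            (frozenset((a, b)) in edge_set)
            + (frozenset((a, c)) in edge_set)
            + (frozenset((b, c)) in edge_set)
        )
        counts[k] += 1
    return counts


# A triangle on {0, 1, 2} with a pendant edge (2, 3).
example_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
profile = three_profile(4, example_edges)
print(profile)
```

The four counts always sum to the number of vertex triples, n(n-1)(n-2)/6, which gives a quick sanity check on any estimate of the profile.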
Parameter estimators of random intersection graphs with thinned communities
This paper studies a statistical network model generated by a large number of
randomly sized overlapping communities, where any pair of nodes sharing a
community is linked with probability q via the community. In the special case
with q = 1 the model reduces to a random intersection graph, which is known to
generate high levels of transitivity also in the sparse context. The parameter
q adds a degree of freedom and leads to a parsimonious and analytically
tractable network model with tunable density, transitivity, and degree
fluctuations. We prove that the parameters of this model can be consistently
estimated in the large and sparse limiting regime using moment estimators based
on partially observed densities of links, 2-stars, and triangles.
Comment: 15 page
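The abstract does not give the estimator formulas, so the following is only a sketch of the raw statistics such moment estimators would start from: the numbers of links, 2-stars, and triangles in an observed graph, plus the standard empirical transitivity ratio. The function names and the edge-list input are assumptions for illustration, and 2-stars are counted here as all pairs of edges sharing a vertex, including those inside triangles.

```python
from itertools import combinations


def subgraph_counts(n, edges):
    """Count links, 2-stars, and triangles in an undirected simple graph.

    Hypothetical helper for illustration: vertices are 0..n-1 and `edges`
    is an iterable of (u, v) pairs.
    """
    nbrs = [set() for _ in range(n)]
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    links = sum(len(s) for s in nbrs) // 2
    # Each vertex of degree d is the centre of d*(d-1)/2 two-stars.
    two_stars = sum(len(s) * (len(s) - 1) // 2 for s in nbrs)
    triangles = sum(
        1
        for a, b, c in combinations(range(n), 3)
        if b in nbrs[a] and c in nbrs[a] and c in nbrs[b]
    )
    return links, two_stars, triangles


def transitivity(n, edges):
    """Empirical transitivity: 3 * triangles / 2-stars (0.0 if no 2-stars)."""
    _, two_stars, triangles = subgraph_counts(n, edges)
    return 3 * triangles / two_stars if two_stars else 0.0


# A triangle on {0, 1, 2} with a pendant edge (2, 3).
sample_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
print(subgraph_counts(4, sample_edges), transitivity(4, sample_edges))
```

Turning these counts into estimates of the model parameters requires the moment equations from the paper itself, which are not reproduced here.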