Sublinear algorithms for local graph centrality estimation
We study the complexity of local graph centrality estimation, with the goal
of approximating the centrality score of a given target node while exploring
only a sublinear number of nodes/arcs of the graph and performing a sublinear
number of elementary operations. We develop a technique, which we apply to the
PageRank and Heat Kernel centralities, for building a low-variance score
estimator through a local exploration of the graph. We obtain an algorithm
that, given any node in any graph of arcs, with probability
computes a multiplicative -approximation of its score by
examining only nodes/arcs, where and are respectively the maximum and
average outdegree of the graph (omitting for readability
and
factors). A similar bound holds for computational complexity. We also prove a
lower bound of for both query complexity and computational complexity. Moreover,
our technique yields a query complexity algorithm for the
graph access model of [Brautbar et al., 2010], widely used in social network
mining; we show this algorithm is optimal up to a sublogarithmic factor. These
are the first algorithms yielding worst-case sublinear bounds for general
directed graphs and any choice of the target node.
Comment: 29 pages, 1 figure
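For orientation, the score that the abstract above proposes to approximate locally can be computed exactly by a non-local baseline that reads every arc. The sketch below is that baseline (power-iteration PageRank), not the paper's sublinear estimator; the function name `pagerank`, the damping value `alpha=0.85`, the uniform handling of dangling nodes, and the toy graph are all assumptions made for illustration.

```python
def pagerank(out_neighbors, alpha=0.85, iterations=100):
    """Power-iteration PageRank on a directed graph.

    out_neighbors maps each node to the list of nodes it points to;
    every target is assumed to appear as a key as well.
    Each iteration reads every arc, so the cost is linear in the
    graph size per iteration (the opposite of a local algorithm).
    """
    nodes = list(out_neighbors)
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Teleportation mass, shared equally by all nodes.
        nxt = {v: (1.0 - alpha) / n for v in nodes}
        for u in nodes:
            targets = out_neighbors[u]
            if targets:
                share = alpha * score[u] / len(targets)
                for v in targets:
                    nxt[v] += share
            else:
                # Assumption: a dangling node spreads its mass uniformly.
                share = alpha * score[u] / n
                for v in nodes:
                    nxt[v] += share
        score = nxt
    return score


# Hypothetical toy graph: a -> b, b -> c, c -> a and c -> b.
graph = {"a": ["b"], "b": ["c"], "c": ["a", "b"]}
scores = pagerank(graph)
cycle_scores = pagerank({0: [1], 1: [2], 2: [0]})
```

A local estimator of the kind described in the abstract would aim to return a good approximation of `scores[target]` for a single target while touching only a small part of `graph`.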
Local Hypergraph Clustering using Capacity Releasing Diffusion
Local graph clustering is an important machine learning task that aims to
find a well-connected cluster near a set of seed nodes. Recent results have
revealed that incorporating higher order information significantly enhances the
results of graph clustering techniques. The majority of existing research in
this area focuses on spectral graph theory-based techniques. However, an
alternative perspective on local graph clustering arises from using max-flow
and min-cut on the objectives, which offer distinctly different guarantees. For
instance, a new method called capacity releasing diffusion (CRD) was recently
proposed and shown to preserve local structure around the seeds better than
spectral methods. The method was also the first local clustering technique that
is not subject to the quadratic Cheeger inequality by assuming a good cluster
near the seed nodes. In this paper, we propose a local hypergraph clustering
technique called hypergraph CRD (HG-CRD) by extending the CRD process to
cluster based on higher order patterns, encoded as hyperedges of a hypergraph.
Moreover, we theoretically show that HG-CRD gives results about a quantity
called motif conductance, rather than a biased version used in previous
experiments. Experimental results on synthetic datasets and real-world graphs
show that HG-CRD enhances the clustering quality.
Comment: 18 pages, 6 figures
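As a point of reference for the cluster-quality measure mentioned in the abstract above, the sketch below computes ordinary edge-based conductance of a node set in an undirected graph: the number of edges leaving the set divided by the smaller of the two side volumes. It does not implement motif conductance or HG-CRD; the function name `conductance` and the toy graph are assumptions made for illustration.

```python
def conductance(adjacency, cluster):
    """Edge conductance of `cluster` in an undirected graph.

    adjacency maps each node to the list of its neighbours, with each
    undirected edge listed in both directions.
    """
    cluster = set(cluster)
    cut = 0       # edges with exactly one endpoint in the cluster
    vol_in = 0    # sum of degrees inside the cluster
    vol_out = 0   # sum of degrees outside the cluster
    for u, neighbours in adjacency.items():
        if u in cluster:
            vol_in += len(neighbours)
            cut += sum(1 for v in neighbours if v not in cluster)
        else:
            vol_out += len(neighbours)
    smaller_volume = min(vol_in, vol_out)
    if smaller_volume == 0:
        raise ValueError("conductance is undefined when one side has no edges")
    return cut / smaller_volume


# Hypothetical toy graph: two triangles joined by the single edge 2-3.
two_triangles = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
phi = conductance(two_triangles, {0, 1, 2})
```

Here one edge crosses the cut and each side has volume 7, so `phi` equals 1/7; lower values indicate a better-separated cluster.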