Effective Resistance-based Germination of Seed Sets for Community Detection
Community detection is, at its core, an attempt to attach an interpretable
function to an otherwise indecipherable form. The importance of labeling
communities has obvious implications for identifying clusters in social
networks, but it has a number of equally relevant applications in product
recommendations, biological systems, and many forms of classification. The
local variety of community detection starts with a small set of labeled seed
nodes, and aims to estimate the community containing these nodes. One of the
most widely used methods, owing to its simplicity and efficiency, is
personalized PageRank. The most obvious bottleneck for deploying this form of
PageRank successfully is the quality of the seeds. We introduce a "germination"
stage for these seeds, where an effective resistance-based approach is used to
increase the quality and number of seeds from which a community is detected. By
breaking seed set expansion into a two-step process, we aim to utilize two
distinct random walk-based approaches in the regimes in which they excel. In
synthetic and real network data, a simple greedy algorithm that minimizes the
effective resistance diameter combined with PageRank achieves clear
improvements in precision and recall over a standalone PageRank procedure.
Comment: 10 pages, 4 figures, currently under review for conference submission
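
The two-step process described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the algorithm's details, so everything below is an assumption made for illustration: effective resistance is computed from the pseudoinverse of the graph Laplacian, germination greedily adds the node that keeps the resistance diameter of the seed set smallest, and personalized PageRank is run by plain power iteration. The function names, the `target_size` and `alpha` parameters, and the toy graph (two 5-cliques joined by one edge) are hypothetical and are not taken from the paper.

```python
import numpy as np

def effective_resistance_matrix(adj):
    """All-pairs effective resistance via the Laplacian pseudoinverse."""
    laplacian = np.diag(adj.sum(axis=1)) - adj
    l_pinv = np.linalg.pinv(laplacian)
    diag = np.diag(l_pinv)
    # R[u, v] = L+[u, u] + L+[v, v] - 2 * L+[u, v]
    return diag[:, None] + diag[None, :] - 2.0 * l_pinv

def germinate(resistance, seeds, target_size):
    """Grow the seed set greedily (assumed rule, not the paper's exact one).

    Each step adds the node whose largest resistance to the current set is
    smallest, which also minimizes the resistance diameter of the new set.
    """
    grown = list(seeds)
    n = resistance.shape[0]
    while len(grown) < target_size:
        candidates = [v for v in range(n) if v not in grown]
        if not candidates:
            break
        best = min(candidates, key=lambda v: resistance[v, grown].max())
        grown.append(best)
    return grown

def personalized_pagerank(adj, seeds, alpha=0.15, iters=200):
    """Power iteration with restart to the seeds (assumes no isolated nodes)."""
    n = adj.shape[0]
    transition = adj / adj.sum(axis=1, keepdims=True)  # row-stochastic
    restart = np.zeros(n)
    restart[list(seeds)] = 1.0 / len(seeds)
    scores = restart.copy()
    for _ in range(iters):
        scores = alpha * restart + (1.0 - alpha) * (transition.T @ scores)
    return scores

def two_cliques(k=5):
    """Toy graph: two k-cliques joined by a single bridge edge."""
    n = 2 * k
    adj = np.zeros((n, n))
    adj[:k, :k] = 1.0
    adj[k:, k:] = 1.0
    np.fill_diagonal(adj, 0.0)
    adj[k - 1, k] = adj[k, k - 1] = 1.0
    return adj

# Step 1: germinate a single seed into a larger seed set.
adj = two_cliques()
resistance = effective_resistance_matrix(adj)
grown = germinate(resistance, seeds=[0], target_size=3)

# Step 2: run personalized PageRank from the germinated seeds and
# take the highest-scoring nodes as the estimated community.
scores = personalized_pagerank(adj, grown)
community = sorted(int(v) for v in np.argsort(-scores)[:5])
print(grown, community)
```

On this toy graph, nodes inside the seed's clique are much closer in effective resistance than nodes across the bridge, so the germinated seeds stay within the clique before PageRank is applied. The dense pseudoinverse is only practical for small graphs; a larger graph would need an approximate resistance computation, which this sketch does not attempt.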