17,333 research outputs found
Simplified Energy Landscape for Modularity Using Total Variation
Networks capture pairwise interactions between entities and are frequently
used in applications such as social networks, food networks, and protein
interaction networks, to name a few. Communities, cohesive groups of nodes,
often form in these applications, and identifying them gives insight into the
overall organization of the network. One common quality function used to
identify community structure is modularity. In Hu et al. [SIAM J. App. Math.,
73(6), 2013], it was shown that modularity optimization is equivalent to
minimizing a particular nonconvex total variation (TV) based functional over a
discrete domain. They solve this problem, assuming the number of communities is
known, using a Merriman, Bence, Osher (MBO) scheme.
We show that modularity optimization is equivalent to minimizing a convex
TV-based functional over a discrete domain, again, assuming the number of
communities is known. Furthermore, we show that modularity has no convex
relaxation satisfying certain natural conditions. We therefore, find a
manageable non-convex approximation using a Ginzburg Landau functional, which
provably converges to the correct energy in the limit of a certain parameter.
We then derive an MBO algorithm with fewer hand-tuned parameters than in Hu et
al. and which is 7 times faster at solving the associated diffusion equation
due to the fact that the underlying discretization is unconditionally stable.
Our numerical tests include a hyperspectral video whose associated graph has
2.9x10^7 edges, which is roughly 37 times larger than was handled in the paper
of Hu et al.Comment: 25 pages, 3 figures, 3 tables, submitted to SIAM J. App. Mat
Spectral Clustering with Imbalanced Data
Spectral clustering is sensitive to how graphs are constructed from data
particularly when proximal and imbalanced clusters are present. We show that
Ratio-Cut (RCut) or normalized cut (NCut) objectives are not tailored to
imbalanced data since they tend to emphasize cut sizes over cut values. We
propose a graph partitioning problem that seeks minimum cut partitions under
minimum size constraints on partitions to deal with imbalanced data. Our
approach parameterizes a family of graphs, by adaptively modulating node
degrees on a fixed node set, to yield a set of parameter dependent cuts
reflecting varying levels of imbalance. The solution to our problem is then
obtained by optimizing over these parameters. We present rigorous limit cut
analysis results to justify our approach. We demonstrate the superiority of our
method through unsupervised and semi-supervised experiments on synthetic and
real data sets.Comment: 24 pages, 7 figures. arXiv admin note: substantial text overlap with
arXiv:1302.513
Recent Advances in Graph Partitioning
We survey recent trends in practical algorithms for balanced graph
partitioning together with applications and future research directions
Clustering and Community Detection with Imbalanced Clusters
Spectral clustering methods which are frequently used in clustering and
community detection applications are sensitive to the specific graph
constructions particularly when imbalanced clusters are present. We show that
ratio cut (RCut) or normalized cut (NCut) objectives are not tailored to
imbalanced cluster sizes since they tend to emphasize cut sizes over cut
values. We propose a graph partitioning problem that seeks minimum cut
partitions under minimum size constraints on partitions to deal with imbalanced
cluster sizes. Our approach parameterizes a family of graphs by adaptively
modulating node degrees on a fixed node set, yielding a set of parameter
dependent cuts reflecting varying levels of imbalance. The solution to our
problem is then obtained by optimizing over these parameters. We present
rigorous limit cut analysis results to justify our approach and demonstrate the
superiority of our method through experiments on synthetic and real datasets
for data clustering, semi-supervised learning and community detection.Comment: Extended version of arXiv:1309.2303 with new applications. Accepted
to IEEE TSIP
- …