Distance-generalized Core Decomposition
The k-core of a graph is defined as the maximal subgraph in which every
vertex is connected to at least k other vertices within that subgraph. In
this work we introduce a distance-based generalization of the notion of
k-core, which we refer to as the (k,h)-core, i.e., the maximal subgraph in
which every vertex has at least k other vertices at distance at most h within
that subgraph. We study the properties of the (k,h)-core, showing that it
preserves many of the nice features of the classic core decomposition (e.g.,
its connection with the notion of distance-generalized chromatic number) and
that it remains useful to speed up or approximate distance-generalized
notions of dense structures, such as the h-club.
Computing the distance-generalized core decomposition over large networks is
intrinsically complex. However, by exploiting clever upper and lower bounds we
can partition the computation into a set of totally independent subcomputations,
opening the door to top-down exploration and to multithreading, and thus
achieving an efficient algorithm.
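
The definition in the abstract above can be illustrated with a naive peeling sketch. This is not the bound-based, parallel algorithm the abstract refers to. It only shows what a distance-generalized core is, assuming "at distance at most h" and an undirected graph given as an adjacency dictionary. The names `h_neighbors` and `kh_core` are hypothetical and chosen here for illustration.

```python
from collections import deque


def h_neighbors(adj, alive, source, h):
    """Vertices of `alive` within distance h of `source`, using only paths inside `alive`."""
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        v, d = frontier.popleft()
        if d == h:
            continue  # do not expand beyond distance h
        for w in adj[v]:
            if w in alive and w not in seen:
                seen.add(w)
                frontier.append((w, d + 1))
    seen.discard(source)
    return seen


def kh_core(adj, k, h):
    """Naive peeling: repeatedly drop vertices with fewer than k vertices within distance h."""
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            if len(h_neighbors(adj, alive, v, h)) < k:
                alive.remove(v)
                changed = True
    return alive


# Path graph 0-1-2-3-4 (assumed example input).
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
classic = kh_core(path, k=2, h=1)      # with h = 1 this is the classic 2-core
generalized = kh_core(path, k=2, h=2)  # every vertex has 2 others within distance 2
```

With h = 1 the sketch reduces to the classic k-core, so the path has an empty 2-core, while with h = 2 the whole path survives. Each pass recomputes a bounded breadth-first search per vertex, which is the cost the abstract describes as intrinsically high on large networks.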
Spartan Daily, September 10, 1993
Volume 101, Issue 9
Clustering and Community Detection with Imbalanced Clusters
Spectral clustering methods, which are frequently used in clustering and
community detection applications, are sensitive to the specific graph
construction, particularly when imbalanced clusters are present. We show that
ratio cut (RCut) or normalized cut (NCut) objectives are not tailored to
imbalanced cluster sizes since they tend to emphasize cut sizes over cut
values. We propose a graph partitioning problem that seeks minimum cut
partitions under minimum size constraints on partitions to deal with imbalanced
cluster sizes. Our approach parameterizes a family of graphs by adaptively
modulating node degrees on a fixed node set, yielding a set of
parameter-dependent cuts reflecting varying levels of imbalance. The solution to our
problem is then obtained by optimizing over these parameters. We present
rigorous limit cut analysis results to justify our approach and demonstrate the
superiority of our method through experiments on synthetic and real datasets
for data clustering, semi-supervised learning and community detection.
Comment: Extended version of arXiv:1309.2303 with new applications. Accepted
to IEEE TSIP
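
To make the claim about RCut and NCut concrete, the sketch below evaluates both objectives for a two-way partition using their standard textbook definitions: RCut divides the cut weight by the part sizes, and NCut divides it by the part volumes. It is not the method proposed in the abstract. The edge-weight dictionary and the function names (`cut_value`, `volume`, `rcut`, `ncut`) are assumptions made for illustration.

```python
def cut_value(weights, part):
    """Total weight of edges with exactly one endpoint in `part`."""
    return sum(w for (u, v), w in weights.items() if (u in part) != (v in part))


def volume(weights, part):
    """Sum of weighted degrees of the vertices in `part`."""
    return sum(w * ((u in part) + (v in part)) for (u, v), w in weights.items())


def rcut(weights, nodes, part):
    """Ratio cut of the bipartition (part, nodes - part)."""
    rest = nodes - part
    c = cut_value(weights, part)
    return c / len(part) + c / len(rest)


def ncut(weights, nodes, part):
    """Normalized cut of the bipartition (part, nodes - part)."""
    rest = nodes - part
    c = cut_value(weights, part)
    return c / volume(weights, part) + c / volume(weights, rest)


# Weighted path 0-1-2-3; each undirected edge is listed once (assumed example input).
nodes = {0, 1, 2, 3}
weights = {(0, 1): 0.9, (1, 2): 1.0, (2, 3): 1.0}

small = {0}        # imbalanced split with the smaller cut value
balanced = {0, 1}  # balanced split with the larger cut value
```

In this toy graph the imbalanced split `{0}` has the smaller cut value, yet both objectives score the balanced split `{0, 1}` lower. This is the preference for balanced part sizes over cut values that the abstract describes.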