Graph Distances and Clustering
With a view to graph clustering, we present a definition of vertex-to-vertex
distance which is based on shared connectivity. We argue that vertices sharing
more connections are closer to each other than vertices sharing fewer
connections. Our thesis is centered on the widely accepted notion that strong
clusters are formed by high levels of induced subgraph density, where subgraphs
represent clusters. We argue these clusters are formed by grouping vertices
deemed to be similar in their connectivity. At the cluster level (induced
subgraph level), our thesis translates into low mean intra-cluster distances.
Our definition differs from the usual shortest-path geodesic distance. In this
article, we compare three distance measures from the literature. Our benchmark
is how accurately each measure reflects intra-cluster density when
aggregated (averaged) at the cluster level. We conduct our tests on synthetic
graphs generated using the planted partition model, where clusters and
intra-cluster density are known in advance. We examine correlations between
mean intra-cluster distances and intra-cluster densities. Our numerical
experiments show that Jaccard and Otsuka-Ochiai offer very accurate measures of
density, when averaged over vertex pairs within clusters.
Comment: 11 pages, 4 figures
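
The experiment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, each distance is taken as one minus the corresponding similarity, and similarities are computed on closed neighbourhoods (each vertex counted as its own neighbour), which is an assumption the abstract does not confirm. The sketch generates a planted partition graph and averages a shared-connectivity distance over the vertex pairs of one cluster.

```python
import itertools
import math
import random


def planted_partition(num_clusters, cluster_size, p_in, p_out, seed=0):
    """Return (adjacency dict, cluster labels) for a planted partition graph.

    Vertex pairs in the same cluster are joined with probability p_in,
    pairs in different clusters with probability p_out.
    """
    rng = random.Random(seed)
    n = num_clusters * cluster_size
    labels = [v // cluster_size for v in range(n)]
    adj = {v: set() for v in range(n)}
    for u, v in itertools.combinations(range(n), 2):
        p = p_in if labels[u] == labels[v] else p_out
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj, labels


def jaccard_distance(adj, u, v):
    """1 - |N[u] & N[v]| / |N[u] | N[v]| on closed neighbourhoods (assumption)."""
    nu, nv = adj[u] | {u}, adj[v] | {v}
    return 1.0 - len(nu & nv) / len(nu | nv)


def ochiai_distance(adj, u, v):
    """1 - |N[u] & N[v]| / sqrt(|N[u]| * |N[v]|) on closed neighbourhoods (assumption)."""
    nu, nv = adj[u] | {u}, adj[v] | {v}
    return 1.0 - len(nu & nv) / math.sqrt(len(nu) * len(nv))


def mean_intra_cluster_distance(adj, labels, cluster, dist):
    """Average dist over all unordered vertex pairs inside one cluster."""
    members = [v for v, c in enumerate(labels) if c == cluster]
    pairs = list(itertools.combinations(members, 2))
    return sum(dist(adj, u, v) for u, v in pairs) / len(pairs)


if __name__ == "__main__":
    # Denser clusters should yield a lower mean intra-cluster distance.
    for p_in in (0.2, 0.5, 0.9):
        adj, labels = planted_partition(3, 30, p_in, 0.05, seed=1)
        j = mean_intra_cluster_distance(adj, labels, 0, jaccard_distance)
        o = mean_intra_cluster_distance(adj, labels, 0, ochiai_distance)
        print(f"p_in={p_in}: Jaccard={j:.3f}, Otsuka-Ochiai={o:.3f}")
```

Under these assumptions, a cluster that is a disconnected clique (p_in = 1, p_out = 0) has mean intra-cluster distance zero under both measures, and the mean distance grows as the intra-cluster edge probability falls.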