4,720 research outputs found

    Axioms for graph clustering quality functions

    Get PDF
    We investigate properties that intuitively ought to be satisfied by graph clustering quality functions, that is, functions that assign a score to a clustering of a graph. Graph clustering, also known as network community detection, is often performed by optimizing such a function. Two axioms tailored for graph clustering quality functions are introduced, and the four axioms introduced in previous work on distance based clustering are reformulated and generalized for the graph setting. We show that modularity, a standard quality function for graph clustering, does not satisfy all of these six properties. This motivates the derivation of a new family of quality functions, adaptive scale modularity, which does satisfy the proposed axioms. Adaptive scale modularity has two parameters, which give greater flexibility in the kinds of clusterings that can be found. Standard graph clustering quality functions, such as normalized cut and unnormalized cut, are obtained as special cases of adaptive scale modularity. In general, the results of our investigation indicate that the considered axiomatic framework covers existing `good' quality functions for graph clustering, and can be used to derive an interesting new family of quality functions.Comment: 23 pages. Full text and sources available on: http://www.cs.ru.nl/~T.vanLaarhoven/graph-clustering-axioms-2014

    Optimizing an Organized Modularity Measure for Topographic Graph Clustering: a Deterministic Annealing Approach

    Full text link
    This paper proposes an organized generalization of Newman and Girvan's modularity measure for graph clustering. Optimized via a deterministic annealing scheme, this measure produces topologically ordered graph clusterings that lead to faithful and readable graph representations based on clustering induced graphs. Topographic graph clustering provides an alternative to more classical solutions in which a standard graph clustering method is applied to build a simpler graph that is then represented with a graph layout algorithm. A comparative study on four real world graphs ranging from 34 to 1 133 vertices shows the interest of the proposed approach with respect to classical solutions and to self-organizing maps for graphs

    Identification of network modules by optimization of ratio association

    Get PDF
    We introduce a novel method for identifying the modular structures of a network based on the maximization of an objective function: the ratio association. This cost function arises when the communities detection problem is described in the probabilistic autoencoder frame. An analogy with kernel k-means methods allows to develop an efficient optimization algorithm, based on the deterministic annealing scheme. The performance of the proposed method is shown on a real data set and on simulated networks

    Multistep greedy algorithm identifies community structure in real-world and computer-generated networks

    Full text link
    We have recently introduced a multistep extension of the greedy algorithm for modularity optimization. The extension is based on the idea that merging l pairs of communities (l>1) at each iteration prevents premature condensation into few large communities. Here, an empirical formula is presented for the choice of the step width l that generates partitions with (close to) optimal modularity for 17 real-world and 1100 computer-generated networks. Furthermore, an in-depth analysis of the communities of two real-world networks (the metabolic network of the bacterium E. coli and the graph of coappearing words in the titles of papers coauthored by Martin Karplus) provides evidence that the partition obtained by the multistep greedy algorithm is superior to the one generated by the original greedy algorithm not only with respect to modularity but also according to objective criteria. In other words, the multistep extension of the greedy algorithm reduces the danger of getting trapped in local optima of modularity and generates more reasonable partitions.Comment: 17 pages, 2 figure

    Consensus clustering in complex networks

    Get PDF
    The community structure of complex networks reveals both their organization and hidden relationships among their constituents. Most community detection methods currently available are not deterministic, and their results typically depend on the specific random seeds, initial conditions and tie-break rules adopted for their execution. Consensus clustering is used in data analysis to generate stable results out of a set of partitions delivered by stochastic methods. Here we show that consensus clustering can be combined with any existing method in a self-consistent way, enhancing considerably both the stability and the accuracy of the resulting partitions. This framework is also particularly suitable to monitor the evolution of community structure in temporal networks. An application of consensus clustering to a large citation network of physics papers demonstrates its capability to keep track of the birth, death and diversification of topics.Comment: 11 pages, 12 figures. Published in Scientific Report

    Unifying Sparsest Cut, Cluster Deletion, and Modularity Clustering Objectives with Correlation Clustering

    Get PDF
    Graph clustering, or community detection, is the task of identifying groups of closely related objects in a large network. In this paper we introduce a new community-detection framework called LambdaCC that is based on a specially weighted version of correlation clustering. A key component in our methodology is a clustering resolution parameter, λ\lambda, which implicitly controls the size and structure of clusters formed by our framework. We show that, by increasing this parameter, our objective effectively interpolates between two different strategies in graph clustering: finding a sparse cut and forming dense subgraphs. Our methodology unifies and generalizes a number of other important clustering quality functions including modularity, sparsest cut, and cluster deletion, and places them all within the context of an optimization problem that has been well studied from the perspective of approximation algorithms. Our approach is particularly relevant in the regime of finding dense clusters, as it leads to a 2-approximation for the cluster deletion problem. We use our approach to cluster several graphs, including large collaboration networks and social networks

    Simplified Energy Landscape for Modularity Using Total Variation

    Get PDF
    Networks capture pairwise interactions between entities and are frequently used in applications such as social networks, food networks, and protein interaction networks, to name a few. Communities, cohesive groups of nodes, often form in these applications, and identifying them gives insight into the overall organization of the network. One common quality function used to identify community structure is modularity. In Hu et al. [SIAM J. App. Math., 73(6), 2013], it was shown that modularity optimization is equivalent to minimizing a particular nonconvex total variation (TV) based functional over a discrete domain. They solve this problem, assuming the number of communities is known, using a Merriman, Bence, Osher (MBO) scheme. We show that modularity optimization is equivalent to minimizing a convex TV-based functional over a discrete domain, again, assuming the number of communities is known. Furthermore, we show that modularity has no convex relaxation satisfying certain natural conditions. We therefore, find a manageable non-convex approximation using a Ginzburg Landau functional, which provably converges to the correct energy in the limit of a certain parameter. We then derive an MBO algorithm with fewer hand-tuned parameters than in Hu et al. and which is 7 times faster at solving the associated diffusion equation due to the fact that the underlying discretization is unconditionally stable. Our numerical tests include a hyperspectral video whose associated graph has 2.9x10^7 edges, which is roughly 37 times larger than was handled in the paper of Hu et al.Comment: 25 pages, 3 figures, 3 tables, submitted to SIAM J. App. Mat
    • …
    corecore