114 research outputs found

    Modularity of regular and treelike graphs

    Clustering algorithms for large networks typically use modularity values to test which partitions of the vertex set better represent structure in the data. The modularity of a graph is the maximum modularity of a partition. We consider the modularity of two kinds of graphs. For $r$-regular graphs with a given number of vertices, we investigate the minimum possible modularity, the typical modularity, and the maximum possible modularity. In particular, we see that for random cubic graphs the modularity is usually in the interval $(0.666, 0.804)$, and for random $r$-regular graphs with large $r$ it usually is of order $1/\sqrt{r}$. These results help to establish baselines for statistical tests on regular graphs. The modularity of cycles and low degree trees is known to be close to 1: we extend these results to `treelike' graphs, where the product of treewidth and maximum degree is much less than the number of edges. This yields for example the (deterministic) lower bound $0.666$ mentioned above on the modularity of random cubic graphs. Comment: 25 pages
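
To make the notion of partition modularity concrete, here is a minimal sketch in Python that evaluates the standard Newman-Girvan modularity of a given partition on a cycle. The function names (`modularity`, `cycle_edges`) and the choice of a 100-vertex cycle split into 10 arcs are illustrative assumptions, not taken from the paper; the graph's modularity is the maximum of this quantity over all partitions, which this sketch does not search for.

```python
from collections import defaultdict


def modularity(edges, part_of):
    """Newman-Girvan modularity of a vertex partition.

    edges   : list of (u, v) pairs of a simple undirected graph
    part_of : dict mapping each vertex to the label of its part
    Returns sum over parts A of e(A)/m - (vol(A)/(2m))^2, where e(A) is
    the number of edges inside A and vol(A) is the sum of degrees in A.
    """
    m = len(edges)
    internal = defaultdict(int)  # edges with both endpoints in the part
    volume = defaultdict(int)    # sum of degrees of the part's vertices
    for u, v in edges:
        volume[part_of[u]] += 1
        volume[part_of[v]] += 1
        if part_of[u] == part_of[v]:
            internal[part_of[u]] += 1
    return sum(internal[c] / m - (volume[c] / (2 * m)) ** 2 for c in volume)


def cycle_edges(n):
    """Edges of the cycle on vertices 0..n-1 (hypothetical helper)."""
    return [(i, (i + 1) % n) for i in range(n)]


n, k = 100, 10  # illustrative sizes: 100 vertices, 10 arcs of 10 vertices
edges = cycle_edges(n)
arcs = {v: v // (n // k) for v in range(n)}
one_part = {v: 0 for v in range(n)}

q_arcs = modularity(edges, arcs)     # 10 * (9/100 - (20/200)^2)
q_one = modularity(edges, one_part)  # 1 - 1 for the trivial partition
```

With these sizes each arc keeps 9 of its edges and has volume 20, so the arc partition scores about 0.8 while the trivial one-part partition scores 0. Splitting a longer cycle into roughly $\sqrt{n}$ arcs pushes the value toward 1, in line with the statement that cycles have modularity close to 1.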

    Communities and bottlenecks: Trees and treelike networks have high modularity

    Much effort has gone into understanding the modular nature of complex networks. Communities, also known as clusters or modules, are typically considered to be densely interconnected groups of nodes that are only sparsely connected to other groups in the network. Discovering high quality communities is a difficult and important problem in a number of areas. The most popular approach is the objective function known as modularity, used both to discover communities and to measure their strength. To understand the modular structure of networks it is then crucial to know how such functions evaluate different topologies, what features they account for, and what implicit assumptions they may make. We show that trees and treelike networks can have unexpectedly and often arbitrarily high values of modularity. This is surprising since trees are maximally sparse connected graphs and are not typically considered to possess modular structure, yet the nonlocal null model used by modularity assigns low probabilities, and thus high significance, to the densities of these sparse tree communities. We further study the practical performance of popular methods on model trees and on a genealogical data set and find that the discovered communities also have very high modularity, often approaching its maximum value. Statistical tests reveal the communities in trees to be significant, in contrast with known results for partitions of sparse, random graphs. Comment: 9 pages, 5 figures

    Put three and three together: Triangle-driven community detection

    Community detection has arisen as one of the most relevant topics in the field of graph data mining due to its applications in many fields such as biology, social networks, or network traffic analysis. Although the existing metrics used to quantify the quality of a community work well in general, under some circumstances they fail to capture this notion correctly. The main reason is that these metrics consider the internal community edges as a set, but ignore how these actually connect the vertices of the community. We propose the Weighted Community Clustering (WCC), which is a new community metric that takes the triangle instead of the edge as the minimal structural motif indicating the presence of a strong relation in a graph. We theoretically analyse WCC in depth and formally prove, by means of a set of properties, that the maximization of WCC guarantees communities with cohesion and structure. In addition, we propose Scalable Community Detection (SCD), a community detection algorithm based on WCC, which is designed to be fast and scalable on SMP machines, showing experimentally that WCC correctly captures the concept of community in social networks using real datasets. Finally, using ground-truth data, we show that SCD provides better quality than the best disjoint community detection algorithms of the state of the art while performing faster. Peer Reviewed. Postprint (author's final draft)

    Spectral redemption: clustering sparse networks

    Spectral algorithms are classic approaches to clustering and community detection in networks. However, for sparse networks the standard versions of these algorithms are suboptimal, in some cases completely failing to detect communities even when other algorithms such as belief propagation can do so. Here we introduce a new class of spectral algorithms based on a non-backtracking walk on the directed edges of the graph. The spectrum of this operator is much better-behaved than that of the adjacency matrix or other commonly used matrices, maintaining a strong separation between the bulk eigenvalues and the eigenvalues relevant to community structure even in the sparse case. We show that our algorithm is optimal for graphs generated by the stochastic block model, detecting communities all the way down to the theoretical limit. We also show the spectrum of the non-backtracking operator for some real-world networks, illustrating its advantages over traditional spectral clustering. Comment: 11 pages, 6 figures. Clarified to what extent our claims are rigorous, and to what extent they are conjectures; also added an interpretation of the eigenvectors of the 2n-dimensional version of the non-backtracking matrix
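
As a rough illustration of the operator this abstract refers to, the sketch below builds a non-backtracking matrix on the directed edges of a small graph in plain Python. It assumes the usual convention that the entry for (u→v, v→w) is 1 exactly when w ≠ u; the function name and the example graph are hypothetical, and the clustering step itself (computing leading eigenvectors with a numerical library) is deliberately left out.

```python
def non_backtracking_matrix(edges):
    """Non-backtracking matrix of a simple undirected graph.

    Each undirected edge {u, v} yields two directed edges u->v and v->u.
    Entry B[i][j] is 1 when directed edge i = (u->v) can be followed by
    directed edge j = (v->w) with w != u, i.e. the walk does not
    immediately return along the edge it just used.
    """
    directed = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
    index = {e: i for i, e in enumerate(directed)}
    size = len(directed)
    B = [[0] * size for _ in range(size)]
    for (u, v) in directed:
        for (x, w) in directed:
            if x == v and w != u:
                B[index[(u, v)]][index[(x, w)]] = 1
    return directed, B


# Illustrative graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
example_edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
directed, B = non_backtracking_matrix(example_edges)

# Row sum for u->v equals deg(v) - 1: the number of ways to continue
# from v without stepping back to u.
row_sums = {e: sum(B[i]) for i, e in enumerate(directed)}
```

The matrix has one row per directed edge, so a graph with m edges gives a 2m × 2m matrix. The row for 2→3 is all zeros because vertex 3 is a leaf, which shows how walks into dangling trees die out under this operator.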

    Modularity of minor-free graphs

    We prove that a class of graphs with an excluded minor and with the maximum degree sublinear in the number of edges is maximally modular, that is, modularity tends to 1 as the number of edges tends to infinity. Comment: 7 pages, 1 figure

    The parameterised complexity of computing the maximum modularity of a graph

    The maximum modularity of a graph is a parameter widely used to describe the level of clustering or community structure in a network. Determining the maximum modularity of a graph is known to be NP-complete in general, and in practice a range of heuristics are used to construct partitions of the vertex set which give lower bounds on the maximum modularity, but without any guarantee on how close these bounds are to the true maximum. In this paper we investigate the parameterised complexity of determining the maximum modularity with respect to various standard structural parameterisations of the input graph G. We show that the problem belongs to FPT when parameterised by the size of a minimum vertex cover for G, and is solvable in polynomial time whenever the treewidth or max leaf number of G is bounded by some fixed constant; we also obtain an FPT algorithm, parameterised by treewidth, to compute any constant-factor approximation to the maximum modularity. On the other hand we show that the problem is W[1]-hard (and hence unlikely to admit an FPT algorithm) when parameterised simultaneously by pathwidth and the size of a minimum feedback vertex set.

    On the modularity of 3-regular random graphs and random graphs with given degree sequences

    The modularity of a graph is a parameter introduced by Newman and Girvan measuring its community structure; the higher its value (between $0$ and $1$), the more clustered a graph is. In this paper we show that the modularity of a random $3$-regular graph is at least $0.667026$ asymptotically almost surely (a.a.s.), thereby proving a conjecture of McDiarmid and Skerman stating that a random $3$-regular graph has modularity strictly larger than $\frac{2}{3}$ a.a.s. We also improve the upper bound given therein by showing that the modularity of such a graph is a.a.s. at most $0.789998$. For a uniformly chosen graph $G_n$ over a given bounded degree sequence with average degree $d(G_n)$ and with $|CC(G_n)|$ many connected components, we distinguish two regimes with respect to the existence of a giant component. In more detail, we precisely compute the second term of the modularity in the subcritical regime. In the supercritical regime, we further prove that there is $\varepsilon > 0$ depending on the degree sequence, for which the modularity is a.a.s. at least \begin{equation*} \dfrac{2\left(1 - \mu\right)}{d(G_n)}+\varepsilon, \end{equation*} where $\mu$ is the asymptotically almost sure limit of $\dfrac{|CC(G_n)|}{n}$. Comment: 41 pages

    Critical phenomena in complex networks

    The combination of the compactness of networks, featuring small diameters, and their complex architectures results in a variety of critical effects dramatically different from those in cooperative systems on lattices. In the last few years, researchers have made important steps toward understanding the qualitatively new critical phenomena in complex networks. We review the results, concepts, and methods of this rapidly developing field. Here we mostly consider two closely related classes of these critical phenomena, namely structural phase transitions in the network architectures and transitions in cooperative models on networks as substrates. We also discuss systems where a network and interacting agents on it influence each other. We overview a wide range of critical phenomena in equilibrium and growing networks including the birth of the giant connected component, percolation, k-core percolation, phenomena near epidemic thresholds, condensation transitions, critical phenomena in spin models placed on networks, synchronization, and self-organized criticality effects in interacting systems on networks. We also discuss strong finite size effects in these systems and highlight open problems and perspectives. Comment: Review article, 79 pages, 43 figures, 1 table, 508 references, extende