Modularity of regular and treelike graphs
Clustering algorithms for large networks typically use modularity values to
test which partitions of the vertex set better represent structure in the data.
The modularity of a graph is the maximum modularity of a partition. We consider
the modularity of two kinds of graphs.
For -regular graphs with a given number of vertices, we investigate the
minimum possible modularity, the typical modularity, and the maximum possible
modularity. In particular, we see that for random cubic graphs the modularity
is usually in the interval , and for random -regular graphs
with large it usually is of order . These results help to
establish baselines for statistical tests on regular graphs.
The modularity of cycles and low degree trees is known to be close to 1: we
extend these results to `treelike' graphs, where the product of treewidth and
maximum degree is much less than the number of edges. This yields for example
the (deterministic) lower bound mentioned above on the modularity of
random cubic graphs.
Comment: 25 pages
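As a rough illustration of the claim that cycles have modularity close to 1, the sketch below evaluates the standard Newman-Girvan modularity of one particular partition of a cycle. The helper names (`modularity`, `cycle_edges`, `block_partition`) and the choice of block size are assumptions made for this example, not taken from the paper. The value it computes is only a lower bound on the graph's modularity, since that is defined as a maximum over all partitions.

```python
from collections import defaultdict

def modularity(edges, part_of):
    """Newman-Girvan modularity of a vertex partition of a simple undirected graph.

    edges: list of (u, v) pairs; part_of: dict mapping vertex -> part label.
    Each part contributes e(A)/m - (vol(A)/(2m))^2, where e(A) counts edges
    inside the part and vol(A) is the sum of degrees of its vertices.
    """
    m = len(edges)
    internal = defaultdict(int)  # edges with both endpoints in the part
    volume = defaultdict(int)    # sum of degrees over the part
    for u, v in edges:
        volume[part_of[u]] += 1
        volume[part_of[v]] += 1
        if part_of[u] == part_of[v]:
            internal[part_of[u]] += 1
    return sum(internal[p] / m - (volume[p] / (2 * m)) ** 2 for p in volume)

def cycle_edges(n):
    """Edges of the cycle on vertices 0..n-1 (hypothetical helper)."""
    return [(i, (i + 1) % n) for i in range(n)]

def block_partition(n, k):
    """Split vertices 0..n-1 into consecutive blocks of size k (hypothetical helper)."""
    return {i: i // k for i in range(n)}

n, k = 10_000, 100
q = modularity(cycle_edges(n), block_partition(n, k))
print(q)  # a lower bound on the modularity of the 10,000-cycle
```

Here each of the 100 blocks keeps 99 of its edges internal, so the partition scores 100 * (99/10000 - (200/20000)^2) = 0.98. Putting every vertex in a single part scores 0.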
Communities and bottlenecks: Trees and treelike networks have high modularity
Much effort has gone into understanding the modular nature of complex
networks. Communities, also known as clusters or modules, are typically
considered to be densely interconnected groups of nodes that are only sparsely
connected to other groups in the network. Discovering high quality communities
is a difficult and important problem in a number of areas. The most popular
approach is the objective function known as modularity, used both to discover
communities and to measure their strength. To understand the modular structure
of networks it is then crucial to know how such functions evaluate different
topologies, what features they account for, and what implicit assumptions they
may make. We show that trees and treelike networks can have unexpectedly and
often arbitrarily high values of modularity. This is surprising since trees are
maximally sparse connected graphs and are not typically considered to possess
modular structure, yet the nonlocal null model used by modularity assigns low
probabilities, and thus high significance, to the densities of these sparse
tree communities. We further study the practical performance of popular methods
on model trees and on a genealogical data set and find that the discovered
communities also have very high modularity, often approaching its maximum
value. Statistical tests reveal the communities in trees to be significant, in
contrast with known results for partitions of sparse, random graphs.
Comment: 9 pages, 5 figures
Put three and three together: Triangle-driven community detection
Community detection has arisen as one of the most relevant topics in the field of graph data mining due to its applications in many fields such as biology, social networks, or network traffic analysis. Although the existing metrics used to quantify the quality of a community work well in general, under some circumstances they fail to capture this notion correctly. The main reason is that these metrics consider the internal community edges as a set, but ignore how these edges actually connect the vertices of the community. We propose Weighted Community Clustering (WCC), a new community metric that takes the triangle instead of the edge as the minimal structural motif indicating the presence of a strong relation in a graph. We theoretically analyse WCC in depth and formally prove, by means of a set of properties, that the maximization of WCC guarantees communities with cohesion and structure. In addition, we propose Scalable Community Detection (SCD), a community detection algorithm based on WCC, which is designed to be fast and scalable on SMP machines, and we show experimentally, using real datasets, that WCC correctly captures the concept of community in social networks. Finally, using ground-truth data, we show that SCD provides better quality than the best state-of-the-art disjoint community detection algorithms while running faster.
Peer Reviewed
Postprint (author's final draft)
Spectral redemption: clustering sparse networks
Spectral algorithms are classic approaches to clustering and community
detection in networks. However, for sparse networks the standard versions of
these algorithms are suboptimal, in some cases completely failing to detect
communities even when other algorithms such as belief propagation can do so.
Here we introduce a new class of spectral algorithms based on a
non-backtracking walk on the directed edges of the graph. The spectrum of this
operator is much better-behaved than that of the adjacency matrix or other
commonly used matrices, maintaining a strong separation between the bulk
eigenvalues and the eigenvalues relevant to community structure even in the
sparse case. We show that our algorithm is optimal for graphs generated by the
stochastic block model, detecting communities all the way down to the
theoretical limit. We also show the spectrum of the non-backtracking operator
for some real-world networks, illustrating its advantages over traditional
spectral clustering.
Comment: 11 pages, 6 figures. Clarified to what extent our claims are
rigorous, and to what extent they are conjectures; also added an
interpretation of the eigenvectors of the 2n-dimensional version of the
non-backtracking matrix
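A minimal sketch of the operator described in this abstract, assuming the usual definition of a non-backtracking walk: a step along directed edge (u -> v) may be followed by any directed edge (v -> w) with w != u. The function name `non_backtracking_matrix` and the indexing convention are assumptions for illustration and may differ from the paper's. The sketch builds the matrix indexed by the 2m directed edges, not the 2n-dimensional version mentioned in the comment, and it does not perform the spectral clustering step itself.

```python
def non_backtracking_matrix(edges):
    """Return (directed_edges, B) for a simple undirected graph.

    B[i][j] = 1 when directed edge i = (u -> v) can be followed by
    directed edge j = (v -> w) with w != u, i.e. without backtracking.
    """
    # Each undirected edge gives two directed edges, so B is 2m x 2m.
    directed = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
    index = {e: i for i, e in enumerate(directed)}
    size = len(directed)
    B = [[0] * size for _ in range(size)]

    # Out-neighbours of every vertex, used to list possible next steps.
    out = {}
    for u, v in directed:
        out.setdefault(u, []).append(v)

    for (u, v), i in index.items():
        for w in out[v]:
            if w != u:  # forbid the immediate return v -> u
                B[i][index[(v, w)]] = 1
    return directed, B

# Example: the complete graph on 4 vertices, which is 3-regular.
k4 = [(a, b) for a in range(4) for b in range(a + 1, 4)]
directed, B = non_backtracking_matrix(k4)
print(len(B), {sum(row) for row in B})
```

Each row sum equals the degree of the edge's head minus one, so for a d-regular graph every row sums to d - 1. In practice one would pass such a matrix to an eigenvalue routine and inspect the leading eigenvectors.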
Modularity of minor-free graphs
We prove that a class of graphs with an excluded minor and with the maximum
degree sublinear in the number of edges is maximally modular, that is,
modularity tends to 1 as the number of edges tends to infinity.
Comment: 7 pages, 1 figure
The parameterised complexity of computing the maximum modularity of a graph
The maximum modularity of a graph is a parameter widely used to describe the level of clustering or community structure in a network. Determining the maximum modularity of a graph is known to be NP-complete in general, and in practice a range of heuristics are used to construct partitions of the vertex-set which give lower bounds on the maximum modularity but without any guarantee on how close these bounds are to the true maximum. In this paper we investigate the parameterised complexity of determining the maximum modularity with respect to various standard structural parameterisations of the input graph G. We show that the problem belongs to FPT when parameterised by the size of a minimum vertex cover for G, and is solvable in polynomial time whenever the treewidth or max leaf number of G is bounded by some fixed constant; we also obtain an FPT algorithm, parameterised by treewidth, to compute any constant-factor approximation to the maximum modularity. On the other hand, we show that the problem is W[1]-hard (and hence unlikely to admit an FPT algorithm) when parameterised simultaneously by pathwidth and the size of a minimum feedback vertex set.
On the modularity of 3-regular random graphs and random graphs with given degree sequences
The modularity of a graph is a parameter introduced by Newman and Girvan
measuring its community structure; the higher its value (between 0 and 1),
the more clustered a graph is.
In this paper we show that the modularity of a random regular graph is at
least asymptotically almost surely (a.a.s.), thereby proving a
conjecture of McDiarmid and Skerman stating that a random regular graph has
modularity strictly larger than a.a.s. We also improve the upper
bound given therein by showing that the modularity of such a graph is a.a.s. at
most .
For a uniformly chosen graph over a given bounded degree sequence with
average degree and with many connected components, we
distinguish two regimes with respect to the existence of a giant component. In
more detail, we precisely compute the second term of the modularity in the
subcritical regime. In the supercritical regime, we further prove that there is
depending on the degree sequence, for which the modularity is
a.a.s. at least \begin{equation*} \dfrac{2\left(1 -
\mu\right)}{d(G_n)}+\varepsilon, \end{equation*} where is the
asymptotically almost sure limit of .
Comment: 41 pages
Determinism and boundedness of self-assembling structures.
Self-assembly processes are widespread in nature and lie at the heart of many biological and physical phenomena. The characteristics of self-assembly building blocks determine the structures that they form. Two crucial properties are the determinism and boundedness of the self-assembly. The former tells us whether the same set of building blocks always generates the same structure, and the latter whether it grows indefinitely. These properties are highly relevant in the context of protein structures, as the difference between deterministic protein self-assembly and nondeterministic protein aggregation is central to a number of diseases. Here we introduce a graph theoretical approach that can assess determinism and boundedness for several geometries and dimensionalities of self-assembly more accurately and quickly than conventional methods. We apply this methodology to a previously studied lattice self-assembly model and discuss generalizations to a wide range of other self-assembling systems.
Critical phenomena in complex networks
The combination of the compactness of networks, featuring small diameters,
and their complex architectures results in a variety of critical effects
dramatically different from those in cooperative systems on lattices. In the
last few years, researchers have made important steps toward understanding the
qualitatively new critical phenomena in complex networks. We review the
results, concepts, and methods of this rapidly developing field. Here we mostly
consider two closely related classes of these critical phenomena, namely
structural phase transitions in the network architectures and transitions in
cooperative models on networks as substrates. We also discuss systems where a
network and interacting agents on it influence each other. We overview a wide
range of critical phenomena in equilibrium and growing networks including the
birth of the giant connected component, percolation, k-core percolation,
phenomena near epidemic thresholds, condensation transitions, critical
phenomena in spin models placed on networks, synchronization, and
self-organized criticality effects in interacting systems on networks. We also
discuss strong finite size effects in these systems and highlight open problems
and perspectives.
Comment: Review article, 79 pages, 43 figures, 1 table, 508 references,
extende