A smart local moving algorithm for large-scale modularity-based community detection
We introduce a new algorithm for modularity-based community detection in
large networks. The algorithm, which we refer to as a smart local moving
algorithm, takes advantage of a well-known local moving heuristic that is also
used by other algorithms. Compared with these other algorithms, our proposed
algorithm uses the local moving heuristic in a more sophisticated way. Based on
an analysis of a diverse set of networks, we show that our smart local moving
algorithm identifies community structures with higher modularity values than
other algorithms for large-scale modularity optimization, including the
popular 'Louvain algorithm' introduced by Blondel et al. (2008). The
computational efficiency of our algorithm makes it possible to perform
community detection in networks with tens of millions of nodes and hundreds of
millions of edges. Our smart local moving algorithm also performs well in small
and medium-sized networks. In short computing times, it identifies community
structures with modularity values equally high as, or almost as high as, the
highest values reported in the literature, and sometimes even higher than the
highest values found in the literature.
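The local moving heuristic mentioned in this abstract can be illustrated with a minimal sketch. It is not the smart local moving algorithm itself, only the basic step that such algorithms build on: each node is moved to the neighbouring community that yields the largest modularity gain, until no move helps. The function names and the edge-list representation are assumptions made for illustration, and modularity is recomputed from scratch for every candidate move, so the sketch only suits small graphs.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity of an undirected, unweighted graph.

    edges: list of (u, v) pairs (assumed non-empty, no self-loops).
    community: dict mapping every node to a community label.
    """
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)  # edges with both endpoints in one community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    total_degree = defaultdict(int)  # sum of node degrees per community
    for node, d in degree.items():
        total_degree[community[node]] += d
    return sum(internal[c] / m - (total_degree[c] / (2 * m)) ** 2
               for c in total_degree)


def local_moving(edges, community):
    """Greedily move nodes to neighbouring communities while modularity improves."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    community = dict(community)  # work on a copy
    improved = True
    while improved:
        improved = False
        for node in sorted(neighbors):
            current = community[node]
            best_label = current
            best_q = modularity(edges, community)
            # Try every community that a neighbour of this node belongs to.
            for label in sorted({community[n] for n in neighbors[node]}):
                community[node] = label
                q = modularity(edges, community)
                if q > best_q + 1e-12:
                    best_label, best_q = label, q
            community[node] = best_label
            if best_label != current:
                improved = True
    return community


# Two triangles joined by a single bridge edge, starting from singletons.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
start = {node: node for node in range(6)}
result = local_moving(edges, start)
print(modularity(edges, start), modularity(edges, result))
```

Every accepted move strictly increases modularity, so the loop terminates. Production implementations compute the gain of a move incrementally instead of re-evaluating the whole partition.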
Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale
Notions of community quality underlie network clustering. While studies
surrounding network clustering are increasingly common, a precise understanding
of the relationship between different cluster quality metrics is lacking. In
this paper, we examine the relationship between stand-alone cluster quality
metrics and information recovery metrics through a rigorous analysis of four
widely-used network clustering algorithms -- Louvain, Infomap, label
propagation, and smart local moving. We consider the stand-alone quality
metrics of modularity, conductance, and coverage, and we consider the
information recovery metrics of adjusted Rand score, normalized mutual
information, and a variant of normalized mutual information used in previous
work. Our study includes both synthetic graphs and empirical data sets of sizes
varying from 1,000 to 1,000,000 nodes.
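As a rough illustration of two of the stand-alone metrics named above, the following sketch computes coverage and conductance for a small undirected, unweighted graph. The definitions used are the common textbook ones (coverage as the fraction of intra-cluster edges, conductance of a cluster as cut size divided by the smaller of the two volumes); the study may aggregate or rescale them differently, and the function names are assumptions.

```python
def coverage(edges, community):
    """Fraction of edges whose endpoints lie in the same cluster."""
    intra = sum(1 for u, v in edges if community[u] == community[v])
    return intra / len(edges)


def conductance(edges, cluster):
    """Cut size of `cluster` divided by the smaller of the two volumes.

    The volume of a node set is the sum of its node degrees.
    Lower values indicate a better-separated cluster.
    """
    cluster = set(cluster)
    cut = vol_in = vol_out = 0
    for u, v in edges:
        for endpoint in (u, v):
            if endpoint in cluster:
                vol_in += 1
            else:
                vol_out += 1
        if (u in cluster) != (v in cluster):
            cut += 1
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


# Two triangles joined by one bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(coverage(edges, community))
print(conductance(edges, {0, 1, 2}))
```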
We find significant differences among the results of the different cluster
quality metrics. For example, clustering algorithms can return a value of 0.4
out of 1 on modularity but score 0 out of 1 on information recovery. We find
conductance, though imperfect, to be the stand-alone quality metric that best
indicates performance on information recovery metrics. Our study shows that the
variant of normalized mutual information used in previous work cannot be
assumed to differ only slightly from traditional normalized mutual information.
Smart local moving is the best performing algorithm in our study, but
discrepancies between cluster evaluation metrics prevent us from declaring it
absolutely superior. Louvain performed better than Infomap in nearly all the
tests in our study, contradicting the results of previous work in which Infomap
was superior to Louvain. We find that although label propagation performs
poorly when clusters are less clearly defined, it scales efficiently and
accurately to large graphs with well-defined clusters.
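The information recovery metrics in this abstract compare a detected clustering with a known ground truth. A minimal sketch of normalized mutual information follows, assuming the arithmetic mean of the two entropies as the normalizer; other normalizations exist, and this is not the variant from previous work that the abstract refers to. The function name is an assumption.

```python
import math
from collections import Counter


def nmi(labels_a, labels_b):
    """Normalized mutual information between two labelings of the same items."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    def entropy(counts):
        return -sum(c / n * math.log(c / n) for c in counts.values())

    mutual_info = sum(
        c / n * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    normalizer = (entropy(count_a) + entropy(count_b)) / 2
    # Both labelings put everything in one cluster: treat them as identical.
    return mutual_info / normalizer if normalizer > 0 else 1.0


truth = [0, 0, 1, 1]
print(nmi(truth, ["x", "x", "y", "y"]))  # same partition, different label names
print(nmi(truth, [0, 1, 0, 1]))          # partition independent of the truth
```

A score computed this way ignores the graph entirely, which is one reason a partition can reach a reasonable modularity while recovering none of the ground-truth structure.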
A Fast and Efficient Incremental Approach toward Dynamic Community Detection
Community detection is a discovery tool used by network scientists to analyze
the structure of real-world networks. It seeks to identify natural divisions
that may exist in the input networks that partition the vertices into coherent
modules (or communities). While this problem space is rich with efficient
algorithms and software, most of this literature caters to the static use-case
where the underlying network does not change. However, many emerging real-world
use-cases give rise to a need to incorporate dynamic graphs as inputs.
In this paper, we present a fast and efficient incremental approach toward
dynamic community detection. The key contribution is a generic technique that
examines the most recent batch of changes made to an
input graph and selects a subset of vertices to reevaluate for potential
community (re)assignment. This technique can be incorporated into any of the
community detection methods that use modularity as their objective function for
clustering. For demonstration purposes, we incorporated the technique into two
well-known community detection tools. Our experiments demonstrate that our new
incremental approach is able to generate performance speedups without
compromising on the output quality (despite its heuristic nature). For
instance, on a real-world network with 63M temporal edges (over 12 time steps),
our approach was able to complete in 1056 seconds, yielding a 3x speedup over a
baseline implementation. In addition to demonstrating the performance benefits,
we also show how to use our approach to delineate appropriate intervals of
temporal resolutions at which to analyze an input network.
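The general idea of re-examining only part of the graph after a batch of changes can be sketched as follows. This is a deliberately naive stand-in and not the selection rule proposed in the paper, which the abstract does not spell out: it simply marks the endpoints of every changed edge, together with their current neighbours, for re-evaluation. The function name and the adjacency-dict representation are assumptions.

```python
def affected_vertices(adjacency, batch):
    """Pick vertices to re-evaluate after a batch of edge changes.

    adjacency: dict mapping node -> set of neighbours (after the batch is applied).
    batch: iterable of (u, v) edges that were added or removed.
    Naive rule (an assumption, not the paper's): endpoints plus their neighbours.
    """
    selected = set()
    for u, v in batch:
        for endpoint in (u, v):
            selected.add(endpoint)
            selected.update(adjacency.get(endpoint, ()))
    return selected


adjacency = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}, 4: set()}
print(sorted(affected_vertices(adjacency, [(0, 1)])))
```

Only the selected vertices would then be offered to the modularity-based method for possible reassignment, while all other vertices keep their community from the previous time step.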
Generalized Markov stability of network communities
We address the problem of community detection in networks by introducing a
general definition of Markov stability, based on the difference between the
probability fluxes of a Markov chain on the network at different time scales.
The specific implementation of the quality function and the resulting optimal
community structure thus become dependent both on the type of Markov process
and on the specific Markov times considered. For instance, if we use a natural
Markov chain dynamics and discount its stationary distribution -- that is, we
take as reference process the dynamics at infinite time -- we obtain the
standard formulation of the Markov stability. Notably, the possibility to use
finite-time transition probabilities to define the reference process naturally
allows detecting communities at different resolutions, without the need to
consider a continuous-time Markov chain in the small time limit. The main
advantage of our general formulation of Markov stability based on dynamical
flows is that we work with lumped Markov chains on network partitions, having
the same stationary distribution as the original process. In this way the form
of the quality function becomes invariant under partitioning, leading to a
self-consistent definition of community structures at different aggregation
scales.
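For orientation, the standard formulation of Markov stability that the abstract recovers as a special case can be written as below. The notation is an assumption, since the abstract defines no symbols: $P$ is the transition matrix of the Markov chain, $\pi$ its stationary distribution as a column vector, $\Pi = \mathrm{diag}(\pi)$, and $H$ the $n \times c$ indicator matrix of a partition into $c$ communities.

```latex
% Standard Markov stability of a partition H at Markov time t
r(t; H) = \operatorname{trace}\!\left[ H^{\top} \left( \Pi P^{t} - \pi \pi^{\top} \right) H \right]

% The subtracted term is the flux at infinite time: for an ergodic chain,
% P^{t} \to \mathbf{1} \pi^{\top} as t \to \infty, hence
\lim_{t \to \infty} \Pi P^{t} = \pi \pi^{\top}
```

For a simple random walk on an undirected graph, $r(1; H)$ coincides with the Newman-Girvan modularity. The exact generalized quality function, with a finite-time reference process, is given in the paper and is not reproduced here.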
Community Detection in Social Networks
The rise of the Internet has brought people closer. The number of interactions between people across the globe has risen substantially due to social awareness, advances in technology, and digital interaction. Social networking sites have built virtual societies and communities. These societies are often displayed as a network of nodes depicting people and edges depicting relationships or links. This is a good and efficient way to store, model and represent systems that carry complex and rich information. Towards that goal we need to find effective, quick methods to analyze social networks. One possible solution is community detection. Community detection deals with finding clusters, or groups, in a network. Detecting such communities is very important in many fields in order to understand and extract information from complex systems. The problem is very hard and has been studied extensively for the past few years. In this project, we will define the problem, study existing methods, propose new methods, and experimentally evaluate them using synthetic and real datasets. Additionally, we will describe applications to smart city communities and challenges that have to be resolved.