Clustering and Community Detection in Directed Networks: A Survey
Networks (or graphs) appear as dominant structures in diverse domains,
including sociology, biology, neuroscience and computer science. In most of the
aforementioned cases graphs are directed - in the sense that there is
directionality on the edges, making the semantics of the edges non symmetric.
An interesting feature that real networks present is the clustering or
community structure property, under which the graph topology is organized into
modules commonly called communities or clusters. The essence here is that nodes
of the same community are highly similar while on the contrary, nodes across
communities present low similarity. Revealing the underlying community
structure of directed complex networks has become a crucial and
interdisciplinary topic with a plethora of applications. Naturally, there has
been a recent wealth of research in the area of mining directed
graphs - with clustering being the primary method and tool for community
detection and evaluation. The goal of this paper is to offer an in-depth review
of the methods presented so far for clustering directed networks along with the
relevant necessary methodological background and also related applications. The
survey commences by offering a concise review of the fundamental concepts and
methodological base on which graph clustering algorithms capitalize. Then we
present the relevant work along two orthogonal classifications. The first one
is mostly concerned with the methodological principles of the clustering
algorithms, while the second one approaches the methods from the viewpoint
regarding the properties of a good cluster in a directed network. Further, we
present methods and metrics for evaluating graph clustering results,
demonstrate interesting application domains and provide promising future
research directions.
Comment: 86 pages, 17 figures. Physics Reports Journal (To Appear
Overlapping modularity at the critical point of k-clique percolation
One of the most remarkable social phenomena is the formation of communities
in social networks corresponding to families, friendship circles, work teams,
etc. Since people usually belong to several different communities at the same
time, the induced overlaps result in an extremely complicated web of the
communities themselves. Thus, uncovering the intricate community structure of
social networks is a non-trivial task with great potential for practical
applications, and it has attracted notable interest in recent years. The Clique
Percolation Method (CPM) is one of the earliest overlapping community finding
methods, which was already used in the analysis of several different social
networks. In this approach the communities correspond to k-clique percolation
clusters, and the general heuristic for setting the parameters of the method is
to tune the system just below the critical point of k-clique percolation.
However, this rule is based on simple physical principles and its validity was
never subject to quantitative analysis. Here we examine the quality of the
partitioning in the vicinity of the critical point using recently introduced
overlapping modularity measures. According to our results on real social and
other networks, the overlapping modularities show a maximum close to the
critical point, justifying the original criteria for the optimal parameter
settings.
Comment: 20 pages, 6 figure
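The k-clique percolation idea in the abstract above can be illustrated with a minimal sketch. It assumes an undirected simple graph given as an edge list and uses brute-force clique enumeration, so it is only suitable for toy inputs; the function name `k_clique_communities` and the example graph are hypothetical and not taken from the paper. Two k-cliques are treated as adjacent when they share k-1 nodes, and each community is the union of a chain of adjacent k-cliques.

```python
from itertools import combinations


def k_clique_communities(edges, k):
    """Return communities as node sets: unions of k-cliques linked by sharing k-1 nodes."""
    # Build an undirected adjacency map from the edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Brute-force enumeration of all k-cliques (toy graphs only).
    cliques = [
        frozenset(c)
        for c in combinations(sorted(adj), k)
        if all(b in adj[a] for a, b in combinations(c, 2))
    ]

    # Union-find over cliques; adjacent cliques share exactly k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # Merge the node sets of all cliques in the same percolation cluster.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return list(groups.values())


# Two triangles that share only node 3: they do not percolate for k = 3,
# so node 3 ends up in both communities (an overlap).
toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (3, 5)]
communities = k_clique_communities(toy_edges, 3)
```

In this toy graph node 3 belongs to both communities, which is the kind of overlap the method is designed to expose. The sketch does not cover how to choose k or any edge-weight threshold, which is the parameter-tuning question the abstract addresses.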
Ground truth? Concept-based communities versus the external classification of physics manuscripts
Community detection techniques are widely used to infer hidden structures
within interconnected systems. Despite demonstrating high accuracy on
benchmarks, they reproduce the external classification for many real-world
systems with a significant level of discrepancy. A widely accepted reason
behind such an outcome is the unavoidable loss of non-topological information
(such as node attributes) encountered when the original complex system is
represented as a network. In this article we emphasize that the observed
discrepancies may also be caused by a different reason: the external
classification itself. To this end we use scientific publication data which i)
exhibit a well defined modular structure and ii) hold an expert-made
classification of research articles. Having represented the articles and the
extracted scientific concepts both as a bipartite network and as its unipartite
projection, we applied modularity optimization to uncover the inner thematic
structure. The resulting clusters are shown to partly reflect the author-made
classification, although some significant discrepancies are observed. A
detailed analysis of these discrepancies shows that they carry essential
information about the system, mainly related to the use of similar techniques
and methods across different (sub)disciplines, that is otherwise omitted when
only the external classification is considered.
Comment: 15 pages, 2 figure
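As a rough illustration of the quantity that modularity optimization tries to maximize, the sketch below computes the standard Newman-Girvan modularity of a given partition. It assumes an undirected, unweighted simple graph without self-loops; the function name `modularity` and the toy graph are hypothetical, and the sketch covers neither the bipartite variant nor the optimization procedure the paper may have used.

```python
def modularity(edges, community_of):
    """Newman-Girvan modularity Q of a partition of an undirected simple graph.

    edges        -- list of (u, v) pairs, each undirected edge listed once
    community_of -- dict mapping every node to its community label
    """
    m = len(edges)
    degree = {}    # node -> degree
    internal = {}  # community -> number of edges with both ends inside it
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if community_of[u] == community_of[v]:
            c = community_of[u]
            internal[c] = internal.get(c, 0) + 1

    degree_sum = {}  # community -> total degree of its nodes
    for node, d in degree.items():
        c = community_of[node]
        degree_sum[c] = degree_sum.get(c, 0) + d

    # Q = sum over communities of (internal edge fraction - expected fraction).
    return sum(
        internal.get(c, 0) / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Two triangles joined by a single bridge edge (2, 3).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q_split = modularity(toy_edges, split)
```

Splitting the toy graph into its two triangles gives a positive Q, while putting every node into one community gives Q = 0, which is why an optimizer would prefer the split.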
Detecting Cohesive and 2-mode Communities in Directed and Undirected Networks
Networks are a general language for representing relational information among
objects. An effective way to model, reason about, and summarize networks is to
discover sets of nodes with common connectivity patterns. Such sets are
commonly referred to as network communities. Research on network community
detection has predominantly focused on identifying communities of densely
connected nodes in undirected networks.
In this paper we develop a novel overlapping community detection method that
scales to networks of millions of nodes and edges and advances research along
two dimensions: the connectivity structure of communities, and the use of edge
directedness for community detection. First, we extend traditional definitions
of network communities by building on the observation that nodes can be densely
interlinked in two different ways: In cohesive communities nodes link to each
other, while in 2-mode communities nodes link in a bipartite fashion, where
links predominate between the two partitions rather than inside them. Our
method successfully detects both 2-mode and cohesive communities, which
may also overlap or be hierarchically nested. Second, while most existing
community detection methods treat directed edges as though they were
undirected, our method accounts for edge directions and is able to identify
novel and meaningful community structures in both directed and undirected
networks, using data from social, biological, and ecological domains.
Comment: Published in the proceedings of WSDM '1
Community detection in multiplex networks using locally adaptive random walks
Multiplex networks, a special type of multilayer networks, are increasingly
applied in many domains ranging from social media analytics to biology. A
common task in these applications concerns the detection of community
structures. Many existing algorithms for community detection in multiplexes
attempt to detect communities which are shared by all layers. In this article
we propose a community detection algorithm, LART (Locally Adaptive Random
Transitions), for the detection of communities that are shared by either some
or all the layers in the multiplex. The algorithm is based on a random walk on
the multiplex, and the transition probabilities defining the random walk are
allowed to depend on the local topological similarity between layers at any
given node so as to facilitate the exploration of communities across layers.
Based on this random walk, a node dissimilarity measure is derived and nodes
are clustered based on this distance in a hierarchical fashion. We present
experimental results using networks simulated under various scenarios to
showcase the performance of LART in comparison to related community detection
algorithms.
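The general idea of deriving a node dissimilarity from a random walk can be sketched as follows. This is not LART: it assumes a single-layer, undirected graph with fixed uniform transition probabilities, whereas the abstract describes layer-dependent transitions on a multiplex. The function names, the toy graph, and the choice of Euclidean distance between t-step visit distributions are all assumptions made for illustration.

```python
def t_step_distributions(adj, t):
    """Return (nodes, D) where D[i][j] is the probability that a t-step
    uniform random walk started at nodes[i] ends at nodes[j]."""
    nodes = sorted(adj)
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)

    # Row-stochastic transition matrix: move to a uniformly chosen neighbour.
    P = [[0.0] * n for _ in range(n)]
    for u in nodes:
        for v in adj[u]:
            P[index[u]][index[v]] = 1.0 / len(adj[u])

    # Start from the identity matrix and multiply by P, t times.
    D = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        D = [
            [sum(D[i][k] * P[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
    return nodes, D


def dissimilarity(D, i, j):
    """Euclidean distance between the walk distributions of rows i and j."""
    return sum((a - b) ** 2 for a, b in zip(D[i], D[j])) ** 0.5


# Two triangles joined by the bridge edge 2-3.
toy_adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
nodes, D = t_step_distributions(toy_adj, 3)
same_side = dissimilarity(D, 0, 1)    # both nodes in the first triangle
across = dissimilarity(D, 0, 5)       # nodes in different triangles
```

Nodes in the same triangle end up with a smaller dissimilarity than nodes in different triangles, so a pairwise distance matrix built this way could be handed to any agglomerative clustering routine to obtain a hierarchy.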