Detecting Cohesive and 2-mode Communities in Directed and Undirected Networks
Networks are a general language for representing relational information among
objects. An effective way to model, reason about, and summarize networks is to
discover sets of nodes with common connectivity patterns. Such sets are
commonly referred to as network communities. Research on network community
detection has predominantly focused on identifying communities of densely
connected nodes in undirected networks.
In this paper we develop a novel overlapping community detection method that
scales to networks of millions of nodes and edges and advances research along
two dimensions: the connectivity structure of communities, and the use of edge
directedness for community detection. First, we extend traditional definitions
of network communities by building on the observation that nodes can be densely
interlinked in two different ways: In cohesive communities nodes link to each
other, while in 2-mode communities nodes link in a bipartite fashion, where
links predominate between the two partitions rather than inside them. Our
method successfully detects both 2-mode and cohesive communities, which
may also overlap or be hierarchically nested. Second, while most existing
community detection methods treat directed edges as though they were
undirected, our method accounts for edge directions and is able to identify
novel and meaningful community structures in both directed and undirected
networks, using data from social, biological, and ecological domains.
Comment: Published in the proceedings of WSDM '1
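The distinction between cohesive and 2-mode connectivity described in this abstract can be illustrated by counting where the edges of a candidate node set fall. The following is a minimal sketch, not the paper's detection method: the function name `edge_counts`, the toy edge list, and the assumption that the two parts are disjoint are all hypothetical choices made for illustration.

```python
def edge_counts(edges, part_a, part_b):
    """Count directed edges inside part_a, inside part_b, and between the parts.

    Assumes part_a and part_b are disjoint (an illustrative simplification).
    """
    a, b = set(part_a), set(part_b)
    counts = {"within_a": 0, "within_b": 0, "between": 0}
    for u, v in edges:
        if u in a and v in a:
            counts["within_a"] += 1
        elif u in b and v in b:
            counts["within_b"] += 1
        elif (u in a and v in b) or (u in b and v in a):
            counts["between"] += 1
    return counts


# Toy directed graph: nodes 0 and 1 both point to nodes 2 and 3,
# plus one edge inside the first part.
edges = [(0, 2), (0, 3), (1, 2), (1, 3), (0, 1)]
counts = edge_counts(edges, part_a=[0, 1], part_b=[2, 3])
```

In this toy graph most edges run between the two parts, which is the pattern the abstract calls 2-mode; a cohesive community would instead show most of its edges within a single part.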
A semi-supervised approach to visualizing and manipulating overlapping communities
When evaluating a network topology, data structures occasionally cannot be segmented into absolute, heterogeneous groups. The dataset may form a spectrum that rules out such hard clustering and instead calls for fuzzy or overlapping communities or cliques. Even then, when group members can belong to multiple cliques, overlapping clustering algorithms leave an ever-present layer of doubt, noise, and outliers. An expert user can correct these imperfections, either to enhance the clustering algorithm or to preserve their own mental model of the communities. Presented is a visualization that models overlapping community membership and provides an interactive interface for quickly and efficiently sorting through large network topologies while preserving the user's mental model of the structure. © 2013 IEEE
An efficient and principled method for detecting communities in networks
A fundamental problem in the analysis of network data is the detection of
network communities, groups of densely interconnected nodes, which may be
overlapping or disjoint. Here we describe a method for finding overlapping
communities based on a principled statistical approach using generative network
models. We show how the method can be implemented using a fast, closed-form
expectation-maximization algorithm that allows us to analyze networks of
millions of nodes in reasonable running times. We test the method both on
real-world networks and on synthetic benchmarks and find that it gives results
competitive with previous methods. We also show that the same approach can be
used to extract nonoverlapping community divisions via a relaxation method, and
demonstrate that the algorithm is competitively fast and accurate for the
nonoverlapping problem.
Comment: 14 pages, 5 figures, 1 tabl
Searching for network modules
When analyzing complex networks a key target is to uncover their modular
structure, which means searching for a family of modules, namely node subsets
each spanning a subnetwork more densely connected than the average. This work
proposes a novel type of objective function for graph clustering, in the form
of a multilinear polynomial whose coefficients are determined by network
topology. It may be thought of as a potential function, to be maximized, taking
its values on fuzzy clusterings or families of fuzzy subsets of nodes over
which every node distributes a unit membership. When suitably parametrized,
this potential is shown to attain its maximum when every node concentrates
all of its unit membership on some module. The output is thus a partition, while
the original discrete optimization problem is turned into a continuous version,
which allows alternative search strategies to be conceived. The instance of the problem
being a pseudo-Boolean function assigning real-valued cluster scores to node
subsets, modularity maximization is employed to exemplify a so-called quadratic
form, in that the scores of singletons and pairs also fully determine the
scores of larger clusters, while the resulting multilinear polynomial potential
function has degree 2. After considering further quadratic instances, different
from modularity and obtained by interpreting network topology in alternative
manners, a greedy local-search strategy for the continuous framework is
analytically compared with an existing greedy agglomerative procedure for the
discrete case. Overlapping is finally discussed in terms of multiple runs, i.e.
several local searches with different initializations.
Comment: 10 page
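For orientation, the modularity score that this abstract uses as its quadratic example can be computed for a hard partition as sketched below. This is the standard Newman-Girvan modularity, not the multilinear potential the paper proposes; the function name `modularity`, the toy graph, and the assumption of an undirected, unweighted edge list without self-loops or duplicate edges are illustrative.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity of a hard partition.

    edges: list of (u, v) pairs of an undirected, unweighted graph
           (assumed to contain no self-loops or duplicates).
    community_of: dict mapping each node to its community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # total degree of the nodes in community c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    # Q = sum over communities of (internal edge fraction - expected fraction)
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_block = {n: "all" for n in range(6)}
q_split = modularity(edges, two_triangles)
q_single = modularity(edges, one_block)
```

Each community's contribution depends only on node pairs, which is the sense in which the abstract calls modularity a quadratic instance.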