10,379 research outputs found

    Detecting Cohesive and 2-mode Communities in Directed and Undirected Networks

    Full text link
    Networks are a general language for representing relational information among objects. An effective way to model, reason about, and summarize networks, is to discover sets of nodes with common connectivity patterns. Such sets are commonly referred to as network communities. Research on network community detection has predominantly focused on identifying communities of densely connected nodes in undirected networks. In this paper we develop a novel overlapping community detection method that scales to networks of millions of nodes and edges and advances research along two dimensions: the connectivity structure of communities, and the use of edge directedness for community detection. First, we extend traditional definitions of network communities by building on the observation that nodes can be densely interlinked in two different ways: In cohesive communities nodes link to each other, while in 2-mode communities nodes link in a bipartite fashion, where links predominate between the two partitions rather than inside them. Our method successfully detects both 2-mode as well as cohesive communities, that may also overlap or be hierarchically nested. Second, while most existing community detection methods treat directed edges as though they were undirected, our method accounts for edge directions and is able to identify novel and meaningful community structures in both directed and undirected networks, using data from social, biological, and ecological domains.Comment: Published in the proceedings of WSDM '1

    A semi-supervised approach to visualizing and manipulating overlapping communities

    Get PDF
    When evaluating a network topology, occasionally data structures cannot be segmented into absolute, heterogeneous groups. There may be a spectrum to the dataset that does not allow for this hard clustering approach and may need to segment using fuzzy/overlapping communities or cliques. Even to this degree, when group members can belong to multiple cliques, there leaves an ever present layer of doubt, noise, and outliers caused by the overlapping clustering algorithms. These imperfections can either be corrected by an expert user to enhance the clustering algorithm or to preserve their own mental models of the communities. Presented is a visualization that models overlapping community membership and provides an interactive interface to facilitate a quick and efficient means of both sorting through large network topologies and preserving the user's mental model of the structure. © 2013 IEEE

    An efficient and principled method for detecting communities in networks

    Full text link
    A fundamental problem in the analysis of network data is the detection of network communities, groups of densely interconnected nodes, which may be overlapping or disjoint. Here we describe a method for finding overlapping communities based on a principled statistical approach using generative network models. We show how the method can be implemented using a fast, closed-form expectation-maximization algorithm that allows us to analyze networks of millions of nodes in reasonable running times. We test the method both on real-world networks and on synthetic benchmarks and find that it gives results competitive with previous methods. We also show that the same approach can be used to extract nonoverlapping community divisions via a relaxation method, and demonstrate that the algorithm is competitively fast and accurate for the nonoverlapping problem.Comment: 14 pages, 5 figures, 1 tabl

    Searching for network modules

    Full text link
    When analyzing complex networks a key target is to uncover their modular structure, which means searching for a family of modules, namely node subsets spanning each a subnetwork more densely connected than the average. This work proposes a novel type of objective function for graph clustering, in the form of a multilinear polynomial whose coefficients are determined by network topology. It may be thought of as a potential function, to be maximized, taking its values on fuzzy clusterings or families of fuzzy subsets of nodes over which every node distributes a unit membership. When suitably parametrized, this potential is shown to attain its maximum when every node concentrates its all unit membership on some module. The output thus is a partition, while the original discrete optimization problem is turned into a continuous version allowing to conceive alternative search strategies. The instance of the problem being a pseudo-Boolean function assigning real-valued cluster scores to node subsets, modularity maximization is employed to exemplify a so-called quadratic form, in that the scores of singletons and pairs also fully determine the scores of larger clusters, while the resulting multilinear polynomial potential function has degree 2. After considering further quadratic instances, different from modularity and obtained by interpreting network topology in alternative manners, a greedy local-search strategy for the continuous framework is analytically compared with an existing greedy agglomerative procedure for the discrete case. Overlapping is finally discussed in terms of multiple runs, i.e. several local searches with different initializations.Comment: 10 page
    corecore