7,229 research outputs found

    Community Detection in Networks with Node Attributes

    Full text link
    Community detection algorithms are fundamental tools that allow us to uncover organizational principles in networks. When detecting communities, there are two possible sources of information one can use: the network structure, and the features and attributes of nodes. Even though communities form around nodes that have common edges and common attributes, typically, algorithms have only focused on one of these two data modalities: community detection algorithms traditionally focus only on the network structure, while clustering algorithms mostly consider only node attributes. In this paper, we develop Communities from Edge Structure and Node Attributes (CESNA), an accurate and scalable algorithm for detecting overlapping communities in networks with node attributes. CESNA statistically models the interaction between the network structure and the node attributes, which leads to more accurate community detection as well as improved robustness in the presence of noise in the network structure. CESNA has a linear runtime in the network size and is able to process networks an order of magnitude larger than comparable approaches. Last, CESNA also helps with the interpretation of detected communities by finding relevant node attributes for each community.Comment: Published in the proceedings of IEEE ICDM '1

    A framework for community detection in heterogeneous multi-relational networks

    Full text link
    There has been a surge of interest in community detection in homogeneous single-relational networks which contain only one type of nodes and edges. However, many real-world systems are naturally described as heterogeneous multi-relational networks which contain multiple types of nodes and edges. In this paper, we propose a new method for detecting communities in such networks. Our method is based on optimizing the composite modularity, which is a new modularity proposed for evaluating partitions of a heterogeneous multi-relational network into communities. Our method is parameter-free, scalable, and suitable for various networks with general structure. We demonstrate that it outperforms the state-of-the-art techniques in detecting pre-planted communities in synthetic networks. Applied to a real-world Digg network, it successfully detects meaningful communities.Comment: 27 pages, 10 figure

    Communities, Knowledge Creation, and Information Diffusion

    Get PDF
    In this paper, we examine how patterns of scientific collaboration contribute to knowledge creation. Recent studies have shown that scientists can benefit from their position within collaborative networks by being able to receive more information of better quality in a timely fashion, and by presiding over communication between collaborators. Here we focus on the tendency of scientists to cluster into tightly-knit communities, and discuss the implications of this tendency for scientific performance. We begin by reviewing a new method for finding communities, and we then assess its benefits in terms of computation time and accuracy. While communities often serve as a taxonomic scheme to map knowledge domains, they also affect how successfully scientists engage in the creation of new knowledge. By drawing on the longstanding debate on the relative benefits of social cohesion and brokerage, we discuss the conditions that facilitate collaborations among scientists within or across communities. We show that successful scientific production occurs within communities when scientists have cohesive collaborations with others from the same knowledge domain, and across communities when scientists intermediate among otherwise disconnected collaborators from different knowledge domains. We also discuss the implications of communities for information diffusion, and show how traditional epidemiological approaches need to be refined to take knowledge heterogeneity into account and preserve the system's ability to promote creative processes of novel recombinations of idea

    Community structure and patterns of scientific collaboration in Business and Management

    Get PDF
    This is the author's accepted version of this article deposited at arXiv (arXiv:1006.1788v2 [physics.soc-ph]) and subsequently published in Scientometrics October 2011, Volume 89, Issue 1, pp 381-396. The final publication is available at link.springer.com http://link.springer.com/article/10.1007%2Fs11192-011-0439-1Author's note: 17 pages. To appear in special edition of Scientometrics. Abstract on arXiv meta-data a shorter version of abstract on actual paper (both in journal and arXiv full pape

    Statistically validated networks in bipartite complex systems

    Get PDF
    Many complex systems present an intrinsic bipartite nature and are often described and modeled in terms of networks [1-5]. Examples include movies and actors [1, 2, 4], authors and scientific papers [6-9], email accounts and emails [10], plants and animals that pollinate them [11, 12]. Bipartite networks are often very heterogeneous in the number of relationships that the elements of one set establish with the elements of the other set. When one constructs a projected network with nodes from only one set, the system heterogeneity makes it very difficult to identify preferential links between the elements. Here we introduce an unsupervised method to statistically validate each link of the projected network against a null hypothesis taking into account the heterogeneity of the system. We apply our method to three different systems, namely the set of clusters of orthologous genes (COG) in completely sequenced genomes [13, 14], a set of daily returns of 500 US financial stocks, and the set of world movies of the IMDb database [15]. In all these systems, both different in size and level of heterogeneity, we find that our method is able to detect network structures which are informative about the system and are not simply expression of its heterogeneity. Specifically, our method (i) identifies the preferential relationships between the elements, (ii) naturally highlights the clustered structure of investigated systems, and (iii) allows to classify links according to the type of statistically validated relationships between the connected nodes.Comment: Main text: 13 pages, 3 figures, and 1 Table. Supplementary information: 15 pages, 3 figures, and 2 Table
    • …
    corecore