Community Detection in Networks with Node Attributes
Community detection algorithms are fundamental tools that allow us to uncover
organizational principles in networks. When detecting communities, there are
two possible sources of information one can use: the network structure, and the
features and attributes of nodes. Even though communities form around nodes
that have common edges and common attributes, typically, algorithms have only
focused on one of these two data modalities: community detection algorithms
traditionally focus only on the network structure, while clustering algorithms
mostly consider only node attributes. In this paper, we develop Communities
from Edge Structure and Node Attributes (CESNA), an accurate and scalable
algorithm for detecting overlapping communities in networks with node
attributes. CESNA statistically models the interaction between the network
structure and the node attributes, which leads to more accurate community
detection as well as improved robustness in the presence of noise in the
network structure. CESNA has a linear runtime in the network size and is able
to process networks an order of magnitude larger than comparable approaches.
Last, CESNA also helps with the interpretation of detected communities by
finding relevant node attributes for each community.
Comment: Published in the proceedings of IEEE ICDM '1
A framework for community detection in heterogeneous multi-relational networks
There has been a surge of interest in community detection in homogeneous
single-relational networks which contain only one type of nodes and edges.
However, many real-world systems are naturally described as heterogeneous
multi-relational networks which contain multiple types of nodes and edges. In
this paper, we propose a new method for detecting communities in such networks.
Our method is based on optimizing the composite modularity, which is a new
modularity proposed for evaluating partitions of a heterogeneous
multi-relational network into communities. Our method is parameter-free,
scalable, and suitable for various networks with general structure. We
demonstrate that it outperforms the state-of-the-art techniques in detecting
pre-planted communities in synthetic networks. Applied to a real-world Digg
network, it successfully detects meaningful communities.
Comment: 27 pages, 10 figures
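The composite modularity itself is not defined in this abstract. As a point of reference only, the following minimal sketch computes the standard single-relational (Newman-Girvan) modularity that such measures generalize. The function name `modularity` and the toy graph are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def modularity(edges, community):
    """Standard Newman-Girvan modularity Q of an undirected, unweighted graph.

    NOTE: this is the classic single-relational modularity, shown for
    orientation only; it is NOT the composite modularity of the paper.
    edges: list of (u, v) pairs; community: dict mapping node -> label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    degree_sum = defaultdict(int)  # total degree of the nodes in each community
    internal = defaultdict(int)    # edges with both endpoints in the community
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    # Q = sum over communities of (fraction of internal edges
    #     minus the fraction expected from the degrees alone)
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph (hypothetical): two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q_split = modularity(edges, split)
q_single = modularity(edges, {node: "a" for node in range(6)})
```

On this toy graph, splitting the two triangles scores higher than placing every node in one community, which is the behaviour a modularity-optimizing method exploits.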
Communities, Knowledge Creation, and Information Diffusion
In this paper, we examine how patterns of scientific collaboration contribute
to knowledge creation. Recent studies have shown that scientists can benefit
from their position within collaborative networks by being able to receive more
information of better quality in a timely fashion, and by presiding over
communication between collaborators. Here we focus on the tendency of
scientists to cluster into tightly-knit communities, and discuss the
implications of this tendency for scientific performance. We begin by reviewing
a new method for finding communities, and we then assess its benefits in terms
of computation time and accuracy. While communities often serve as a taxonomic
scheme to map knowledge domains, they also affect how successfully scientists
engage in the creation of new knowledge. By drawing on the longstanding debate
on the relative benefits of social cohesion and brokerage, we discuss the
conditions that facilitate collaborations among scientists within or across
communities. We show that successful scientific production occurs within
communities when scientists have cohesive collaborations with others from the
same knowledge domain, and across communities when scientists intermediate
among otherwise disconnected collaborators from different knowledge domains. We
also discuss the implications of communities for information diffusion, and
show how traditional epidemiological approaches need to be refined to take
knowledge heterogeneity into account and preserve the system's ability to
promote creative processes of novel recombinations of ideas.
Community structure and patterns of scientific collaboration in Business and Management
This is the author's accepted version of this article deposited at arXiv (arXiv:1006.1788v2 [physics.soc-ph]) and subsequently published in Scientometrics, October 2011, Volume 89, Issue 1, pp. 381-396. The final publication is available at link.springer.com: http://link.springer.com/article/10.1007%2Fs11192-011-0439-1
Author's note: 17 pages. To appear in special edition of Scientometrics. The abstract in the arXiv metadata is a shorter version of the abstract in the actual paper (both in journal and arXiv full paper)
Statistically validated networks in bipartite complex systems
Many complex systems present an intrinsic bipartite nature and are often
described and modeled in terms of networks [1-5]. Examples include movies and
actors [1, 2, 4], authors and scientific papers [6-9], email accounts and
emails [10], plants and animals that pollinate them [11, 12]. Bipartite
networks are often very heterogeneous in the number of relationships that the
elements of one set establish with the elements of the other set. When one
constructs a projected network with nodes from only one set, the system
heterogeneity makes it very difficult to identify preferential links between
the elements. Here we introduce an unsupervised method to statistically
validate each link of the projected network against a null hypothesis taking
into account the heterogeneity of the system. We apply our method to three
different systems, namely the set of clusters of orthologous genes (COG) in
completely sequenced genomes [13, 14], a set of daily returns of 500 US
financial stocks, and the set of world movies of the IMDb database [15]. In all
these systems, which differ in both size and level of heterogeneity, we find that
our method is able to detect network structures which are informative about the
system and are not simply an expression of its heterogeneity. Specifically, our
method (i) identifies the preferential relationships between the elements, (ii)
naturally highlights the clustered structure of investigated systems, and (iii)
allows links to be classified according to the type of statistically validated
relationships between the connected nodes.
Comment: Main text: 13 pages, 3 figures, and 1 Table. Supplementary
information: 15 pages, 3 figures, and 2 Tables
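The abstract does not state which null hypothesis is used. The sketch below is one way such link validation could look, assuming a hypergeometric null for the number of shared neighbours and a Bonferroni correction for multiple testing; both choices, and the names `cooccurrence_p_value` and `validated_links`, are assumptions made for illustration.

```python
from math import comb


def cooccurrence_p_value(n_total, n_a, n_b, n_ab):
    """P(X >= n_ab) for X ~ Hypergeometric(n_total, n_a, n_b).

    Assumed null model: an element linked to n_a of the n_total items on the
    other side shares n_ab of them with an element linked to n_b items purely
    by chance, given both degrees.
    """
    upper = min(n_a, n_b)
    tail = sum(comb(n_a, x) * comb(n_total - n_a, n_b - x)
               for x in range(n_ab, upper + 1))
    return tail / comb(n_total, n_b)


def validated_links(neighbors, n_total, alpha=0.01):
    """Keep projected links whose p-value beats a Bonferroni threshold.

    neighbors: dict mapping each element of one set to the set of elements
    it is linked to on the other side of the bipartite system.
    """
    nodes = sorted(neighbors)
    pairs = [(a, b) for i, a in enumerate(nodes) for b in nodes[i + 1:]]
    if not pairs:
        return []
    threshold = alpha / len(pairs)  # Bonferroni correction (assumption)
    kept = []
    for a, b in pairs:
        shared = len(neighbors[a] & neighbors[b])
        if shared == 0:
            continue  # no link in the projected network
        p = cooccurrence_p_value(n_total, len(neighbors[a]),
                                 len(neighbors[b]), shared)
        if p < threshold:
            kept.append((a, b, p))
    return kept


# Hypothetical toy data: three elements linked to items numbered 0..99.
neighbors = {
    "x": set(range(10)),     # 10 items
    "y": set(range(10)),     # the same 10 items as "x"
    "z": set(range(5, 55)),  # 50 items, 5 of them shared with "x" and "y"
}
links = validated_links(neighbors, n_total=100)
```

Because the test conditions on both degrees, the high-degree element "z" does not gain validated links merely by overlapping with many others, which is the heterogeneity effect the abstract describes.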