Link communities reveal multiscale complexity in networks
Networks have become a key approach to understanding systems of interacting
objects, unifying the study of diverse phenomena including biological organisms
and human society. One crucial step when studying the structure and dynamics of
networks is to identify communities: groups of related nodes that correspond to
functional subunits such as protein complexes or social spheres. Communities in
networks often overlap such that nodes simultaneously belong to several groups.
Meanwhile, many networks are known to possess hierarchical organization, where
communities are recursively grouped into a hierarchical structure. However, the
fact that many real networks have communities with pervasive overlap, where
each and every node belongs to more than one group, has the consequence that a
global hierarchy of nodes cannot capture the relationships between overlapping
groups. Here we reinvent communities as groups of links rather than nodes and
show that this unorthodox approach successfully reconciles the antagonistic
organizing principles of overlapping communities and hierarchy. In contrast to
the existing literature, which has entirely focused on grouping nodes, link
communities naturally incorporate overlap while revealing hierarchical
organization. We find relevant link communities in many networks, including
major biological networks such as protein-protein interaction and metabolic
networks, and show that a large social network contains hierarchically
organized community structures spanning inner-city to regional scales while
maintaining pervasive overlap. Our results imply that link communities are
fundamental building blocks that reveal overlap and hierarchical organization
in networks to be two aspects of the same phenomenon.
Comment: Main text and supplementary information
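The abstract above does not say how links are grouped. As a minimal sketch, assuming a simple undirected graph with no self-loops or duplicate edges, one commonly used ingredient is a Jaccard similarity between two links that share a node, computed over the inclusive neighborhoods of their unshared endpoints. The function names `inclusive_neighbors` and `link_similarity` are hypothetical, and the clustering step that would turn these similarities into a dendrogram of link communities is omitted.

```python
from itertools import combinations

def inclusive_neighbors(edges):
    """Map each node to the set containing itself and its neighbors."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, {u}).add(v)
        nbrs.setdefault(v, {v}).add(u)
    return nbrs

def link_similarity(edges):
    """Jaccard similarity for every pair of links sharing exactly one node.

    Assumption (not stated in the abstract): two links (i, k) and (j, k)
    are compared through the inclusive neighborhoods of i and j.
    """
    nbrs = inclusive_neighbors(edges)
    sims = {}
    for e1, e2 in combinations(edges, 2):
        shared = set(e1) & set(e2)
        if len(shared) != 1:
            continue  # links that do not touch are not compared
        (i,) = set(e1) - shared
        (j,) = set(e2) - shared
        sims[(e1, e2)] = len(nbrs[i] & nbrs[j]) / len(nbrs[i] | nbrs[j])
    return sims

# Toy graph: a triangle a-b-c with a pendant node d attached to c.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
sims = link_similarity(edges)
```

In this toy graph the triangle links score higher with one another than with the pendant link, so the node `c` would end up in two link groups, which is the kind of overlap the abstract describes.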
Identifying Overlapping and Hierarchical Thematic Structures in Networks of Scholarly Papers: A Comparison of Three Approaches
We implemented three recently proposed approaches to the identification of
overlapping and hierarchical substructures in graphs and applied the
corresponding algorithms to a network of 492 information-science papers coupled
via their cited sources. The thematic substructures obtained and overlaps
produced by the three hierarchical cluster algorithms were compared to a
content-based categorisation, which we based on the interpretation of titles
and keywords. We defined sets of papers dealing with three topics located on
different levels of aggregation: h-index, webometrics, and bibliometrics. We
identified these topics with branches in the dendrograms produced by the three
cluster algorithms and compared the overlapping topics they detected with one
another and with the three pre-defined paper sets. We discuss the advantages
and drawbacks of applying the three approaches to paper networks in research
fields.
Comment: 18 pages, 9 figures
Clustering Memes in Social Media
The increasing pervasiveness of social media creates new opportunities to
study human social behavior, while challenging our capability to analyze their
massive data streams. One of the emerging tasks is to distinguish between
different kinds of activities, for example engineered misinformation campaigns
versus spontaneous communication. Such detection problems require a formal
definition of meme, or unit of information that can spread from person to
person through the social network. Once a meme is identified, supervised
learning methods can be applied to classify different types of communication.
The appropriate granularity of a meme, however, is hardly captured from
existing entities such as tags and keywords. Here we present a framework for
the novel task of detecting memes by clustering messages from large streams of
social data. We evaluate various similarity measures that leverage content,
metadata, network features, and their combinations. We also explore the idea of
pre-clustering on the basis of existing entities. A systematic evaluation is
carried out using a manually curated dataset as ground truth. Our analysis
shows that pre-clustering and a combination of heterogeneous features yield the
best trade-off between number of clusters and their quality, demonstrating that
a simple combination based on pairwise maximization of similarity is as
effective as a non-trivial optimization of parameters. Our approach is fully
automatic, unsupervised, and scalable for real-time detection of memes in
streaming data.
Comment: Proceedings of the 2013 IEEE/ACM International Conference on Advances
in Social Networks Analysis and Mining (ASONAM'13), 2013
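One possible reading of the "pairwise maximization of similarity" mentioned in this abstract is that, for each pair of messages, the combined score is the largest of the per-feature similarities. The sketch below illustrates that reading only. The feature names (`words`, `hashtags`, `mentions`) and the use of Jaccard similarity for every feature are assumptions made for illustration and are not taken from the paper.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def combined_similarity(m1, m2, features=("words", "hashtags", "mentions")):
    """Combine per-feature similarities by taking their maximum.

    The feature keys are hypothetical stand-ins for content, metadata,
    and network features.
    """
    return max(jaccard(m1[f], m2[f]) for f in features)

msg_a = {"words": ["vote", "now"], "hashtags": ["election"], "mentions": []}
msg_b = {"words": ["go", "vote"], "hashtags": ["election"], "mentions": []}
score = combined_similarity(msg_a, msg_b)
```

Taking the maximum needs no tuned weights, which matches the abstract's contrast with a "non-trivial optimization of parameters".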
Element-centric clustering comparison unifies overlaps and hierarchy
Clustering is one of the most universal approaches for understanding complex
data. A pivotal aspect of clustering analysis is quantitatively comparing
clusterings; clustering comparison is the basis for many tasks such as
clustering evaluation, consensus clustering, and tracking the temporal
evolution of clusters. In particular, the extrinsic evaluation of clustering
methods requires comparing the uncovered clusterings to planted clusterings or
known metadata. Yet, as we demonstrate, existing clustering comparison measures
have critical biases which undermine their usefulness, and no measure
accommodates both overlapping and hierarchical clusterings. Here we unify the
comparison of disjoint, overlapping, and hierarchically structured clusterings
by proposing a new element-centric framework: elements are compared based on
the relationships induced by the cluster structure, as opposed to the
traditional cluster-centric philosophy. We demonstrate that, in contrast to
standard clustering similarity measures, our framework does not suffer from
critical biases and naturally provides unique insights into how the clusterings
differ. We illustrate the strengths of our framework by revealing new insights
into the organization of clusters in two applications: the improved
classification of schizophrenia based on the overlapping and hierarchical
community structure of fMRI brain networks, and the disentanglement of various
social homophily factors in Facebook social networks. The universality of
clustering suggests far-reaching impact of our framework throughout all areas
of science.
Modularity-Based Clustering for Network-Constrained Trajectories
We present a novel clustering approach for moving object trajectories that
are constrained by an underlying road network. The approach builds a similarity
graph based on these trajectories, then uses modularity-optimization hierarchical
graph clustering to regroup trajectories with similar profiles. Our
experimental study shows the superiority of the proposed approach over classic
hierarchical clustering and gives a brief insight into the visualization of the
clustering results.
Comment: 20th European Symposium on Artificial Neural Networks, Computational
Intelligence and Machine Learning (ESANN 2012), Bruges, Belgium (2012)
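The quantity that a modularity-optimization method tries to increase can be sketched as follows. This is the standard weighted modularity of a given partition, not the authors' clustering procedure. The `(u, v, weight)` edge format, the `community_of` mapping, and the assumption that the similarity graph has no self-loops are all choices made for this illustration.

```python
def modularity(weighted_edges, community_of):
    """Weighted modularity Q of a partition of an undirected graph.

    Q = sum over communities c of  W_in(c) / m  -  (deg(c) / (2 m)) ** 2,
    where m is the total edge weight, W_in(c) the weight of edges inside c,
    and deg(c) the summed weighted degree of the nodes in c.
    """
    m = sum(w for _, _, w in weighted_edges)
    internal, degree = {}, {}
    for u, v, w in weighted_edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0.0) + w
        degree[cv] = degree.get(cv, 0.0) + w
        if cu == cv:
            internal[cu] = internal.get(cu, 0.0) + w
    return sum(internal.get(c, 0.0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())

# Two triangles of similar trajectories joined by one weak link (unit weights).
edges = [(1, 2, 1.0), (2, 3, 1.0), (1, 3, 1.0),
         (4, 5, 1.0), (5, 6, 1.0), (4, 6, 1.0),
         (3, 4, 1.0)]
split = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
merged = {n: "A" for n in split}
q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
```

Splitting the toy graph into its two triangles gives a higher Q than keeping everything in one group, which is the signal a greedy or hierarchical optimizer would follow.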
Analyzing overlapping communities in networks using link communities
One way to analyze the structure of a network is to identify its communities, groups of related nodes that are more likely to connect to one another than to nodes outside the community. Commonly used algorithms for obtaining a network’s communities rely on clustering of the network’s nodes into a community structure that maximizes an appropriate objective function. However, defining communities as a partition of a network’s nodes, and thus stipulating that each node belongs to exactly one community, precludes the detection of overlapping communities that may exist in the network. Here we show that by defining communities as a partition of a network’s links, and thus allowing individual nodes to appear in multiple communities, we can quantify the extent to which each pair of communities in a network overlaps. We define two measures of community overlap and apply them to the community structure of five networks from different disciplines. In every case, we note that there are many pairs of communities that share a significant number of nodes. This highlights a major advantage of using link partitioning, as opposed to node partitioning, when seeking to understand the community structure of a network. We also observe significant differences between overlap statistics in real-world networks as compared with randomly generated null models. By virtue of their contexts, we expect many naturally occurring networks to have very densely overlapping communities. Therefore, it is necessary to develop an understanding of how to use community overlap calculations to draw conclusions about the underlying structure of a network.
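The abstract does not spell out its two overlap measures. As a hedged illustration only, the sketch below derives each community's node set from a partition of links and reports two plausible quantities for every pair of communities: the number of shared nodes and the Jaccard ratio of the node sets. Both measures and the names `node_sets` and `pairwise_overlap` are assumptions, not the definitions used in the paper.

```python
from itertools import combinations

def node_sets(link_partition):
    """Collect the nodes touched by each community's links."""
    return {label: {n for edge in links for n in edge}
            for label, links in link_partition.items()}

def pairwise_overlap(link_partition):
    """Return {(a, b): (shared_node_count, jaccard)} for each community pair."""
    nodes = node_sets(link_partition)
    result = {}
    for a, b in combinations(sorted(nodes), 2):
        shared = nodes[a] & nodes[b]
        result[(a, b)] = (len(shared), len(shared) / len(nodes[a] | nodes[b]))
    return result

# Each link belongs to exactly one community; node 3 belongs to both.
partition = {
    "x": [(1, 2), (2, 3), (1, 3)],
    "y": [(3, 4), (4, 5), (3, 5)],
}
overlap = pairwise_overlap(partition)
```

Because the partition is over links, node 3 appears in both communities without any link being assigned twice, which is what makes a pairwise overlap statistic well defined.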