Characterizing the community structure of complex networks
Community structure is one of the key properties of complex networks and
plays a crucial role in their topology and function. While an impressive amount
of work has been done on the issue of community detection, very little
attention has been so far devoted to the investigation of communities in real
networks. We present a systematic empirical analysis of the statistical
properties of communities in large information, communication, technological,
biological, and social networks. We find that the mesoscopic organization of
networks of the same category is remarkably similar. This is reflected in
several characteristics of community structure, which can be used as
"fingerprints" of specific network categories. While community size
distributions are always broad, certain categories of networks consist mainly
of tree-like communities, while others have denser modules. Average path
lengths within communities initially grow logarithmically with community size,
but the growth saturates or slows down for communities larger than a
characteristic size. This behaviour is related to the presence of hubs within
communities, whose roles differ across categories. Also the community
embeddedness of nodes, measured in terms of the fraction of links within their
communities, has a characteristic distribution for each category. Our findings
are verified by the use of two fundamentally different community detection
methods.
Comment: 15 pages, 20 figures, 4 tables
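As an illustration of the embeddedness measure described in this abstract (the fraction of a node's links that fall within its own community), here is a minimal sketch. The function name `embeddedness`, the toy edge list, and the community labels are assumptions made for illustration; they are not taken from the paper.

```python
from collections import defaultdict


def embeddedness(edges, community):
    """Return, for each node, the fraction of its links that stay inside its community.

    edges: iterable of (u, v) pairs for an undirected graph (hypothetical input format).
    community: dict mapping each node to a community label.
    """
    internal = defaultdict(int)  # links whose two endpoints share a community
    degree = defaultdict(int)    # all links incident to the node
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[u] += 1
            internal[v] += 1
    return {node: internal[node] / degree[node] for node in degree}


# Toy graph: two triangles joined by the single bridge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
scores = embeddedness(edges, community)
```

In this toy graph, nodes 2 and 3 each have one of their three links leaving the community, so their embeddedness is 2/3, while every other node scores 1.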
SOCIAL NETWORK ANALYSIS OF B2B NETWORKS
The volume of information readily available in the E-Marketplace is massive. Business-to-business (B2B) users, for instance, have extensive chains of business relationships that have generated a large volume of information. Although this raises problems of information overload, the data contains rich and valuable information, such as internal structure and social networks. While other research focuses on discovering properties of B2C, C2C and P2P networks, only limited work exists on B2B, attributable to the high complexity of the B2B structure. This paper presents a statistical analysis of B2B networks that amasses dispersed users' social relationships using a social network analysis technique. The investigation found that B2B networks are small-world, and thus follow a power-law distribution. The analysis also showed that B2B networks hold stable community structures.
Malaysia Council of Trust for the Bumiputer
Algorithmic and Statistical Perspectives on Large-Scale Data Analysis
In recent years, ideas from statistics and scientific computing have begun to
interact in increasingly sophisticated and fruitful ways with ideas from
computer science and the theory of algorithms to aid in the development of
improved worst-case algorithms that are useful for large-scale scientific and
Internet data analysis problems. In this chapter, I will describe two recent
examples---one having to do with selecting good columns or features from a (DNA
Single Nucleotide Polymorphism) data matrix, and the other having to do with
selecting good clusters or communities from a data graph (representing a social
or information network)---that drew on ideas from both areas and that may serve
as a model for exploiting complementary algorithmic and statistical
perspectives in order to solve applied large-scale data analysis problems.
Comment: 33 pages. To appear in Uwe Naumann and Olaf Schenk, editors,
"Combinatorial Scientific Computing," Chapman and Hall/CRC Press, 201
Communities in Networks
We survey some of the concepts, methods, and applications of community
detection, which has become an increasingly important area of network science.
To help ease newcomers into the field, we provide a guide to available
methodology and open problems, and discuss why scientists from diverse
backgrounds are interested in these problems. As a running theme, we emphasize
the connections of community detection to problems in statistical physics and
computational optimization.
Comment: survey/review article on community structure in networks; published
version is available at
http://people.maths.ox.ac.uk/~porterm/papers/comnotices.pd
A Latent Parameter Node-Centric Model for Spatial Networks
Spatial networks, in which nodes and edges are embedded in space, play a
vital role in the study of complex systems. For example, many social networks
attach geo-location information to each user, allowing the study of not only
topological interactions between users, but spatial interactions as well. The
defining property of spatial networks is that edge distances are associated
with a cost, which may subtly influence the topology of the network. However,
the cost function over distance is rarely known, so developing a model of
connections in spatial networks is a difficult task.
In this paper, we introduce a novel model for capturing the interaction
between spatial effects and network structure. Our approach represents a unique
combination of ideas from latent variable statistical models and spatial
network modeling. In contrast to previous work, we view the ability to form
long- or short-distance connections as dependent on the individual nodes
involved. For example, a node's specific surroundings (e.g. network structure
and node density) may make it more likely to form a long-distance link than
other nodes with the same degree. To capture this information, we attach a
latent variable to each node which represents a node's spatial reach. These
variables are inferred from the network structure using a Markov Chain Monte
Carlo algorithm.
We experimentally evaluate our proposed model on four types of
real-world spatial networks (transportation, biological, infrastructure,
and social). We apply our model to the task of link prediction and achieve up
to a 35% improvement over previous approaches in terms of the area under the
ROC curve. Additionally, we show that our model is particularly helpful for
predicting links between nodes with low degrees. In these cases, we see much
larger improvements over previous models.
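The link-prediction results above are reported as area under the ROC curve. As a hedged illustration of what that number measures, the sketch below computes it as the probability that a randomly chosen true link receives a higher score than a randomly chosen non-link, with ties counted as one half. The function name `roc_auc` and the scores are invented for illustration; they are not the paper's model or data.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    scores_pos: predicted scores for node pairs that really are linked.
    scores_neg: predicted scores for node pairs that are not linked.
    """
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0   # true link ranked above non-link
            elif p == q:
                wins += 0.5   # ties count as half a win
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical scores from some link predictor.
linked = [0.9, 0.8, 0.4]
not_linked = [0.7, 0.3, 0.2, 0.1]
auc = roc_auc(linked, not_linked)
```

Here 11 of the 12 (link, non-link) pairs are ranked correctly, so the AUC is 11/12. The quadratic loop is chosen for clarity; a rank-based formulation would scale better on large networks.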
Comparative Evaluation of Community Detection Algorithms: A Topological Approach
Community detection is one of the most active fields in complex networks
analysis, due to its potential value in practical applications. Many works
inspired by different paradigms are devoted to the development of algorithmic
solutions that reveal how a network is structured into such cohesive subgroups.
Comparative studies reported in the literature usually rely on a performance
measure considering the community structure as a partition (Rand Index,
Normalized Mutual Information, etc.). However, this type of comparison neglects
the topological properties of the communities. In this article, we present a
comprehensive comparative study of a representative set of community detection
methods, in which we adopt both types of evaluation. Community-oriented
topological measures are used to qualify the communities and evaluate their
deviation from the reference structure. In order to mimic real-world systems,
we use artificially generated realistic networks. It turns out that the two
approaches are not equivalent: a high performance does not necessarily
correspond to correct topological properties, and vice-versa. They can
therefore be considered as complementary, and we recommend applying both of
them in order to perform a complete and accurate assessment.
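The partition-based measures named in this abstract, the Rand Index and Normalized Mutual Information, can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names and the toy labelings are assumptions, and NMI is normalized here by the arithmetic mean of the two entropies, which is only one of several conventions in use.

```python
from collections import Counter
from itertools import combinations
from math import log


def rand_index(a, b):
    """Fraction of node pairs on which two partitions agree (same/same or different/different)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def nmi(a, b):
    """Normalized mutual information, normalized by the mean entropy (an assumed convention)."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    mutual = sum(
        c / n * log(c * n / (count_a[x] * count_b[y])) for (x, y), c in joint.items()
    )
    entropy_a = -sum(c / n * log(c / n) for c in count_a.values())
    entropy_b = -sum(c / n * log(c / n) for c in count_b.values())
    if entropy_a == 0 and entropy_b == 0:
        return 1.0  # both partitions put every node in one community
    return 2 * mutual / (entropy_a + entropy_b)


# Toy labelings: community label per node, in node order.
reference = [0, 0, 0, 1, 1, 1]
relabeled = [1, 1, 1, 0, 0, 0]   # same partition, different label names
shifted = [0, 0, 1, 1, 1, 1]     # node 2 assigned to the wrong community

ri_same = rand_index(reference, relabeled)
ri_shifted = rand_index(reference, shifted)
nmi_same = nmi(reference, relabeled)
nmi_shifted = nmi(reference, shifted)
```

Both measures depend only on how nodes are grouped, not on the label names or on the links inside each group, which is why the abstract argues that they should be complemented by topological measures.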