119 research outputs found
Emergence of scale-free close-knit friendship structure in online social networks
Although the structural properties of online social networks have attracted
much attention, the properties of the close-knit friendship structures remain
an important open question. Here, we mainly focus on how these mesoscale structures
are affected by the local and global structural properties. Analyzing the data
of four large-scale online social networks reveals several common structural
properties. We find that not only do the local structures given by the
indegree, outdegree, and reciprocal degree distributions follow a similar
scaling behavior, but the mesoscale structures represented by the distributions of
close-knit friendship structures also exhibit a similar scaling law. The degree
correlation is very weak over a wide range of the degrees. We propose a simple
directed network model that captures the observed properties. The model
incorporates two mechanisms: reciprocation and preferential attachment. Through
rate equation analysis of our model, the local-scale and mesoscale structural
properties are derived. At the local scale, the same scaling behavior of
indegree and outdegree distributions stems from indegree and outdegree of nodes
both growing as the same function of the introduction time, and the reciprocal
degree distribution also shows the same power-law due to the linear
relationship between the reciprocal degree and in/outdegree of nodes. At the
mesoscale, the distributions of four closed triples representing close-knit
friendship structures are found to exhibit identical power-laws, a behavior
attributed to the negligible degree correlations. Intriguingly, all the
power-law exponents of the distributions at the local scale and mesoscale
depend only on one global parameter -- the mean in/outdegree, while both the
mean in/outdegree and the reciprocity together determine the ratio of the
reciprocal degree of a node to its in/outdegree.
Comment: 48 pages, 34 figures
Effect of correlations on network controllability
A dynamical system is controllable if by imposing appropriate external
signals on a subset of its nodes, it can be driven from any initial state to
any desired state in finite time. Here we study the impact of various network
characteristics on the minimal number of driver nodes required to control a
network. We find that clustering and modularity have no discernible impact, but
the symmetries of the underlying matching problem can produce linear, quadratic
or no dependence on degree correlation coefficients, depending on the nature of
the underlying correlations. The results are supported by numerical simulations
and help narrow the observed gap between the predicted and the observed number
of driver nodes in real networks.
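The "underlying matching problem" mentioned in this abstract is commonly formulated as follows in structural controllability: the minimal number of driver nodes equals the number of nodes left unmatched by a maximum matching of the directed network, with a minimum of one. The following is a minimal sketch of that formulation, not the authors' code. The function names `max_matching_size` and `min_driver_nodes` are hypothetical, nodes are assumed to be labelled 0..n-1, and a simple augmenting-path search is used for clarity rather than speed.

```python
def max_matching_size(n, edges):
    """Size of a maximum matching of a directed graph on nodes 0..n-1.

    Each node is used once as a source and once as a target, so the
    problem is a bipartite matching between out-copies and in-copies.
    """
    successors = [[] for _ in range(n)]
    for u, v in edges:
        successors[u].append(v)

    # matched_source[v] is the source currently matched to target v, or -1.
    matched_source = [-1] * n

    def try_augment(u, seen):
        # Depth-first search for an augmenting path starting at source u.
        for v in successors[u]:
            if v in seen:
                continue
            seen.add(v)
            if matched_source[v] == -1 or try_augment(matched_source[v], seen):
                matched_source[v] = u
                return True
        return False

    return sum(1 for u in range(n) if try_augment(u, set()))


def min_driver_nodes(n, edges):
    """Unmatched nodes must be driven directly; at least one driver is needed."""
    return max(n - max_matching_size(n, edges), 1)


if __name__ == "__main__":
    chain = [(0, 1), (1, 2), (2, 3)]   # directed path on 4 nodes
    star = [(0, 1), (0, 2), (0, 3)]    # hub pointing to 3 leaves
    print(min_driver_nodes(4, chain))  # 1
    print(min_driver_nodes(4, star))   # 3
```

The recursive search is adequate for small illustrative graphs; a large network would call for an iterative or Hopcroft-Karp implementation.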
Graph Metrics for Temporal Networks
Temporal networks, i.e., networks in which the interactions among a set of
elementary units change over time, can be modelled in terms of time-varying
graphs, which are time-ordered sequences of graphs over a set of nodes. In such
graphs, the concepts of node adjacency and reachability crucially depend on the
exact temporal ordering of the links. Consequently, all the concepts and
metrics proposed and used for the characterisation of static complex networks
have to be redefined or appropriately extended to time-varying graphs, in order
to take into account the effects of time ordering on causality. In this chapter
we discuss how to represent temporal networks and we review the definitions of
walks, paths, connectedness and connected components valid for graphs in which
the links fluctuate over time. We then focus on temporal node-node distance,
and we discuss how to characterise link persistence and the temporal
small-world behaviour in this class of networks. Finally, we discuss the
extension of classic centrality measures, including closeness, betweenness and
spectral centrality, to the case of time-varying graphs, and we review the work
on temporal motifs analysis and the definition of modularity for temporal
graphs.
Comment: 26 pages, 5 figures, Chapter in Temporal Networks (Petter Holme and
Jari Saramäki editors). Springer. Berlin, Heidelberg 201
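To illustrate why reachability in a time-varying graph depends on the temporal ordering of the links, here is a minimal sketch that computes earliest arrival times over a time-ordered sequence of snapshots. It is an illustration under stated assumptions, not the chapter's own definitions: links are treated as undirected, a walk may cross at most one link per snapshot, and the function name `earliest_arrival` is hypothetical.

```python
def earliest_arrival(snapshots, source):
    """Earliest snapshot index at which each node can be reached from source.

    snapshots: list of link lists, ordered in time; snapshot t (1-based)
               holds the links active at time t.
    Returns a dict node -> arrival time; the source has arrival time 0.
    Nodes that are never reached are absent from the dict.
    """
    arrival = {source: 0}
    for t, links in enumerate(snapshots, start=1):
        # Only nodes reached strictly before time t can forward at time t,
        # which enforces one hop per snapshot (time-respecting walks).
        reached_before = set(arrival)
        for u, v in links:
            for a, b in ((u, v), (v, u)):
                if a in reached_before and b not in arrival:
                    arrival[b] = t
    return arrival


if __name__ == "__main__":
    forward = [[("a", "b")], [("b", "c")]]
    backward = [[("b", "c")], [("a", "b")]]
    print(earliest_arrival(forward, "a"))   # {'a': 0, 'b': 1, 'c': 2}
    print(earliest_arrival(backward, "a"))  # {'a': 0, 'b': 2}
```

Both sequences contain the same two links, yet `c` is reachable from `a` only when the link a-b appears before b-c. In the second sequence `a` is still reachable from `c`, so temporal reachability is not symmetric even with undirected links.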
Discovering universal statistical laws of complex networks
Different network models have been suggested for the topology underlying
complex interactions in natural systems. These models are aimed at replicating
specific statistical features encountered in real-world networks. However, it
is rarely considered to which degree the results obtained for one particular
network class can be extrapolated to real-world networks. We address this issue
by comparing different classical and more recently developed network models
with respect to their generalisation power, which we identify with large
structural variability and absence of constraints imposed by the construction
scheme. After having identified the most variable networks, we address the
issue of which constraints are common to all network classes and are thus
suitable candidates for being generic statistical laws of complex networks. In
fact, we find that generic, not model-related dependencies between different
network characteristics do exist. This allows one, for instance, to infer global
features from local ones using regression models trained on networks with high
generalisation power. Our results confirm and extend previous findings
regarding the synchronisation properties of neural networks. Our method seems
especially relevant for large networks, which are difficult to map completely,
like the neural networks in the brain. The structure of such large networks
cannot be fully sampled with the present technology. Our approach provides a
method to estimate global properties of under-sampled networks with good
approximation. Finally, we demonstrate on three different data sets (C.
elegans' neuronal network, R. prowazekii's metabolic network, and a network of
synonyms extracted from Roget's Thesaurus) that real-world networks have
statistical relations compatible with those obtained using regression models.
Centralized Modularity of N-Linked Glycosylation Pathways in Mammalian Cells
Glycosylation is a highly complex process that produces a diverse repertoire of
cellular glycans that are attached to proteins and lipids. Glycans are involved
in fundamental biological processes, including protein folding and clearance,
cell proliferation and apoptosis, development, immune responses, and
pathogenesis. One of the major types of glycans, N-linked glycans, is formed by
sequential attachments of monosaccharides to proteins by a limited number of
enzymes. Many of these enzymes can accept multiple N-linked glycans as
substrates, thereby generating a large number of glycan intermediates and their
intermingled pathways. Motivated by the quantitative methods developed in
complex network research, we investigated the large-scale organization of such
N-linked glycosylation pathways in mammalian cells. The N-linked glycosylation
pathways are extremely modular, and are composed of cohesive topological
modules that directly branch from a common upstream pathway of glycan
synthesis. This unique structural property allows the glycan production between
modules to be controlled by the upstream region. Although the enzymes act on
multiple glycan substrates, indicating cross-talk between modules, the impact
of the cross-talk on the module-specific enhancement of glycan synthesis may be
confined within a moderate range by transcription-level control. The findings
of the present study provide experimentally-testable predictions for
glycosylation processes, and may be applicable to therapeutic glycoprotein
engineering.
Graphs in molecular biology
Graph theoretical concepts are useful for the description and analysis of interactions and relationships in biological systems. We give a brief introduction to some of the concepts and their areas of application in molecular biology. We discuss software that is available through the Bioconductor project and present a simple example application to the integration of a protein-protein interaction and a co-expression network.
SimRank*: effective and scalable pairwise similarity search based on graph topology
Given a graph, how can we quantify similarity between two nodes in an effective and scalable way? SimRank is an attractive measure of pairwise similarity based on graph topologies. Its underpinning philosophy that “two nodes are similar if they are pointed to (have incoming edges) from similar nodes” can be regarded as an aggregation of similarities based on incoming paths. Despite its popularity in various applications (e.g., web search and social networks), SimRank has an undesirable trait, i.e., “zero-similarity”: it accommodates only the paths of equal length from a common “center” node, whereas a large portion of other paths are fully ignored. In this paper, we propose an effective and scalable similarity model, SimRank*, to remedy this problem. (1) We first provide a sufficient and necessary condition of the “zero-similarity” problem that exists in Jeh and Widom’s SimRank model, Li et al.’s SimRank model, Random Walk with Restart (RWR), and ASCOS++. (2) We next present our treatment, SimRank*, which can resolve this issue while inheriting the merit of the simple SimRank philosophy. (3) We reduce the series form of SimRank* to a closed form, which looks simpler than SimRank but which enriches semantics without suffering from increased computational overhead. This leads to an iterative form of SimRank*, which requires O(Knm) time and O(n^2) memory for computing all n^2 pairs of similarities on a graph of n nodes and m edges for K iterations. (4) To improve the computational time of SimRank* further, we leverage a novel clustering strategy via edge concentration. Due to its NP-hardness, we devise an efficient heuristic to speed up all-pairs SimRank* computation to O(Knm~) time, where m~ is generally much smaller than m.
(5) To scale SimRank* on billion-edge graphs, we propose two memory-efficient single-source algorithms, i.e., ss-gSR* for geometric SimRank*, and ss-eSR* for exponential SimRank*, which can retrieve similarities between all n nodes and a given query on an as-needed basis. This significantly reduces the O(n^2) memory of all-pairs search to either O(Kn+m~) for geometric SimRank*, or O(n+m~) for exponential SimRank*, without any loss of accuracy, where m~ ≪ n^2. (6) We also compare SimRank* with another remedy of SimRank that adds self-loops on each node and demonstrate that SimRank* is more effective. (7) Using real and synthetic datasets, we empirically verify the richer semantics of SimRank*, and validate its high computational efficiency and scalability on large graphs with billions of edges.
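The “zero-similarity” trait described in this abstract can be reproduced with the original Jeh and Widom SimRank recurrence, in which the similarity of two distinct nodes is the decay factor C times the average similarity of their in-neighbour pairs. The sketch below is a naive all-pairs iteration of that original SimRank, not SimRank* and not the paper's O(Knm) algorithm. The function name `simrank`, the default C = 0.6, and the 0..n-1 node labelling are assumptions made for illustration.

```python
def simrank(n, edges, c=0.6, iterations=10):
    """Naive all-pairs SimRank (Jeh and Widom) on nodes 0..n-1.

    s(a, a) = 1; for a != b,
    s(a, b) = c / (|I(a)| |I(b)|) * sum of s(i, j) over in-neighbours
    i of a and j of b, and 0 if either node has no in-neighbours.
    """
    preds = [[] for _ in range(n)]
    for u, v in edges:
        preds[v].append(u)

    # Iteration 0: every node is similar only to itself.
    sim = [[1.0 if a == b else 0.0 for b in range(n)] for a in range(n)]
    for _ in range(iterations):
        new = [[0.0] * n for _ in range(n)]
        for a in range(n):
            new[a][a] = 1.0
            for b in range(a + 1, n):
                if preds[a] and preds[b]:
                    total = sum(sim[i][j] for i in preds[a] for j in preds[b])
                    score = c * total / (len(preds[a]) * len(preds[b]))
                    new[a][b] = new[b][a] = score
        sim = new
    return sim


if __name__ == "__main__":
    # Node 0 points to 1 and 2; node 1 points to 3.
    scores = simrank(4, [(0, 1), (0, 2), (1, 3)])
    print(scores[1][2])  # 0.6: both one step from node 0
    print(scores[2][3])  # 0.0: one and two steps from node 0
```

Nodes 2 and 3 both descend from node 0, but along paths of unequal length (one step versus two), so the recurrence assigns them a similarity of zero. This is the behaviour the abstract sets out to remedy.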
Mesoscopic organization reveals the constraints governing C. elegans nervous system
One of the biggest challenges in biology is to understand how activity at the
cellular level of neurons, as a result of their mutual interactions, leads to
the observed behavior of an organism responding to a variety of environmental
stimuli. Investigating the intermediate or mesoscopic level of organization in
the nervous system is a vital step towards understanding how the integration of
micro-level dynamics results in macro-level functioning. In this paper, we have
considered the somatic nervous system of the nematode Caenorhabditis elegans,
for which the entire neuronal connectivity diagram is known. We focus on the
organization of the system into modules, i.e., neuronal groups having
relatively higher connection density compared to that of the overall network.
We show that this mesoscopic feature cannot be explained exclusively in terms
of considerations, such as optimizing for resource constraints (viz., total
wiring cost) and communication efficiency (i.e., network path length).
Comparison with other complex networks designed for efficient transport (of
signals or resources) implies that neuronal networks form a distinct class.
This suggests that the principal function of the network, viz., processing of
sensory information resulting in appropriate motor response, may be playing a
vital role in determining the connection topology. Using modular spectral
analysis, we make explicit the intimate relation between function and structure
in the nervous system. This is further brought out by identifying functionally
critical neurons purely on the basis of patterns of intra- and inter-modular
connections. Our study reveals how the design of the nervous system reflects
several constraints, including its key functional role as a processor of
information.
Comment: Published version, Minor modifications, 16 pages, 9 figures
Local Difference Measures between Complex Networks for Dynamical System Model Evaluation
Acknowledgments: We thank Reik V. Donner for inspiring suggestions that initiated the work presented herein. Jan H. Feldhoff is credited for providing us with the STARS simulation data and for his contributions to fruitful discussions. Comments by the anonymous reviewers are gratefully acknowledged as they led to substantial improvements of the manuscript.
Peer reviewed
Publisher PD
Regularized logistic regression and multi-objective variable selection for classifying MEG data
This paper addresses the question of maximizing classifier accuracy for classifying task-related mental activity from magnetoencephalography (MEG) data. We propose the use of different sources of information and introduce an automatic channel selection procedure. To determine an informative set of channels, our approach combines a variety of machine learning algorithms: feature subset selection methods, classifiers based on regularized logistic regression, information fusion, and multiobjective optimization based on probabilistic modeling of the search space. The experimental results show that our proposal is able to improve classification accuracy compared to approaches whose classifiers use only one type of MEG information or for which the set of channels is fixed a priori.