119 research outputs found

    Emergence of scale-free close-knit friendship structure in online social networks

    Although the structural properties of online social networks have attracted much attention, the properties of the close-knit friendship structures remain an important open question. Here, we mainly focus on how these mesoscale structures are affected by the local and global structural properties. Analyzing the data of four large-scale online social networks reveals several common structural properties. It is found that not only do the local structures given by the indegree, outdegree, and reciprocal degree distributions follow a similar scaling behavior, but the mesoscale structures represented by the distributions of close-knit friendship structures also exhibit a similar scaling law. The degree correlation is very weak over a wide range of the degrees. We propose a simple directed network model that captures the observed properties. The model incorporates two mechanisms: reciprocation and preferential attachment. Through rate equation analysis of our model, the local-scale and mesoscale structural properties are derived. At the local scale, the same scaling behavior of indegree and outdegree distributions stems from indegree and outdegree of nodes both growing as the same function of the introduction time, and the reciprocal degree distribution also shows the same power law due to the linear relationship between the reciprocal degree and in/outdegree of nodes. At the mesoscale, the distributions of four closed triples representing close-knit friendship structures are found to exhibit identical power laws, a behavior attributed to the negligible degree correlations. Intriguingly, all the power-law exponents of the distributions at the local scale and mesoscale depend only on one global parameter, the mean in/outdegree, while both the mean in/outdegree and the reciprocity together determine the ratio of the reciprocal degree of a node to its in/outdegree. Comment: 48 pages, 34 figures

    Effect of correlations on network controllability

    A dynamical system is controllable if by imposing appropriate external signals on a subset of its nodes, it can be driven from any initial state to any desired state in finite time. Here we study the impact of various network characteristics on the minimal number of driver nodes required to control a network. We find that clustering and modularity have no discernible impact, but the symmetries of the underlying matching problem can produce linear, quadratic or no dependence on degree correlation coefficients, depending on the nature of the underlying correlations. The results are supported by numerical simulations and help narrow the observed gap between the predicted and the observed number of driver nodes in real networks
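The abstract above refers to an "underlying matching problem" without spelling it out. A minimal sketch, assuming the standard maximum-matching formulation of structural controllability (the number of driver nodes is the number of nodes left unmatched by a maximum matching on the directed edges, with a floor of one), might look like this. The function names `max_matching_size` and `min_driver_nodes` are hypothetical and do not come from the paper.

```python
def max_matching_size(nodes, edges):
    """Size of a maximum matching in which every node is used at most once
    as an edge source and at most once as an edge target (Kuhn's algorithm)."""
    out_nbrs = {u: [] for u in nodes}
    for src, dst in edges:
        out_nbrs[src].append(dst)
    matched_from = {}  # target node -> source node currently matched to it

    def try_augment(src, seen):
        # Search for an augmenting path starting from src.
        for dst in out_nbrs[src]:
            if dst in seen:
                continue
            seen.add(dst)
            if dst not in matched_from or try_augment(matched_from[dst], seen):
                matched_from[dst] = src
                return True
        return False

    return sum(1 for u in nodes if try_augment(u, set()))


def min_driver_nodes(nodes, edges):
    """Assumed formula: N_D = max(N - |maximum matching|, 1)."""
    return max(len(nodes) - max_matching_size(nodes, edges), 1)


# A directed chain is controlled from its head; a directed star needs more drivers.
chain_drivers = min_driver_nodes(["a", "b", "c"], [("a", "b"), ("b", "c")])
star_drivers = min_driver_nodes(["a", "b", "c", "d"],
                                [("a", "b"), ("a", "c"), ("a", "d")])
```

The chain and star examples illustrate why degree structure matters here: the hub of the star can be matched to only one of its targets, so the remaining leaves stay unmatched.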

    Graph Metrics for Temporal Networks

    Temporal networks, i.e., networks in which the interactions among a set of elementary units change over time, can be modelled in terms of time-varying graphs, which are time-ordered sequences of graphs over a set of nodes. In such graphs, the concepts of node adjacency and reachability crucially depend on the exact temporal ordering of the links. Consequently, all the concepts and metrics proposed and used for the characterisation of static complex networks have to be redefined or appropriately extended to time-varying graphs, in order to take into account the effects of time ordering on causality. In this chapter we discuss how to represent temporal networks and we review the definitions of walks, paths, connectedness and connected components valid for graphs in which the links fluctuate over time. We then focus on temporal node-node distance, and we discuss how to characterise link persistence and the temporal small-world behaviour in this class of networks. Finally, we discuss the extension of classic centrality measures, including closeness, betweenness and spectral centrality, to the case of time-varying graphs, and we review the work on temporal motifs analysis and the definition of modularity for temporal graphs. Comment: 26 pages, 5 figures, Chapter in Temporal Networks (Petter Holme and Jari Saramäki editors). Springer. Berlin, Heidelberg 201
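The dependence of reachability on link ordering can be illustrated with a small sketch. It assumes contacts are given as hypothetical `(time, source, target)` tuples, that traversal is instantaneous, and that consecutive hops must use strictly increasing times; these conventions are illustrative choices, not definitions taken from the chapter.

```python
def earliest_arrival(contacts, source):
    """Earliest time each node can be reached from `source` along
    time-respecting paths (hop times strictly increasing)."""
    arrival = {source: float("-inf")}  # the source is available from the start
    for t, u, v in sorted(contacts):   # process contacts in time order
        # A contact is usable only if u was reached strictly before time t.
        if u in arrival and arrival[u] < t and t < arrival.get(v, float("inf")):
            arrival[v] = t
    return arrival


# Same links, different ordering: b-c happens before a-b, so a cannot reach c.
blocked = earliest_arrival([(2, "a", "b"), (1, "b", "c")], "a")
# With the order reversed, the path a -> b -> c respects time.
open_path = earliest_arrival([(1, "a", "b"), (2, "b", "c")], "a")
```

Both examples share the same static graph, yet only the second admits a temporal path from `a` to `c`, which is the sense in which static metrics need to be redefined.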

    Discovering universal statistical laws of complex networks

    Different network models have been suggested for the topology underlying complex interactions in natural systems. These models are aimed at replicating specific statistical features encountered in real-world networks. However, it is rarely considered to what degree the results obtained for one particular network class can be extrapolated to real-world networks. We address this issue by comparing different classical and more recently developed network models with respect to their generalisation power, which we identify with large structural variability and absence of constraints imposed by the construction scheme. After having identified the most variable networks, we address the issue of which constraints are common to all network classes and are thus suitable candidates for being generic statistical laws of complex networks. In fact, we find that generic, not model-related dependencies between different network characteristics do exist. This allows one, for instance, to infer global features from local ones using regression models trained on networks with high generalisation power. Our results confirm and extend previous findings regarding the synchronisation properties of neural networks. Our method seems especially relevant for large networks, which are difficult to map completely, like the neural networks in the brain. The structure of such large networks cannot be fully sampled with the present technology. Our approach provides a method to estimate global properties of under-sampled networks with good approximation. Finally, we demonstrate on three different data sets (C. elegans' neuronal network, R. prowazekii's metabolic network, and a network of synonyms extracted from Roget's Thesaurus) that real-world networks have statistical relations compatible with those obtained using regression models

    Centralized Modularity of N-Linked Glycosylation Pathways in Mammalian Cells

    Glycosylation is a highly complex process that produces a diverse repertoire of cellular glycans that are attached to proteins and lipids. Glycans are involved in fundamental biological processes, including protein folding and clearance, cell proliferation and apoptosis, development, immune responses, and pathogenesis. One of the major types of glycans, N-linked glycans, is formed by sequential attachments of monosaccharides to proteins by a limited number of enzymes. Many of these enzymes can accept multiple N-linked glycans as substrates, thereby generating a large number of glycan intermediates and their intermingled pathways. Motivated by the quantitative methods developed in complex network research, we investigated the large-scale organization of such N-linked glycosylation pathways in mammalian cells. The N-linked glycosylation pathways are extremely modular, and are composed of cohesive topological modules that directly branch from a common upstream pathway of glycan synthesis. This unique structural property allows the glycan production between modules to be controlled by the upstream region. Although the enzymes act on multiple glycan substrates, indicating cross-talk between modules, the impact of the cross-talk on the module-specific enhancement of glycan synthesis may be confined within a moderate range by transcription-level control. The findings of the present study provide experimentally testable predictions for glycosylation processes, and may be applicable to therapeutic glycoprotein engineering

    Graphs in molecular biology

    Graph theoretical concepts are useful for the description and analysis of interactions and relationships in biological systems. We give a brief introduction to some of the concepts and their areas of application in molecular biology. We discuss software that is available through the Bioconductor project and present a simple example application to the integration of a protein-protein interaction and a co-expression network

    SimRank*: effective and scalable pairwise similarity search based on graph topology

    Given a graph, how can we quantify similarity between two nodes in an effective and scalable way? SimRank is an attractive measure of pairwise similarity based on graph topologies. Its underpinning philosophy that “two nodes are similar if they are pointed to (have incoming edges) from similar nodes” can be regarded as an aggregation of similarities based on incoming paths. Despite its popularity in various applications (e.g., web search and social networks), SimRank has an undesirable trait, i.e., “zero-similarity”: it accommodates only the paths of equal length from a common “center” node, whereas a large portion of other paths are fully ignored. In this paper, we propose an effective and scalable similarity model, SimRank*, to remedy this problem. (1) We first provide a sufficient and necessary condition of the “zero-similarity” problem that exists in Jeh and Widom’s SimRank model, Li et al.’s SimRank model, Random Walk with Restart (RWR), and ASCOS++. (2) We next present our treatment, SimRank*, which can resolve this issue while inheriting the merit of the simple SimRank philosophy. (3) We reduce the series form of SimRank* to a closed form, which looks simpler than SimRank but which enriches semantics without suffering from increased computational overhead. This leads to an iterative form of SimRank*, which requires O(Knm) time and O(n^2) memory for computing all n^2 pairs of similarities on a graph of n nodes and m edges for K iterations. (4) To improve the computational time of SimRank* further, we leverage a novel clustering strategy via edge concentration. Due to its NP-hardness, we devise an efficient heuristic to speed up all-pairs SimRank* computation to O(Knm̃) time, where m̃ is generally much smaller than m. (5) To scale SimRank* on billion-edge graphs, we propose two memory-efficient single-source algorithms, i.e., ss-gSR* for geometric SimRank*, and ss-eSR* for exponential SimRank*, which can retrieve similarities between all n nodes and a given query on an as-needed basis. This significantly reduces the O(n^2) memory of all-pairs search to either O(Kn+m̃) for geometric SimRank*, or O(n+m̃) for exponential SimRank*, without any loss of accuracy, where m̃ ≪ n^2. (6) We also compare SimRank* with another remedy of SimRank that adds self-loops on each node and demonstrate that SimRank* is more effective. (7) Using real and synthetic datasets, we empirically verify the richer semantics of SimRank*, and validate its high computational efficiency and scalability on large graphs with billions of edges
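The "zero-similarity" trait described above can be reproduced with a small sketch of the original Jeh and Widom SimRank iteration. This is not SimRank* and not the authors' optimized algorithm: it is a naive all-pairs loop written for clarity, and the function name `simrank`, the decay factor 0.8, and the iteration count are assumptions chosen for illustration.

```python
from itertools import product


def simrank(edges, decay=0.8, iterations=10):
    """Naive all-pairs SimRank (Jeh and Widom form) on a directed edge list."""
    nodes = sorted({node for edge in edges for node in edge})
    in_nbrs = {v: [] for v in nodes}
    for src, dst in edges:
        in_nbrs[dst].append(src)

    # Start from the identity: each node is fully similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, repeat=2)}
    for _ in range(iterations):
        new = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new[(a, b)] = 1.0
            elif not in_nbrs[a] or not in_nbrs[b]:
                new[(a, b)] = 0.0  # no incoming edges: similarity defined as 0
            else:
                total = sum(sim[(i, j)] for i in in_nbrs[a] for j in in_nbrs[b])
                new[(a, b)] = decay * total / (len(in_nbrs[a]) * len(in_nbrs[b]))
        sim = new
    return sim


# b and c are reached from a by paths of unequal length (1 and 2 hops).
chain = simrank([("a", "b"), ("b", "c")])
# b and c are reached from a by paths of equal length (1 hop each).
fork = simrank([("a", "b"), ("a", "c")])
```

In the chain, `b` and `c` share the common source `a` but at different distances, so their score stays at zero no matter how many iterations run; in the fork, the equal-length paths give them a positive score. That contrast is the gap the abstract says SimRank* is designed to close.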

    Mesoscopic organization reveals the constraints governing C. elegans nervous system

    One of the biggest challenges in biology is to understand how activity at the cellular level of neurons, as a result of their mutual interactions, leads to the observed behavior of an organism responding to a variety of environmental stimuli. Investigating the intermediate or mesoscopic level of organization in the nervous system is a vital step towards understanding how the integration of micro-level dynamics results in macro-level functioning. In this paper, we have considered the somatic nervous system of the nematode Caenorhabditis elegans, for which the entire neuronal connectivity diagram is known. We focus on the organization of the system into modules, i.e., neuronal groups having relatively higher connection density compared to that of the overall network. We show that this mesoscopic feature cannot be explained exclusively in terms of considerations such as optimizing for resource constraints (viz., total wiring cost) and communication efficiency (i.e., network path length). Comparison with other complex networks designed for efficient transport (of signals or resources) implies that neuronal networks form a distinct class. This suggests that the principal function of the network, viz., processing of sensory information resulting in appropriate motor response, may be playing a vital role in determining the connection topology. Using modular spectral analysis, we make explicit the intimate relation between function and structure in the nervous system. This is further brought out by identifying functionally critical neurons purely on the basis of patterns of intra- and inter-modular connections. Our study reveals how the design of the nervous system reflects several constraints, including its key functional role as a processor of information. Comment: Published version, Minor modifications, 16 pages, 9 figures

    Local Difference Measures between Complex Networks for Dynamical System Model Evaluation

    Acknowledgments: We thank Reik V. Donner for inspiring suggestions that initialized the work presented herein. Jan H. Feldhoff is credited for providing us with the STARS simulation data and for his contributions to fruitful discussions. Comments by the anonymous reviewers are gratefully acknowledged as they led to substantial improvements of the manuscript. Peer reviewed

    Regularized logistic regression and multi-objective variable selection for classifying MEG data

    This paper addresses the question of maximizing classifier accuracy for classifying task-related mental activity from magnetoencephalography (MEG) data. We propose the use of different sources of information and introduce an automatic channel selection procedure. To determine an informative set of channels, our approach combines a variety of machine learning algorithms: feature subset selection methods, classifiers based on regularized logistic regression, information fusion, and multiobjective optimization based on probabilistic modeling of the search space. The experimental results show that our proposal is able to improve classification accuracy compared to approaches whose classifiers use only one type of MEG information or for which the set of channels is fixed a priori