67 research outputs found

    Subgraphs in random networks

    Understanding the subgraph distribution in random networks is important for modelling complex systems. In classic Erdős networks, which exhibit a Poissonian degree distribution, the number of appearances of a subgraph G with n nodes and g edges scales with network size N as \langle G \rangle ~ N^{n-g}. However, many natural networks have a non-Poissonian degree distribution. Here we present approximate equations for the average number of subgraphs in an ensemble of random sparse directed networks, characterized by an arbitrary degree sequence. We find new scaling rules for the commonly occurring case of directed scale-free networks, in which the outgoing degree distribution scales as P(k) ~ k^{-\gamma}. Considering the power exponent of the degree distribution, \gamma, as a control parameter, we show that random networks exhibit transitions between three regimes. In each regime the number of subgraph appearances follows a different scaling law, \langle G \rangle ~ N^{\alpha}, where \alpha = n-g+s-1 for \gamma < 2, \alpha = n-g+s+1-\gamma for 2 < \gamma < \gamma_c, and \alpha = n-g for \gamma > \gamma_c. Here s is the maximal outdegree in the subgraph and \gamma_c = s+1. We find that certain subgraphs appear much more frequently than in Erdős networks. These results are in very good agreement with numerical simulations. This has implications for detecting network motifs, subgraphs that occur in natural networks significantly more than in their randomized counterparts. Comment: 8 pages, 5 figures
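
    The three scaling regimes quoted in this abstract can be written as a small piecewise function. The following is a minimal sketch, not code from the paper: the function name and argument names are hypothetical, and the behaviour exactly at \gamma = 2 and \gamma = \gamma_c (which the abstract leaves open) is chosen here by continuity, since the neighbouring formulas agree at those points.

```python
def scaling_exponent(n: int, g: int, s: int, gamma: float) -> float:
    """Exponent alpha in <G> ~ N^alpha for a subgraph with n nodes, g edges
    and maximal outdegree s, in a directed scale-free network with
    out-degree exponent gamma (regimes as quoted in the abstract)."""
    gamma_c = s + 1  # critical exponent given in the abstract
    if gamma < 2:
        return n - g + s - 1
    if gamma < gamma_c:
        # Intermediate regime; empty when s = 1 because gamma_c = 2 then.
        return n - g + s + 1 - gamma
    # Large gamma: same exponent as in classic Erdős networks.
    return n - g


if __name__ == "__main__":
    # Feed-forward loop: 3 nodes, 3 edges, one node with outdegree 2.
    for gamma in (1.5, 2.5, 3.5):
        print(gamma, scaling_exponent(3, 3, 2, gamma))
```

    For the three-node feed-forward loop (n = 3, g = 3, s = 2), the sketch gives \alpha = 1 at \gamma = 1.5, \alpha = 0.5 at \gamma = 2.5, and \alpha = 0 at \gamma = 3.5.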

    The topological relationship between the large-scale attributes and local interaction patterns of complex networks

    Recent evidence indicates that the abundance of recurring elementary interaction patterns in complex networks, often called subgraphs or motifs, carries significant information about their function and overall organization. Yet the underlying reasons for the variable quantity of different subgraph types, their propensity to form clusters, and their relationship with the networks' global organization remain poorly understood. Here we show that a network's large-scale topological organization and its local subgraph structure mutually define and predict each other, as confirmed by direct measurements in five well-studied cellular networks. We also demonstrate the inherent existence of two distinct classes of subgraphs, and show that, in contrast to the low-density type II subgraphs, the highly abundant type I subgraphs cannot exist in isolation but must naturally aggregate into subgraph clusters. The identified topological framework may have important implications for our understanding of the origin and function of subgraphs in all complex networks. Comment: pape

    Signatures of small-world and scale-free properties in large computer programs

    A large computer program is typically divided into many hundreds or even thousands of smaller units, whose logical connections define a network in a natural way. This network reflects the internal structure of the program and defines the "information flow" within the program. We show that (1) owing to its growth over time, this network displays a scale-free feature, in that the number of links at a node obeys a power-law probability distribution, and (2) as a result of performance optimization of the program, the network has a small-world structure. We believe that these features are generic for large computer programs. Our work extends previous studies on growing networks, which have mostly concerned physical networks, to the domain of computer software. Comment: 4 pages, 1 figure, to appear in Phys. Rev.

    Inferring the role of transcription factors in regulatory networks

    Background: Expression profiles obtained from multiple perturbation experiments are increasingly used to reconstruct transcriptional regulatory networks, from well-studied, simple organisms up to higher eukaryotes. Admittedly, a key ingredient in developing a reconstruction method is its ability to integrate heterogeneous sources of information, as well as to comply with practical observability issues: measurements can be scarce or noisy. In this work, we show how to combine a network of genetic regulations with a set of expression profiles in order to infer the functional effect of the regulations, as inducer or repressor. Our approach is based on a consistency rule between a network and the signs of variation given by expression arrays.
    Results: We evaluate our approach in several settings of increasing complexity. First, we generate artificial expression data on a transcriptional network of E. coli extracted from the literature (1529 nodes and 3802 edges), and we estimate that 30% of the regulations can be annotated with about 30 profiles. We additionally prove that at most 40.8% of the network can be inferred using our approach. Second, we use this network to validate the predictions obtained with a compendium of real expression profiles. We describe a filtering algorithm that generates particularly reliable predictions. Finally, we apply our inference approach to the S. cerevisiae transcriptional network (2419 nodes and 4344 interactions) by combining ChIP-chip data and 15 expression profiles. We are able to detect and isolate inconsistencies between the expression profiles and a significant portion of the model (15% of all the interactions). In addition, we report predictions for 14.5% of all interactions.
    Conclusion: Our approach requires neither accurate expression levels nor time series. Nevertheless, we show on both real and artificial data that a relatively small number of perturbation experiments is enough to determine a significant portion of regulatory effects. This is a key practical asset compared to statistical methods for network reconstruction. We demonstrate that our approach is able to provide accurate predictions, even when the network is incomplete and the data are noisy.
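
    The abstract does not spell out its consistency rule, so the following is only an illustrative sketch under a simplifying assumption: when a target has exactly one known regulator, the sign of the regulation is taken to be the product of the observed signs of variation of regulator and target, and it is reported only if every profile agrees. The function name, the data layout, and this single-regulator restriction are all assumptions made here, not the authors' algorithm.

```python
def infer_edge_signs(regulators, profiles):
    """Infer +1 (inducer) or -1 (repressor) for edges into single-regulator targets.

    regulators: {target: [regulator, ...]}  (hypothetical network layout)
    profiles:   list of {gene: +1 or -1}    (sign of variation per experiment)
    Returns {(regulator, target): sign} for edges whose sign is the same
    in every profile that measures both genes.
    """
    inferred = {}
    for target, regs in regulators.items():
        if len(regs) != 1:
            continue  # several regulators: this simple rule does not force a sign
        reg = regs[0]
        observed = set()
        for profile in profiles:
            if reg in profile and target in profile:
                # Same direction of variation -> +1, opposite -> -1.
                observed.add(profile[reg] * profile[target])
        if len(observed) == 1:  # skip unobserved or contradictory edges
            inferred[(reg, target)] = observed.pop()
    return inferred


if __name__ == "__main__":
    network = {"b": ["a"], "c": ["a", "b"]}
    data = [{"a": 1, "b": -1, "c": 1}, {"a": -1, "b": 1, "c": 1}]
    print(infer_edge_signs(network, data))
```

    In this toy input, gene b always varies opposite to its only regulator a, so the edge from a to b is labelled as a repression; gene c has two regulators and is left unannotated.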

    Common and Unique Network Dynamics in Football Games

    The sport of football is played between two teams of eleven players each using a spherical ball. Each team strives to score by driving the ball into the opposing goal as the result of skillful interactions among players. Football can be regarded from the network perspective as a competitive relationship between two cooperative networks with a dynamic network topology and dynamic network nodes. Many complex large-scale networks have been shown to have topological properties in common, based on small-world and scale-free network models. However, the human dynamic movement pattern of this network has never been investigated in a real-world setting. Here, we show that a power law in the degree distribution emerged in the passing behavior in the 2006 FIFA World Cup Final and an international “A” match in Japan, by describing players as vertices connected by links representing passes. The exponent values are similar to the typical values that occur in many real-world networks, which are in the range of , and are larger than that of a gene transcription network, . Furthermore, we reveal the stochastically switched dynamics of the hub player throughout the game as a unique feature in football games. It suggests that this feature could result not only in securing vulnerability against intentional attack, but also in a power law for self-organization. Our results suggest common and unique network dynamics of two competitive networks, compared with the large-scale networks that have previously been investigated in numerous works. Our findings may lead to improved resilience and survivability not only in biological networks, but also in communication networks.
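
    As a rough illustration of the network construction described in this abstract (players as vertices, passes as links), the sketch below counts each player's degree from a list of passes and tabulates the empirical degree distribution. The pass list is invented, the function names are hypothetical, and counting every pass made or received toward a player's degree is an assumption; the paper may define degree differently.

```python
from collections import Counter


def pass_degrees(passes):
    """Return {player: number of passes made or received}.

    passes: iterable of (passer, receiver) pairs (hypothetical data layout).
    """
    degree = Counter()
    for passer, receiver in passes:
        degree[passer] += 1
        degree[receiver] += 1
    return dict(degree)


def degree_distribution(degrees):
    """Return {k: fraction of players whose degree equals k}, sorted by k."""
    counts = Counter(degrees.values())
    total = len(degrees)
    return {k: c / total for k, c in sorted(counts.items())}


if __name__ == "__main__":
    toy_passes = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "A")]
    degrees = pass_degrees(toy_passes)
    print(degrees)
    print(degree_distribution(degrees))
```

    A power law would then be assessed from how the fractions in this distribution fall off with k; with real match data the list of passes would be far longer than this toy input.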

    The Carbon Assimilation Network in Escherichia coli Is Densely Connected and Largely Sign-Determined by Directions of Metabolic Fluxes

    Gene regulatory networks consist of direct interactions but also include indirect interactions mediated by metabolites and signaling molecules. We describe how these indirect interactions can be derived from a model of the underlying biochemical reaction network, using weak time-scale assumptions in combination with sensitivity criteria from metabolic control analysis. We apply this approach to a model of the carbon assimilation network in Escherichia coli. Our results show that the derived gene regulatory network is densely connected, contrary to what is usually assumed. Moreover, the network is largely sign-determined, meaning that the signs of the indirect interactions are fixed by the flux directions of biochemical reactions, independently of specific parameter values and rate laws. An inversion of the fluxes following a change in growth conditions may, however, affect the signs of the indirect interactions. This leads to a feedback structure that is robust to changes in the kinetic properties of enzymes and, at the same time, flexible enough to accommodate radical changes in the environment.

    Analysis of Combinatorial Regulation: Scaling of Partnerships between Regulators with the Number of Governed Targets

    Through combinatorial regulation, regulators partner with each other to control common targets, which allows a small number of regulators to govern many targets. One interesting question is how, given this combinatorial regulation, the number of regulators scales with the number of targets. Here, we address this question by building and analyzing co-regulation (co-transcription and co-phosphorylation) networks that describe partnerships between regulators controlling common genes. We carry out analyses across five diverse species, from Escherichia coli to human. These reveal many properties of partnership networks, such as the absence of a classical power-law degree distribution despite the existence of nodes with many partners. We also find that the number of co-regulatory partnerships follows an exponential saturation curve in relation to the number of targets. (For E. coli and Bacillus subtilis, only the beginning linear part of this curve is evident, due to the arrangement of genes into operons.) To gain intuition into the saturation process, we relate the biological regulation to more commonplace social contexts where a small number of individuals can form an intricate web of connections on the internet. Indeed, we find that the size of partnership networks saturates even as the complexity of their output increases. We also present a variety of models to account for the saturation phenomenon. In particular, we develop a simple analytical model to show how new partnerships are acquired with an increasing number of target genes; with certain assumptions, it reproduces the observed saturation. Then, we build a more general simulation of network growth and find agreement with a wide range of real networks. Finally, we perform various down-sampling calculations on the observed data to illustrate the robustness of our conclusions.
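
    The abstract names an exponential saturation curve without giving its form. A common parameterisation, assumed here purely for illustration, is P(T) = P_max (1 - exp(-T / T_0)), where P_max and T_0 are hypothetical fit parameters. The sketch shows why only a linear part would be visible when the number of targets T is small compared with T_0: there the curve is close to P_max T / T_0.

```python
import math


def partnerships(targets: float, p_max: float, t_scale: float) -> float:
    """Assumed saturation form: p_max * (1 - exp(-targets / t_scale)).

    p_max   : plateau value approached for many targets (hypothetical parameter)
    t_scale : number of targets over which saturation sets in (hypothetical)
    """
    return p_max * (1.0 - math.exp(-targets / t_scale))


def linear_part(targets: float, p_max: float, t_scale: float) -> float:
    """First-order approximation, valid only when targets << t_scale."""
    return p_max * targets / t_scale


if __name__ == "__main__":
    for t in (1, 10, 100, 1000, 10000):
        print(t, partnerships(t, 100.0, 1000.0), linear_part(t, 100.0, 1000.0))
```

    With these illustrative parameters the two functions nearly coincide for small T and separate as T approaches T_0, after which the saturating curve levels off toward P_max.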

    The Information Coded in the Yeast Response Elements Accounts for Most of the Topological Properties of Its Transcriptional Regulation Network

    The regulation of gene expression in a cell relies to a major extent on transcription factors, proteins which recognize and bind the DNA at specific binding sites (response elements) within promoter regions associated with each gene. We present an information-theoretic approach to modeling transcriptional regulatory networks, in terms of a simple “sequence-matching” rule and the statistics of the occurrence of binding sequences of given specificity in random promoter regions. The crucial biological inputs are the distribution of the amount of information coded in these cognate response elements and the length distribution of the promoter regions. We analyze the transcriptional regulatory network of the yeast Saccharomyces cerevisiae, which we extract from the available databases, with respect to the degree distributions, clustering coefficient, degree correlations, rich-club coefficient and the k-core structure. We find that these topological features are in remarkable agreement with those predicted by our model, on the basis of the amount of information coded in the interaction between the transcription factors and response elements.

    Towards the prediction of essential genes by integration of network topology, cellular localization and biological process information

    Background: The identification of essential genes is important for understanding the minimal requirements for cellular life and for practical purposes, such as drug design. However, the experimental techniques for essential gene discovery are labor-intensive and time-consuming. Considering these experimental constraints, a computational approach capable of accurately predicting essential genes would be of great value. We therefore present a machine learning-based computational approach that relies on network topological features, cellular localization and biological process information to predict essential genes.
    Results: We constructed a decision tree-based meta-classifier and trained it on datasets with individual and grouped attributes (network topological features, cellular compartments and biological processes) to generate various predictors of essential genes. We showed that the best-performing predictors are those generated from datasets with integrated attributes. Using all attributes, i.e., network topological features, cellular compartments and biological processes, we obtained the best predictor of essential genes, which was then used to classify yeast genes with unknown essentiality status. Finally, we generated decision trees by training the J48 algorithm on datasets with all network topological features, cellular localization and biological process information to discover cellular rules for essentiality. We found that the number of protein physical interactions, the nuclear localization of proteins and the number of regulating transcription factors are the most important factors determining gene essentiality.
    Conclusion: We were able to demonstrate that network topological features, cellular localization and biological process information are reliable predictors of essential genes. Moreover, by constructing decision trees based on these data, we could discover cellular rules governing essentiality.