1,130 research outputs found

    Network deconvolution as a general method to distinguish direct dependencies in networks

    Get PDF
    Recognizing direct relationships between variables connected in a network is a pervasive problem in biological, social and information sciences as correlation-based networks contain numerous indirect relationships. Here we present a general method for inferring direct effects from an observed correlation matrix containing both direct and indirect effects. We formulate the problem as the inverse of network convolution, and introduce an algorithm that removes the combined effect of all indirect paths of arbitrary length in a closed-form solution by exploiting eigen-decomposition and infinite-series sums. We demonstrate the effectiveness of our approach in several network applications: distinguishing direct targets in gene expression regulatory networks; recognizing directly interacting amino-acid residues for protein structure prediction from sequence alignments; and distinguishing strong collaborations in co-authorship social networks using connectivity information alone. In addition to its theoretical impact as a foundational graph theoretic tool, our results suggest network deconvolution is widely applicable for computing direct dependencies in network science across diverse disciplines.National Institutes of Health (U.S.) (grant R01 HG004037)National Institutes of Health (U.S.) (grant HG005639)Swiss National Science Foundation (Fellowship)National Science Foundation (U.S.) (NSF CAREER Award 0644282

    Information content based model for the topological properties of the gene regulatory network of Escherichia coli

    Full text link
    Gene regulatory networks (GRN) are being studied with increasingly precise quantitative tools and can provide a testing ground for ideas regarding the emergence and evolution of complex biological networks. We analyze the global statistical properties of the transcriptional regulatory network of the prokaryote Escherichia coli, identifying each operon with a node of the network. We propose a null model for this network using the content-based approach applied earlier to the eukaryote Saccharomyces cerevisiae. (Balcan et al., 2007) Random sequences that represent promoter regions and binding sequences are associated with the nodes. The length distributions of these sequences are extracted from the relevant databases. The network is constructed by testing for the occurrence of binding sequences within the promoter regions. The ensemble of emergent networks yields an exponentially decaying in-degree distribution and a putative power law dependence for the out-degree distribution with a flat tail, in agreement with the data. The clustering coefficient, degree-degree correlation, rich club coefficient and k-core visualization all agree qualitatively with the empirical network to an extent not yet achieved by any other computational model, to our knowledge. The significant statistical differences can point the way to further research into non-adaptive and adaptive processes in the evolution of the E. coli GRN.Comment: 58 pages, 3 tables, 22 figures. In press, Journal of Theoretical Biology (2009)

    Strategies for increasing the applicability of biological network inference

    Get PDF
    The manipulation of cellular state has many promising applications, including stem cell biology and regenerative medicine, biofuel production, and stress resistant crop development. The construction of interaction maps promises to enhance our ability to engineer cellular behavior. Within the last 15 years, many methods have been developed to infer the structure of the gene regulatory interaction map from gene abundance snapshots provided by high-throughput experimental data. However, relatively little research has focused on using gene regulatory network models for the prediction and manipulation of cellular behavior. This dissertation examines and applies strategies to utilize the predictive power of gene network models to guide experimentation and engineering efforts. First, we developed methods to improve gene network models by integrating interaction evidence sources, in order to utilize the full predictive power of the models. Next, we explored the power of networks models to guide experimental efforts through inference and analysis of a regulatory network in the pathogenic fungus Cryptococcus neoformans. Finally, we develop a novel, network-guided algorithm to select genetic interventions for engineering transcriptional state. We apply this method to select intervention strains for improving biofuel production in a mixed glucose-xylose environment. The contributions in this dissertation provide the first thorough examination, systematic application, and quantitative evaluation of the utilization of network models for guiding cellular engineering

    Network-Based Interpretation of Diverse High-Throughput Datasets through the Omics Integrator Software Package

    Get PDF
    High-throughput, ‘omic’ methods provide sensitive measures of biological responses to perturbations. However, inherent biases in high-throughput assays make it difficult to interpret experiments in which more than one type of data is collected. In this work, we introduce Omics Integrator, a software package that takes a variety of ‘omic’ data as input and identifies putative underlying molecular pathways. The approach applies advanced network optimization algorithms to a network of thousands of molecular interactions to find high-confidence, interpretable subnetworks that best explain the data. These subnetworks connect changes observed in gene expression, protein abundance or other global assays to proteins that may not have been measured in the screens due to inherent bias or noise in measurement. This approach reveals unannotated molecular pathways that would not be detectable by searching pathway databases. Omics Integrator also provides an elegant framework to incorporate not only positive data, but also negative evidence. Incorporating negative evidence allows Omics Integrator to avoid unexpressed genes and avoid being biased toward highly-studied hub proteins, except when they are strongly implicated by the data. The software is comprised of two individual tools, Garnet and Forest, that can be run together or independently to allow a user to perform advanced integration of multiple types of high-throughput data as well as create condition-specific subnetworks of protein interactions that best connect the observed changes in various datasets. It is available at http://fraenkel.mit.edu/omicsintegrator and on GitHub at https://github.com/fraenkel-lab/OmicsIntegrator.National Institutes of Health (U.S.) (grant U54CA112967)National Institutes of Health (U.S.) (grant U01CA184898)National Institutes of Health (U.S.) (grant U54NS091046)National Institutes of Health (U.S.) (grant R01GM089903

    Computational Stem Cell Biology: Open Questions and Guiding Principles

    Get PDF
    Computational biology is enabling an explosive growth in our understanding of stem cells and our ability to use them for disease modeling, regenerative medicine, and drug discovery. We discuss four topics that exemplify applications of computation to stem cell biology: cell typing, lineage tracing, trajectory inference, and regulatory networks. We use these examples to articulate principles that have guided computational biology broadly and call for renewed attention to these principles as computation becomes increasingly important in stem cell biology. We also discuss important challenges for this field with the hope that it will inspire more to join this exciting area

    Learning gene network using Bayesian network framework

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH
    corecore