
    Bayesian Network Expansion Identifies New ROS and Biofilm Regulators

    Signaling and regulatory pathways that guide gene expression have only been partially defined for most organisms. However, given the increasing number of microarray measurements, it may be possible to reconstruct such pathways and uncover missing connections directly from experimental data. Using a compendium of microarray gene expression data obtained from Escherichia coli, we constructed a series of Bayesian network models for the reactive oxygen species (ROS) pathway as defined by EcoCyc. A consensus Bayesian network model was generated using those networks sharing the top recovered score. This microarray-based network only partially agreed with the known ROS pathway curated from the literature and databases. A top network was then expanded to predict genes that could enhance the Bayesian network model using an algorithm we termed ‘BN+1’. This expansion procedure predicted many stress-related genes (e.g., dusB and uspE) and their possible interactions with other ROS pathway genes. A term enrichment method discovered that biofilm-associated microarray data usually contained high expression levels of both uspE and gadX. The predicted involvement of gene uspE in the ROS pathway and interactions between uspE and gadX were confirmed experimentally using E. coli reporter strains. Genes gadX and uspE showed a feedback relationship in regulating each other's expression. Both genes were verified to regulate biofilm formation through gene knockout experiments. These data suggest that the BN+1 expansion method can faithfully uncover hidden or unknown genes with significant biological roles for a selected pathway. The BN+1 expansion method reported here is a generalized approach applicable to the characterization and expansion of other biological pathways and living systems.

    Regulatory module network of basic/helix-loop-helix transcription factors in mouse brain

    A comprehensive regulatory module network of 15 bHLH transcription factors over 150 target genes in mouse brain has been constructed.

    USE OF APRIORI KNOWLEDGE ON DYNAMIC BAYESIAN MODELS IN TIME-COURSE EXPRESSION DATA PREDICTION

    Indiana University-Purdue University Indianapolis (IUPUI)
    Bayesian networks, one of the most widely used techniques for understanding or predicting the future from current or previous data, have gained credence over the last decade for their ability to simulate large gene expression datasets to track and predict the reasons for changes in biological systems. In this work, we present a dynamic Bayesian model with gene annotation scores such as the gene characterization index (GCI) and the GeneCards inferred functionality score (GIFtS) to understand and assess the prediction performance of the model when prior knowledge is incorporated. Time-course breast cancer data, consisting of gene expression in breast cell lines treated with doxorubicin, are considered for this study. Bayes Server software was used for the simulations in a dynamic Bayesian environment with 8 and 19 genes on 12 different data combinations for each category of gene set to predict and understand the future time-course expression profiles when annotation scores are incorporated into the model. The 8-gene set predicted the next time course with r > 0.95, and the 19-gene set yielded r > 0.8 in 92% of the simulation experiments. These results showed that incorporating prior knowledge into the dynamic Bayesian model for simulating time-course expression data can improve prediction performance when sufficient a priori parameters are provided.
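
    The abstract reports prediction quality as a correlation r but does not say how it was computed. As a minimal sketch, assuming r is the Pearson correlation between an observed and a predicted expression profile, it could be calculated as follows. The function name `pearson_r` and the toy numbers are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson_r(observed, predicted):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sums of centered cross-products and squares (the 1/n factors cancel).
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_o * var_p)


# Hypothetical expression values for one time point across five genes.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.1]
r = pearson_r(observed, predicted)
print(f"r = {r:.3f}")
```

    Under this reading, a threshold such as r > 0.95 would be checked per simulation run by comparing `r` against the cutoff.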

    Understanding Subsystems in Biology through Dimensionality Reduction, Graph Partitioning and Analytical Modeling

    Biological systems exhibit rich and complex behavior through the orchestrated interplay of a large array of components. It is hypothesized that separable subsystems with some degree of functional autonomy exist; deciphering their independent behavior and functionality would greatly facilitate understanding the system as a whole. Discovering and analyzing such subsystems are hence pivotal problems in the quest to gain a quantitative understanding of complex biological systems. In this work, using approaches from machine learning, physics and graph theory, methods for the identification and analysis of such subsystems were developed. A novel methodology, based on a recent machine learning algorithm known as non-negative matrix factorization (NMF), was developed to discover such subsystems in a set of large-scale gene expression data. This set of subsystems was then used to predict functional relationships between genes, and this approach was shown to score significantly higher than conventional methods when benchmarked against existing databases. Moreover, a mathematical treatment was developed for simple network subsystems based only on their topology (independent of particular parameter values). Application to a problem of experimental interest demonstrated the need for extensions to the conventional model to fully explain the experimental data. Finally, the notion of a subsystem was evaluated from a topological perspective. A number of different protein networks were examined to analyze their topological properties with respect to separability, seeking to find separable subsystems. These networks were shown to exhibit separability in a nonintuitive fashion, while the separable subsystems were of strong biological significance. It was demonstrated that the separability property found was not due to incomplete or biased data, but is likely to reflect biological structure.
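
    For readers unfamiliar with NMF, the following is a minimal sketch of the basic technique, assuming a non-negative genes-by-samples matrix V and the standard multiplicative update rules for the squared-error objective. It is not the methodology developed in the thesis; the function name `nmf`, the rank, the iteration count and the synthetic data are all illustrative assumptions.

```python
import numpy as np


def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (genes x samples) as W @ H.

    Uses multiplicative updates for the Frobenius-norm objective.
    W (genes x rank) holds gene loadings for each candidate subsystem;
    H (rank x samples) holds each subsystem's activity per sample.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    W = rng.random((n_genes, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H
        # stay non-negative; eps guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def relative_error(V, W, H):
    """Frobenius reconstruction error relative to the norm of V."""
    return np.linalg.norm(V - W @ H) / np.linalg.norm(V)


# Synthetic stand-in for expression data: 20 genes, 8 samples, 2 hidden factors.
data_rng = np.random.default_rng(1)
V = data_rng.random((20, 2)) @ data_rng.random((2, 8))

W, H = nmf(V, rank=2)
print("relative reconstruction error:", relative_error(V, W, H))

# Genes with the largest loading in a column of W form one candidate subsystem.
top_genes_factor0 = np.argsort(W[:, 0])[::-1][:5]
print("top genes for factor 0:", top_genes_factor0)
```

    In this sketch, grouping genes by their dominant column of W is one simple way to read subsystems off the factorization; how the thesis derives functional relationships from the factors is not stated in the abstract.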

    Bayesian Network Approaches for Refining and Expanding Cellular and Immunological Pathways.

    This thesis focuses on computational analysis of cellular and immune pathways of living cells in response to molecular signals using Bayesian networks (BN). Although Bayesian networks have been applied to the reconstruction and expansion of gene regulatory and protein signaling pathways using existing biological data, the results generated from existing BN methods show high false positive and false negative rates. To resolve these issues, two major Bayesian network approaches were developed to allow refinement and expansion of known biological pathways to identify new interactions and molecular entities participating in the pathway. First, the thesis explored how to refine existing Bayesian networks to identify the interactions best supported by the underlying biological data. A posterior probability-based EdgeClipper refinement algorithm was developed to identify well-supported interaction hypotheses in distributions of saved BNs. EdgeClipper incorporates posterior weighting to prioritize and clip interactions. This approach identified many known interactions in synthetic and Escherichia coli reactive oxygen species (ROS) pathways as well as novel interactions, and improved specificity with decreasing sensitivity. Second, an expansion approach called BN+1 was introduced to identify previously unknown pathway members that likely influence biological pathways. BN+1 was applied to the expansion of several synthetic, prokaryotic, and eukaryotic pathways. Major findings included the identification of genetic interactions between genes gadX and uspE and their direct regulation of biofilm activities in E. coli, which was verified experimentally.
    Finally, the expansion and refinement algorithms were combined to recover a known acid fitness island and new putative acid fitness regulators using E. coli ROS pathway members, and later applied towards understanding Jak/Stat pathway regulation during human progressive kidney disease in glomerular and tubule compartments. The Jak/Stat pathway showed relatively low overlap in supported interactions for the two compartments, though recovered BN+1 genes reflected relevant biological functions and stages of disease progression for the respective kidney compartments. Overall, the results demonstrate that it is possible to refine and expand protein-level signaling pathways using transcriptional microarray data and the introduced expansion and refinement algorithms. The methods are applicable to other biological and computational systems, and are available as publicly accessible software tools.
    Ph.D., Bioinformatics, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/89840/1/aphodges_1.pd

    The Effects and Genetic Mechanisms of Bacterial Species Interactions on Biofilm Formation.

    Bacteria can increase their survival in stressed environments by forming sessile biofilms on surfaces. Natural ecosystems are usually occupied by multiple species, which may interact with and therefore affect biofilm formation of an incoming species. This dissertation research explores the effects of species interactions between an environmental strain, Stenotrophomonas maltophilia, and a water quality indicator species, Escherichia coli, on biofilm formation of E. coli, and investigates the genetic mechanisms behind them. It was found that E. coli biofilm development was promoted in dynamic flow systems, but inhibited in static batch plates, in mixed species culture compared with pure culture conditions. The opposite effects of co-culture on E. coli biofilm formation suggested that species interactions may have different impacts under different culture conditions. To enable the mechanistic study of species interactions, a separation method was developed to allow transcriptome analysis of mixed species communities. Transcriptomic responses of E. coli to S. maltophilia were analyzed to investigate genetic mechanisms of inhibited E. coli biofilm formation in static co-culture. Eighty-nine and 108 E. coli genes responded to S. maltophilia in co-cultured biofilms and suspensions, respectively. Several genes were involved with inhibited biofilm formation of E. coli in static co-culture. One highly up-regulated gene, fliA, was selected for a mechanistic study. It was found that the production of a major monomer of curli, CsgA, as well as cell aggregation were greatly repressed in E. coli with fliA overexpression. Knocking out fliA partially relieved the inhibitory effect of co-culture on E. coli biofilm growth. Therefore, it was concluded that inhibited E. coli biofilm formation by interactions with S. maltophilia was partially caused by the induction of gene fliA to suppress curli production.
    Overall, this dissertation examined the effects of species interactions on biofilm formation of E. coli, highlighted the impact of environmental conditions on the effect, and provided a partial understanding of species interactions at a genetic level. This fundamental study contributes to understanding of biofilm formation in real environments with mixed species, and serves as a starting point towards the development of bacteriotherapy for pathogen control using indigenous species for environmental health.
    Ph.D., Environmental Health Sciences, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/78983/1/joandai_1.pd