
    STARNET 2: a web-based tool for accelerating discovery of gene regulatory networks using microarray co-expression data

    Background: Although expression microarrays have become a standard tool used by biologists, analysis of data produced by microarray experiments may still present challenges. Comparison of data from different platforms, organisms, and labs may involve complicated data processing, and inferring relationships between genes remains difficult. Results: STARNET 2 is a new web-based tool that allows post hoc visual analysis of correlations that are derived from expression microarray data. STARNET 2 facilitates user discovery of putative gene regulatory networks in a variety of species (human, rat, mouse, chicken, zebrafish, Drosophila, C. elegans, S. cerevisiae, Arabidopsis and rice) by graphing networks of genes that are closely co-expressed across a large heterogeneous set of preselected microarray experiments. For each of the represented organisms, raw microarray data were retrieved from NCBI's Gene Expression Omnibus for a selected Affymetrix platform. All pairwise Pearson correlation coefficients were computed for expression profiles measured on each platform, respectively. These precompiled results were stored in a MySQL database, and supplemented by additional data retrieved from NCBI. A web-based tool allows user-specified queries of the database, centered at a gene of interest. The result of a query includes graphs of correlation networks, graphs of known interactions involving genes and gene products that are present in the correlation networks, and initial statistical analyses. Two analyses may be performed in parallel to compare networks, which is facilitated by the new HEATSEEKER module. Conclusion: STARNET 2 is a useful tool for developing new hypotheses about regulatory relationships between genes and gene products, and has coverage for 10 species. Interpretation of the correlation networks is supported with a database of previously documented interactions, a test for enrichment of Gene Ontology terms, and heat maps of correlation distances that may be used to compare two networks. The list of genes in a STARNET network may be useful in developing a list of candidate genes to use for the inference of causal networks. The tool is freely available at http://vanburenlab.medicine.tamhsc.edu/starnet2.html, and does not require user registration.
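The core computation this abstract describes (all pairwise Pearson correlations, then a query centered at a gene of interest) can be sketched roughly as follows. This is a minimal illustration only, not the STARNET 2 implementation: the function name `coexpression_neighbors`, the `top_k` parameter, the gene labels, and the toy expression matrix are all assumptions made for the example.

```python
import numpy as np

def coexpression_neighbors(expr, genes, center, top_k=3):
    """Return the top_k genes most strongly correlated with `center`.

    expr  : 2-D array, one row per gene, one column per experiment
    genes : list of gene labels, in the same order as the rows of expr
    Ranking uses the absolute Pearson correlation, so strongly
    anti-correlated genes are reported as well (an assumption of this sketch).
    """
    corr = np.corrcoef(expr)                 # all pairwise Pearson correlations
    i = genes.index(center)
    order = np.argsort(-np.abs(corr[i]))     # strongest |r| first
    ranked = [(genes[j], float(corr[i, j])) for j in order if j != i]
    return ranked[:top_k]

# Hypothetical toy data: 4 genes measured in 5 experiments.
genes = ["g1", "g2", "g3", "g4"]
expr = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],    # g1, the gene of interest
    [2.0, 4.1, 6.0, 8.2, 10.0],   # g2 tracks g1 closely
    [5.0, 4.0, 3.1, 2.0, 1.0],    # g3 is anti-correlated with g1
    [1.0, 3.0, 2.0, 3.0, 1.0],    # g4 is uncorrelated with g1
])

neighbors = coexpression_neighbors(expr, genes, "g1", top_k=2)
for name, r in neighbors:
    print(f"{name}: r = {r:.3f}")
```

In a precomputed setting such as the one described, the correlation matrix would be computed once and stored, and only the lookup and ranking step would run per query.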

    Inferring the Transcriptional Landscape of Bovine Skeletal Muscle by Integrating Co-Expression Networks

    Background: Despite modern technologies and novel computational approaches, decoding causal transcriptional regulation remains challenging. This is particularly true for less well studied organisms and when only gene expression data is available. In muscle a small number of well characterised transcription factors are proposed to regulate development. Therefore, muscle appears to be a tractable system for proposing new computational approaches. Methodology/Principal Findings: Here we report a simple algorithm that asks "which transcriptional regulator has the highest average absolute co-expression correlation to the genes in a co-expression module?" It correctly infers a number of known causal regulators of fundamental biological processes, including cell cycle activity (E2F1), glycolysis (HLF), mitochondrial transcription (TFB2M), adipogenesis (PIAS1), neuronal development (TLX3), immune function (IRF1) and vasculogenesis (SOX17), within a skeletal muscle context. However, none of the canonical pro-myogenic transcription factors (MYOD1, MYOG, MYF5, MYF6 and MEF2C) were linked to muscle structural gene expression modules. Co-expression values were computed using developing bovine muscle from 60 days post conception (early foetal) to 30 months post natal (adulthood) for two breeds of cattle, in addition to a nutritional comparison with a third breed. A number of transcriptional landscapes were constructed and integrated into an always correlated landscape. One notable feature was a 'metabolic axis' formed from glycolysis genes at one end, nuclear-encoded mitochondrial protein genes at the other, and centrally tethered by mitochondrially-encoded mitochondrial protein genes. Conclusions/Significance: The new module-to-regulator algorithm complements our recently described Regulatory Impact Factor analysis. 
Together with a simple examination of a co-expression module's contents, these three gene expression approaches are starting to illuminate the in vivo transcriptional regulation of skeletal muscle development.
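The module-to-regulator question quoted in this abstract ("which transcriptional regulator has the highest average absolute co-expression correlation to the genes in a co-expression module?") can be sketched as below. This is a hedged illustration, not the authors' code: the function name `best_regulator`, the labels `TF_A`, `TF_B`, `m1` to `m3`, and the correlation values are hypothetical.

```python
import numpy as np

def best_regulator(corr, genes, regulators, module):
    """Score each candidate regulator by its mean |correlation| to a module.

    corr       : square matrix of pairwise co-expression correlations
    genes      : labels for the rows/columns of corr
    regulators : candidate transcriptional regulators (subset of genes)
    module     : genes belonging to one co-expression module
    A regulator that is itself in the module is not scored against itself.
    """
    idx = {g: i for i, g in enumerate(genes)}
    cols = [idx[g] for g in module]
    scores = {}
    for reg in regulators:
        targets = [c for c in cols if c != idx[reg]]
        scores[reg] = float(np.mean(np.abs(corr[idx[reg], targets])))
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical correlation matrix for two regulators and a 3-gene module.
genes = ["TF_A", "TF_B", "m1", "m2", "m3"]
corr = np.array([
    [ 1.0,  0.1,  0.9, -0.8,  0.7],
    [ 0.1,  1.0,  0.2,  0.3, -0.1],
    [ 0.9,  0.2,  1.0, -0.6,  0.5],
    [-0.8,  0.3, -0.6,  1.0, -0.4],
    [ 0.7, -0.1,  0.5, -0.4,  1.0],
])

best, scores = best_regulator(corr, genes, ["TF_A", "TF_B"], ["m1", "m2", "m3"])
print(best, scores)
```

Taking the absolute value means that a regulator strongly anti-correlated with a module scores as highly as one positively correlated with it, which matches the "average absolute" wording of the abstract.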

    Bioinformatic analyses in early host response to Porcine Reproductive and Respiratory Syndrome virus (PRRSV) reveals pathway differences between pigs with alternate genotypes for a major host response QTL

    Citation: Schroyen, M., Eisley, C., Koltes, J. E., Fritz-Waters, E., Choi, I., Plastow, G. S., . . . Tuggle, C. K. (2016). Bioinformatic analyses in early host response to Porcine Reproductive and Respiratory Syndrome virus (PRRSV) reveals pathway differences between pigs with alternate genotypes for a major host response QTL. BMC Genomics, 17, 16. doi:10.1186/s12864-016-2547-z. Additional Authors: Tuggle, C. K. Background: A region on Sus scrofa chromosome 4 (SSC4) surrounding single nucleotide polymorphism (SNP) marker WUR10000125 (WUR) has been reported to be strongly associated with both weight gain and serum viremia in pigs after infection with PRRS virus (PRRSV). A proposed causal mutation in the guanylate binding protein 5 gene (GBP5) is predicted to truncate the encoded protein. To investigate transcriptional differences between WUR genotypes in early host response to PRRSV infection, an RNA-seq experiment was performed on globin-depleted whole blood RNA collected at 0, 4, 7, 10 and 14 days post-infection (dpi) from eight littermate pairs with one AB (favorable) and one AA (unfavorable) WUR genotype animal per litter. Results: Gene Ontology (GO) enrichment analysis of transcripts that were differentially expressed (DE) between dpi across both genotypes revealed an inflammatory response for all dpi when compared to day 0. However, at the early time points of 4 and 7 dpi, several GO terms had higher enrichment scores compared to later dpi, including inflammatory response (p < 10^-7), specifically regulation of NFkappaB (p < 0.01), cytokine, and chemokine activity (p < 0.01). At 10 and 14 dpi, GO term enrichment indicated a switch to DNA damage response, cell cycle checkpoints, and DNA replication. Few transcripts were DE between WUR genotypes on individual dpi or averaged over all dpi, and little enrichment of any GO term was found. However, there were differences in expression patterns over time between AA and AB animals, which were confirmed by genotype-specific expression patterns of several modules that were identified in weighted gene co-expression network analyses (WGCNA). Minor differences between AA and AB animals were observed in immune response and DNA damage response (p = 0.64 and p = 0.11, respectively), but a significant effect between genotypes pointed to a difference in ion transport/homeostasis and the participation of G-coupled protein receptors (p = 8e-4), which was reinforced by results from regulatory and phenotypic impact factor analyses between genotypes. Conclusion: We propose these pathway differences between WUR genotypes are the result of the inability of the truncated GBP5 of the AA genotyped pigs to inhibit viral entry and replication as quickly as the intact GBP5 protein of the AB genotyped pigs.

    Predicting Functional and Regulatory Divergence of a Drug Resistance Transporter Gene in the Human Malaria Parasite

    Background: The paradigm of resistance evolution to chemotherapeutic agents is that a key coding mutation in a specific gene drives resistance to a particular drug. In the case of resistance to the anti-malarial drug chloroquine (CQ), a specific mutation in the transporter pfcrt is associated with resistance. Here, we apply a series of analytical steps to gene expression data from our lab and leverage 3 independent datasets to identify pfcrt-interacting genes. Resulting networks provide insights into pfcrt’s biological functions and regulation, as well as the divergent phenotypic effects of its allelic variants in different genetic backgrounds. Results: To identify pfcrt-interacting genes, we analyze pfcrt co-expression networks in 2 phenotypic states - CQ-resistant (CQR) and CQ-sensitive (CQS) recombinant progeny clones - using a computational approach that prioritizes gene interactions into functional and regulatory relationships. For both phenotypic states, pfcrt co-expressed gene sets are associated with hemoglobin metabolism, consistent with CQ’s expected mode of action. To predict the drivers of co-expression divergence, we integrate topological relationships in the co-expression networks with available high confidence protein-protein interaction data. This analysis identifies 3 transcriptional regulators from the ApiAP2 family and histone acetylation as potential mediators of these divergences. We validate the predicted divergences in DNA mismatch repair and histone acetylation by measuring the effects of small molecule inhibitors in recombinant progeny clones combined with quantitative trait locus (QTL) mapping. Conclusions: This work demonstrates the utility of differential co-expression viewed in a network framework to uncover functional and regulatory divergence in phenotypically distinct parasites. pfcrt-associated co-expression in the CQ resistant progeny highlights CQR-specific gene relationships and possible targeted intervention strategies. 
The approaches outlined here can be readily generalized to other parasite populations and drug resistances.

    A semi-parametric Bayesian model for unsupervised differential co-expression analysis

    Background: Differential co-expression analysis is an emerging strategy for characterizing disease related dysregulation of gene expression regulatory networks. Given pre-defined sets of biological samples, such analysis aims at identifying genes that are co-expressed in one, but not in the other set of samples. Results: We developed a novel probabilistic framework for jointly uncovering contexts (i.e. groups of samples) with specific co-expression patterns, and groups of genes with different co-expression patterns across such contexts. In contrast to current clustering and bi-clustering procedures, the implicit similarity measure in this model used for grouping biological samples is based on the clustering structure of genes within each sample and not on traditional measures of gene expression level similarities. Within this framework, biological samples with widely discordant expression patterns can be placed in the same context as long as the co-clustering structure of genes is concordant within these samples. To the best of our knowledge, this is the first method to date for unsupervised differential co-expression analysis in this generality. When applied to the problem of identifying molecular subtypes of breast cancer, our method identified reproducible patterns of differential co-expression across several independent expression datasets. Sample groupings induced by these patterns were highly informative of the disease outcome. Expression patterns of differentially co-expressed genes provided new insights into the complex nature of the ERα regulatory network. Conclusions: We demonstrated that the use of the co-clustering structure as the similarity measure in the unsupervised analysis of sample gene expression profiles provides valuable information about expression regulatory networks.
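For orientation, the conventional setting mentioned in the Background (pre-defined sample sets, looking for gene pairs co-expressed in one set but not the other) can be sketched as a simple difference of correlation matrices. This is explicitly not the semi-parametric Bayesian model of the paper, which infers the sample groups itself; the function name `differential_coexpression` and the toy data are assumptions for illustration.

```python
import numpy as np

def differential_coexpression(expr_a, expr_b):
    """Difference in pairwise Pearson correlation between two sample sets.

    expr_a, expr_b : arrays with one row per gene (same gene order) and one
                     column per sample in the respective pre-defined set.
    Large |values| flag gene pairs whose co-expression differs between sets.
    """
    return np.corrcoef(expr_a) - np.corrcoef(expr_b)

# Hypothetical toy data: 3 genes, 4 samples per set.
expr_a = np.array([
    [1.0, 2.0, 3.0, 4.0],   # g1
    [2.0, 4.0, 6.0, 8.0],   # g2 follows g1 in set A
    [1.0, 2.0, 3.0, 4.0],   # g3 follows g1 in set A
])
expr_b = np.array([
    [1.0, 2.0, 3.0, 4.0],   # g1
    [3.0, 1.0, 4.0, 2.0],   # g2 is uncorrelated with g1 in set B
    [2.0, 4.0, 6.0, 8.0],   # g3 still follows g1 in set B
])

diff = differential_coexpression(expr_a, expr_b)
print(np.round(diff, 3))
```

In this toy case the g1/g2 pair shows a large difference while the g1/g3 pair shows none, which is the kind of contrast a differential co-expression analysis is meant to surface.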

    Knowledge-fused differential dependency network models for detecting significant rewiring in biological networks

    Modeling biological networks serves as both a major goal and an effective tool of systems biology in studying mechanisms that orchestrate the activities of gene products in cells. Biological networks are context specific and dynamic in nature. To systematically characterize the selectively activated regulatory components and mechanisms, the modeling tools must be able to effectively distinguish significant rewiring from random background fluctuations. We formulated the inference of differential dependency networks that incorporates both conditional data and prior knowledge as a convex optimization problem, and developed an efficient learning algorithm to jointly infer the conserved biological network and the significant rewiring across different conditions. We used a novel sampling scheme to estimate the expected error rate due to random knowledge and, based on this estimate, developed a strategy that fully exploits the benefit of this data-knowledge integrated approach. We demonstrated and validated the principle and performance of our method using synthetic datasets. We then applied our method to yeast cell line and breast cancer microarray data and obtained biologically plausible results. Comment: 7 pages, 7 figures

    Increased signaling entropy in cancer requires the scale-free property of protein interaction networks

    One of the key characteristics of cancer cells is an increased phenotypic plasticity, driven by underlying genetic and epigenetic perturbations. However, at a systems level it is unclear how these perturbations give rise to the observed increased plasticity. Elucidating such systems-level principles is key for an improved understanding of cancer. Recently, it has been shown that signaling entropy, an overall measure of signaling pathway promiscuity that is computable by integrating a sample's gene expression profile with a protein interaction network, correlates with phenotypic plasticity and is increased in cancer compared to normal tissue. Here we develop a computational framework for studying the effects of network perturbations on signaling entropy. We demonstrate that the increased signaling entropy of cancer is driven by two factors: (i) the scale-free (or near scale-free) topology of the interaction network, and (ii) a subtle positive correlation between differential gene expression and node connectivity. Indeed, we show that if protein interaction networks were random graphs, described by Poisson degree distributions, cancer would generally not exhibit an increased signaling entropy. In summary, this work exposes a deep connection between cancer, signaling entropy and interaction network topology. Comment: 20 pages, 5 figures. In Press in Sci Rep 201

    Partition Decoupling for Multi-gene Analysis of Gene Expression Profiling Data

    We present the extension and application of a new unsupervised statistical learning technique, the Partition Decoupling Method (PDM), to gene expression data. Because it has the ability to reveal non-linear and non-convex geometries present in the data, the PDM is an improvement over typical gene expression analysis algorithms, permitting a multi-gene analysis that can reveal phenotypic differences even when the individual genes do not exhibit differential expression. Here, we apply the PDM to publicly available gene expression data sets, and demonstrate that we are able to identify cell types and treatments with higher accuracy than is obtained through other approaches. By applying it in a pathway-by-pathway fashion, we demonstrate how the PDM may be used to find sets of mechanistically related genes that discriminate phenotypes. Comment: Revise

    Direct Estimation of Differences in Causal Graphs

    We consider the problem of estimating the differences between two causal directed acyclic graph (DAG) models with a shared topological order given i.i.d. samples from each model. This is of interest, for example, in genomics, where changes in the structure or edge weights of the underlying causal graphs reflect alterations in the gene regulatory networks. Here we provide the first provably consistent method for directly estimating the differences in a pair of causal DAGs without separately learning two possibly large and dense DAG models and computing their difference. Our two-step algorithm first uses invariance tests between regression coefficients of the two data sets to estimate the skeleton of the difference graph and then orients some of the edges using invariance tests between regression residual variances. We demonstrate the properties of our method through a simulation study and apply it to the analysis of gene expression data from ovarian cancer and during T-cell activation.