37,801 research outputs found

    GOexpress: an R/Bioconductor package for the identification and visualisation of robust gene ontology signatures through supervised learning of gene expression data

    Background: Identification of gene expression profiles that differentiate experimental groups is critical for discovery and analysis of key molecular pathways and also for selection of robust diagnostic or prognostic biomarkers. While integration of differential expression statistics has been used to refine gene set enrichment analyses, such approaches are typically limited to single gene lists resulting from simple two-group comparisons or time-series analyses. In contrast, functional class scoring and machine learning approaches provide powerful alternative methods to leverage molecular measurements for pathway analyses, and to compare continuous and multi-level categorical factors. Results: We introduce GOexpress, a software package for scoring and summarising the capacity of gene ontology features to simultaneously classify samples from multiple experimental groups. GOexpress integrates normalised gene expression data (e.g., from microarray and RNA-seq experiments) and phenotypic information of individual samples with gene ontology annotations to derive a ranking of genes and gene ontology terms using a supervised learning approach. The default random forest algorithm allows interactions between all experimental factors, and competitive scoring of expressed genes to evaluate their relative importance in classifying predefined groups of samples. Conclusions: GOexpress enables rapid identification and visualisation of ontology-related gene panels that robustly classify groups of samples and supports both categorical (e.g., infection status, treatment) and continuous (e.g., time-series, drug concentrations) experimental factors. 
The use of standard Bioconductor extension packages and publicly available gene ontology annotations facilitates straightforward integration of GOexpress within existing computational biology pipelines. Department of Agriculture, Food and the Marine; European Commission - Seventh Framework Programme (FP7); Science Foundation Ireland; University College Dublin

    Robust Detection of Hierarchical Communities from Escherichia coli Gene Expression Data

    Determining the functional structure of biological networks is a central goal of systems biology. One approach is to analyze gene expression data to infer a network of gene interactions on the basis of their correlated responses to environmental and genetic perturbations. The inferred network can then be analyzed to identify functional communities. However, commonly used algorithms can yield unreliable results due to experimental noise, algorithmic stochasticity, and the influence of arbitrarily chosen parameter values. Furthermore, the results obtained typically provide only a simplistic view of the network partitioned into disjoint communities and provide no information about the relationship between communities. Here, we present methods to robustly detect coregulated and functionally enriched gene communities and demonstrate their application and validity for Escherichia coli gene expression data. Applying a recently developed community detection algorithm to the network of interactions identified with the context likelihood of relatedness (CLR) method, we show that a hierarchy of network communities can be identified. These communities are significantly enriched for gene ontology (GO) terms, consistent with them representing biologically meaningful groups. Further, analysis of the most significantly enriched communities identified several candidate new regulatory interactions. The robustness of our methods is demonstrated by showing that a core set of functional communities is reliably found when artificial noise, modeling experimental noise, is added to the data. We find that noise mainly acts conservatively, increasing the relatedness required for a network link to be reliably assigned and decreasing the size of the core communities, rather than causing association of genes into new communities. Comment: Due to appear in PLoS Computational Biology. Supplementary Figure S1 was not uploaded but is available by contacting the author.
27 pages, 5 figures, 15 supplementary files
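
The abstract above names the context likelihood of relatedness (CLR) method but does not describe it. As a rough illustration only, the commonly described form of CLR scores a gene pair by how far its mutual information stands out from each gene's own background distribution. The sketch below assumes a precomputed symmetric mutual-information matrix; the function name `clr_scores` and the toy values are hypothetical and are not taken from the paper.

```python
import math

def clr_scores(mi):
    """Return a CLR-style score matrix from a symmetric mutual-information matrix.

    Assumption: mi[i][j] holds the mutual information between genes i and j,
    and the diagonal is ignored. This is an illustrative sketch, not the
    authors' implementation.
    """
    n = len(mi)
    means, stds = [], []
    for i in range(n):
        # Background for gene i: its MI values against every other gene.
        row = [mi[i][j] for j in range(n) if j != i]
        mu = sum(row) / len(row)
        var = sum((x - mu) ** 2 for x in row) / len(row)
        means.append(mu)
        stds.append(math.sqrt(var))

    def z(i, j):
        # Positive part of the z-score of MI(i, j) within gene i's background.
        if stds[i] == 0:
            return 0.0
        return max(0.0, (mi[i][j] - means[i]) / stds[i])

    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Combine the two one-sided z-scores into a single pair score.
            s = math.sqrt(z(i, j) ** 2 + z(j, i) ** 2)
            scores[i][j] = scores[j][i] = s
    return scores

# Toy example: genes 0 and 1 share high MI, genes 2 and 3 share modest MI.
toy_mi = [
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.2],
    [0.1, 0.1, 0.2, 0.0],
]
toy_scores = clr_scores(toy_mi)
```

In this toy input the pair (2, 3) scores as highly as the pair (0, 1) even though its raw mutual information is much lower, because each value is judged against its own genes' backgrounds rather than on an absolute scale.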

    Cyclin-dependent kinases as drug targets for cell growth and proliferation disorders. A role for systems biology approach in drug development. Part II - CDKs as drug targets in hypertrophic cell growth. Modelling of drugs targeting CDKs

    Cyclin-dependent kinases (CDKs) are key regulators of cell growth and proliferation. Impaired regulation of their activity leads to various diseases such as cancer and heart hypertrophy. Consequently, a number of CDKs are considered as targets for drug discovery. We review the development of CDK2 inhibitors as anti-cancer drugs in the first part of the paper and, in the second part, the development of CDK9 inhibitors as potential therapeutics for heart hypertrophy. We argue that these diseases are systems biology, or network, diseases. To fully understand the complexity of cell growth and proliferation disorders, a systems biology approach involving mathematical and computational modelling ought to be employed in addition to the experimental sciences.

    A simple and robust method for connecting small-molecule drugs using gene-expression signatures

    Interaction of a drug or chemical with a biological system can result in a gene-expression profile or signature characteristic of the event. Using a suitably robust algorithm, these signatures can potentially be used to connect molecules with similar pharmacological or toxicological properties. The Connectivity Map was a novel concept and innovative tool first introduced by Lamb et al. to connect small molecules, genes, and diseases using genomic signatures [Lamb et al. (2006), Science 313, 1929-1935]. However, the Connectivity Map had some limitations; in particular, there was no effective safeguard against false connections if the observed connections were considered on an individual-by-individual basis. Further, when several connections to the same small-molecule compound were viewed as a set, the implicit null hypothesis tested was not the most relevant one for the discovery of real connections. Here we propose a simple and robust method for constructing the reference gene-expression profiles and a new connection scoring scheme, which importantly allows the evaluation of statistical significance of all the connections observed. We tested the new method with the two example gene signatures (HDAC inhibitors and estrogens) used by Lamb et al. and also a new gene signature of immunosuppressive drugs. Our testing with this new method shows that it achieves a higher level of specificity and sensitivity than the original method. For example, our method successfully identified raloxifene and tamoxifen as having significant anti-estrogen effects, while Lamb et al.'s Connectivity Map failed to identify these. With these properties, our new method has potential use in drug development for the recognition of pharmacological and toxicological properties in new drug candidates. Comment: 8 pages, 2 figures, and 2 tables; supplementary data supplied as a ZIP file

    Consensus clustering and functional interpretation of gene-expression data

    Microarray analysis using clustering algorithms can suffer from lack of inter-method consistency in assigning related gene-expression profiles to clusters. Obtaining a consensus set of clusters from a number of clustering methods should improve confidence in gene-expression analysis. Here we introduce consensus clustering, which provides such an advantage. When coupled with a statistically based gene functional analysis, our method allowed the identification of novel genes regulated by NFκB and the unfolded protein response in certain B-cell lymphomas.
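
The abstract above does not say how the consensus is formed, so the following is only a minimal sketch of one generic way to do it: count how often each pair of genes is placed in the same cluster across several clusterings, then keep together the genes whose agreement reaches a threshold. The function name `consensus_clusters`, the `threshold` parameter, and the transitive merging step are assumptions made for illustration, not the authors' method.

```python
from itertools import combinations

def consensus_clusters(labelings, threshold=1.0):
    """Group items that co-cluster in at least `threshold` of the labelings.

    Assumption: each labeling is a list of cluster labels, one per gene, and
    all labelings cover the same genes in the same order. Groups are merged
    transitively (union-find), which is a simplification.
    """
    n = len(labelings[0])
    parent = list(range(n))

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        # Fraction of clustering methods that put genes i and j together.
        agree = sum(1 for lab in labelings if lab[i] == lab[j])
        if agree / len(labelings) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Three hypothetical clusterings of five genes (label values are arbitrary).
runs = [
    [0, 0, 1, 1, 2],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
]
strict = consensus_clusters(runs)                  # unanimous agreement only
relaxed = consensus_clusters(runs, threshold=0.6)  # agreement in 2 of 3 runs
```

With unanimous agreement required, only genes 0 and 1 stay together; relaxing the threshold to two of three methods also joins genes 2, 3 and 4 into one group.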

    Integrated signaling pathway and gene expression regulatory model to dissect dynamics of Escherichia coli challenged mammary epithelial cells

    Abstract: Cells transform external stimuli into gene expression through the activation of signaling pathways, which in turn activate gene regulatory networks. As more omics data are generated from experiments, eliciting the integrated relationship between the external stimuli, the signaling process in the cell and the subsequent gene expression is a major challenge in systems biology. The complex system of non-linear dynamic protein interactions in signaling pathways and gene networks regulates gene expression. The complexity and non-linear aspects have resulted in the study of the signaling pathway or the gene network regulation in isolation. However, this limits the analysis of the interaction between the two components and the identification of the source of the mechanism differentiating the gene expression profiles. Here, we present a study of a model of the combined signaling pathway and gene network to highlight the importance of integrated modeling. Based on the experimental findings, we developed a compartmental model and conducted several simulation experiments. The model simulates the mRNA expression of three different cytokines (RANTES, IL8 and TNFα) regulated by the transcription factor NFκB in mammary epithelial cells challenged with E. coli. The analysis of the gene network regulation identifies a lack of robustness and therefore sensitivity for the transcription factor regulation. However, analysis of the integrated signaling and gene network regulation model reveals distinctly different underlying mechanisms in the signaling pathway responsible for the variation between the three cytokines' mRNA expression levels. Our key findings reveal the importance of integrating the signaling pathway and gene expression dynamics in modeling. Modeling infers valid research questions which need to be verified experimentally and can assist in the design of future biological experiments.