319 research outputs found

    Iterative Group Analysis (iGA): A simple tool to enhance sensitivity and facilitate interpretation of microarray experiments

    BACKGROUND: The biological interpretation of even a simple microarray experiment can be a challenging and highly complex task. Here we present a new method (Iterative Group Analysis) to facilitate, improve, and accelerate this process. RESULTS: Our Iterative Group Analysis approach (iGA) uses elementary statistics to identify those functional classes of genes that are significantly changed in an experiment and at the same time determines which of the class members are most likely to be differentially expressed. iGA does not require that all members of a class change and is therefore robust against imperfect class assignments, which can be derived from public sources (e.g. GeneOntologies) or automated processes (e.g. key word extraction from gene names). In contrast to previous non-iterative approaches, iGA does not depend on the availability of fixed lists of differentially expressed genes, and thus can be used to increase the sensitivity of gene detection especially in very noisy or small data sets. In the extreme, iGA can even produce statistically meaningful results without any experimental replication. The automated functional annotation provided by iGA greatly reduces the complexity of microarray results and facilitates the interpretation process. In addition, iGA can be used as a fast and efficient tool for the platform-independent comparison of a microarray experiment to the vast number of published results, automatically highlighting shared genes of potential interest. CONCLUSIONS: By applying iGA to a wide variety of data from diverse organisms and platforms we show that this approach enhances and accelerates the interpretation of microarray experiments
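
The iterative idea in this abstract can be illustrated with a short sketch. It assumes that the "elementary statistics" are hypergeometric tail probabilities evaluated at each class member's position in a ranked gene list; the abstract does not state this. The function names (`hypergeom_tail`, `iga_class_score`) and the toy gene identifiers are hypothetical. The smallest probability found is a score rather than a p-value corrected for the number of cutoffs tried.

```python
from math import comb


def hypergeom_tail(N, K, n, k):
    """P(X >= k) when n genes are drawn from N, of which K belong to the class."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def iga_class_score(ranked_genes, class_members):
    """Scan a ranked list and return (smallest tail probability, members above that cutoff).

    Assumption: ranked_genes is ordered from most to least changed.
    Class members that are absent from the ranked list are ignored.
    """
    N = len(ranked_genes)
    members = set(class_members) & set(ranked_genes)
    K = len(members)
    best_p, best_hits = 1.0, []
    hits = []
    for position, gene in enumerate(ranked_genes, start=1):
        if gene not in members:
            continue
        hits.append(gene)
        # Probability of seeing at least len(hits) members in the top `position` genes.
        p = hypergeom_tail(N, K, position, len(hits))
        if p < best_p:
            best_p, best_hits = p, list(hits)
    return best_p, best_hits


# Toy data (hypothetical): ten ranked genes, one class with three members.
ranked = [f"g{i}" for i in range(1, 11)]
score, changed = iga_class_score(ranked, {"g1", "g2", "g9"})
print(score, changed)
```

In this toy run the third class member, ranked ninth, does not improve the score, so only the first two members are reported. This mirrors the abstract's point that not every member of a class has to change for the class to be detected.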

    From access and integration to mining of secure genomic data sets across the grid

    The UK Department of Trade and Industry (DTI) funded BRIDGES project (Biomedical Research Informatics Delivered by Grid Enabled Services) has developed a Grid infrastructure to support cardiovascular research. This includes the provision of a compute Grid and a data Grid infrastructure with security at its heart. In this paper we focus on the BRIDGES data Grid. A primary aim of the BRIDGES data Grid is to help control the complexity of accessing and integrating a myriad of genomic data sets through simple Grid-based tools. We outline these tools and how they are delivered to end-user scientists. We also describe how these tools are to be extended in the BBSRC-funded Grid Enabled Microarray Expression Profile Search (GEMEPS) project to provide a richer vocabulary of search capabilities for mining microarray data sets. As with BRIDGES, fine-grained Grid security underpins GEMEPS

    Graph-based iterative Group Analysis enhances microarray interpretation

    BACKGROUND: One of the most time-consuming tasks after performing a gene expression experiment is the biological interpretation of the results by identifying physiologically important associations between the differentially expressed genes. A large part of the relevant functional evidence can be represented in the form of graphs, e.g. metabolic and signaling pathways, protein interaction maps, shared GeneOntology annotations, or literature co-citation relations. Such graphs are easily constructed from available genome annotation data. The problem of biological interpretation can then be described as identifying the subgraphs showing the most significant patterns of gene expression. We applied a graph-based extension of our iterative Group Analysis (iGA) approach to obtain a statistically rigorous identification of the subgraphs of interest in any evidence graph. RESULTS: We validated the Graph-based iterative Group Analysis (GiGA) by applying it to the classic yeast diauxic shift experiment of DeRisi et al., using GeneOntology and metabolic network information. GiGA reliably identified and summarized all the biological processes discussed in the original publication. Visualization of the detected subgraphs allowed the convenient exploration of the results. The method also identified several processes that were not presented in the original paper but are of obvious relevance to the yeast starvation response. CONCLUSIONS: GiGA provides a fast and flexible delimitation of the most interesting areas in a microarray experiment, and leads to a considerable speed-up and improvement of the interpretation process

    Improving detection of differentially expressed gene sets by applying cluster enrichment analysis to Gene Ontology

    BACKGROUND: Gene set analysis based on Gene Ontology (GO) can be a promising method for the analysis of differential expression patterns. However, current studies that focus on individual GO terms have limited analytical power, because the complex structure of GO introduces strong dependencies among the terms, and some genes that are annotated to a GO term cannot be found by statistically significant enrichment. RESULTS: We proposed a method for enriching clustered GO terms based on semantic similarity, namely cluster enrichment analysis based on GO (CeaGO), to extend the individual term analysis method. Using an Affymetrix HGU95aV2 chip dataset with simulated gene sets, we illustrated that CeaGO was sensitive enough to detect moderate expression changes. When compared to parent-based individual term analysis methods, the results showed that CeaGO may provide more accurate differentiation of gene expression results. When used with two acute leukemia (ALL and ALL/AML) microarray expression datasets, CeaGO correctly identified specifically enriched GO groups that were overlooked by other individual test methods. CONCLUSION: By applying CeaGO to both simulated and real microarray data, we showed that this approach could enhance the interpretation of microarray experiments. CeaGO is currently available at http://chgc.sh.cn/en/software/CeaGO/.

    GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists

    BACKGROUND: Since the inception of the GO annotation project, a variety of tools have been developed that support exploring and searching the GO database. In particular, a variety of tools that perform GO enrichment analysis are currently available. Most of these tools require as input a target set of genes and a background set and seek enrichment in the target set compared to the background set. A few tools also exist that support analyzing ranked lists. The latter typically rely on simulations or on union-bound correction for assigning statistical significance to the results. RESULTS: GOrilla is a web-based application that identifies enriched GO terms in ranked lists of genes, without requiring the user to provide explicit target and background sets. This is particularly useful in many typical cases where genomic data may be naturally represented as a ranked list of genes (e.g. by level of expression or of differential expression). GOrilla employs a flexible threshold statistical approach to discover GO terms that are significantly enriched at the top of a ranked gene list. Building on a complete theoretical characterization of the underlying distribution, called mHG, GOrilla computes an exact p-value for the observed enrichment, taking threshold multiple testing into account without the need for simulations. This enables rigorous statistical analysis of thousands of genes and thousands of GO terms in the order of seconds. The output of the enrichment analysis is visualized as a hierarchical structure, providing a clear view of the relations between enriched GO terms. CONCLUSION: GOrilla is an efficient GO analysis tool with unique features that make it a useful addition to the existing repertoire of GO enrichment tools. GOrilla's unique features and advantages over other threshold-free enrichment tools include rigorous statistics, fast running time and an effective graphical representation. GOrilla is publicly available at: http://cbl-gorilla.cs.technion.ac.il
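
The flexible-threshold idea described above can be sketched as a minimum hypergeometric (mHG) score: the hypergeometric tail probability is evaluated at every possible cutoff of the ranked list and the smallest value is kept. The sketch below is an illustration only and is not GOrilla's code. The function names and the toy occupancy vector are hypothetical. The exact p-value computation that corrects the score for the many cutoffs tried, which the abstract highlights, is deliberately not reproduced here.

```python
from math import comb


def hypergeom_tail(N, B, n, b):
    """P(X >= b) when the top n of N genes are inspected and B genes carry the term."""
    upper = min(B, n)
    favourable = sum(comb(B, i) * comb(N - B, n - i) for i in range(b, upper + 1))
    return favourable / comb(N, n)


def mhg_score(occupancy):
    """Return (minimum tail probability over all cutoffs, cutoff where it occurs).

    occupancy[i] is 1 if the gene at rank i + 1 is annotated with the GO term, else 0.
    The returned score is NOT a p-value: it is uncorrected for trying every cutoff.
    """
    N = len(occupancy)
    B = sum(occupancy)
    best_score, best_cutoff = 1.0, 0
    b = 0  # annotated genes seen so far
    for n, flag in enumerate(occupancy, start=1):
        b += flag
        score = hypergeom_tail(N, B, n, b)
        if score < best_score:
            best_score, best_cutoff = score, n
    return best_score, best_cutoff


# Toy ranked list (hypothetical): three annotated genes, all within the top four ranks.
score, cutoff = mhg_score([1, 1, 0, 1, 0, 0, 0, 0, 0, 0])
print(score, cutoff)
```

Because the cutoff is chosen by the data, no explicit target and background sets are needed. The price is that the minimum must be corrected before it can be read as a significance level, which is the step GOrilla is reported to handle exactly.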

    Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists

    Functional analysis of large gene lists, derived in most cases from emerging high-throughput genomic, proteomic and bioinformatics scanning approaches, is still a challenging and daunting task. Gene-annotation enrichment analysis is a promising high-throughput strategy that increases the likelihood that investigators will identify the biological processes most pertinent to their study. Approximately 68 bioinformatics enrichment tools that are currently available in the community are collected in this survey. Tools are uniquely categorized into three major classes, according to their underlying enrichment algorithms. The comprehensive collections, unique tool classifications and associated questions/issues will provide a more comprehensive and up-to-date view of the advantages, pitfalls and recent trends at a simpler tool-class level rather than tool by tool. Thus, the survey will help tool designers/developers and experienced end users understand the underlying algorithms and pertinent details of particular tool categories/tools, enabling them to make the best choices for their particular research interests

    Ten Years of Pathway Analysis: Current Approaches and Outstanding Challenges

    Pathway analysis has become the first choice for gaining insight into the underlying biology of differentially expressed genes and proteins, as it reduces complexity and has increased explanatory power. We discuss the evolution of knowledge base–driven pathway analysis over its first decade, distinctly divided into three generations. We also discuss the limitations that are specific to each generation, and how they are addressed by successive generations of methods. We identify a number of annotation challenges that must be addressed to enable development of the next generation of pathway analysis methods. Furthermore, we identify a number of methodological challenges that the next generation of methods must tackle to take advantage of the technological advances in genomics and proteomics in order to improve specificity, sensitivity, and relevance of pathway analysis

    Genes of cell-cell interactions, chemotherapy detoxification and apoptosis are induced during chemotherapy of acute myeloid leukemia

    BACKGROUND: The molecular changes in vivo in acute myeloid leukemia cells early after start of conventional genotoxic chemotherapy are incompletely understood, and it is not known if early molecular modulations reflect clinical response. METHODS: Gene expression was examined by whole genome 44 k oligo microarrays and 12 k cDNA microarrays in peripheral blood leukocytes collected from seven leukemia patients before treatment, 2–4 h and 18–24 h after start of chemotherapy, and validated by real-time quantitative PCR. Statistically significantly upregulated genes were classified using gene ontology (GO) terms. Parallel samples were examined by flow cytometry for apoptosis by annexin V-binding, and the expression of selected proteins was confirmed by immunoblotting. RESULTS: Significant differential modulation of 151 genes was found at 4 h after start of induction therapy with cytarabine and anthracycline, including significant overexpression of 31 genes associated with p53 regulation. Within 4 h of chemotherapy the BCL2/BAX and BCL2/PUMA ratios were attenuated in proapoptotic direction. FLT3 mutations indicated that non-responders (5/7 patients, 8 versus 49 months survival) are characterized by a unique gene response profile before and at 4 h. At 18–24 h after chemotherapy, the gene expression of p53 target genes was attenuated, while genes involved in chemoresistance, cytarabine detoxification, chemokine networks and T cell receptor were prominent. No signs of apoptosis were observed in the collected cells, suggesting the treated patients as a physiological source of pre-apoptotic cells. CONCLUSION: Pre-apoptotic gene expression can be monitored within hours after start of chemotherapy in patients with acute myeloid leukemia, and may be useful in future determination of therapy responders. The low number of patients and the heterogeneity of acute myeloid leukemia limited the identification of gene expression predictive of therapy response. Therapy-induced gene expression reflects the complex biological processes involved in clinical cancer cell eradication and should be explored for future enhancement of therapy.

    An Integrated Approach for the Analysis of Biological Pathways using Mixed Models

    Gene class, ontology, or pathway testing analysis has become increasingly popular in microarray data analysis. Such approaches allow the integration of gene annotation databases, such as Gene Ontology and KEGG Pathway, to formally test for subtle but coordinated changes at a system level. Higher power in gene class testing is gained by combining weak signals from a number of individual genes in each pathway. We propose an alternative approach for gene-class testing based on mixed models, a class of statistical models that