
    GFINDer: Genome Function INtegrated Discoverer through dynamic annotation, statistical analysis, and mining

    Statistical and clustering analyses of gene expression results from high-density microarray experiments produce lists of hundreds of genes regulated differentially, or with particular expression profiles, in the conditions under study. Independent of the microarray platforms and analysis methods used, these lists must be biologically interpreted to gain a better knowledge of the patho-physiological phenomena involved. To this end, numerous biological annotations are available within heterogeneous and widely distributed databases. Although several tools have been developed for annotating lists of genes, most of them do not give methods for evaluating the relevance of the annotations provided, or for estimating the functional bias introduced by the gene set on the array used to identify the gene list considered. We developed Genome Functional INtegrated Discoverer (GFINDer), a web server able to automatically provide large-scale lists of user-classified genes with functional profiles biologically characterizing the different gene classes in the list. GFINDer automatically retrieves annotations of several functional categories from different sources, identifies the categories enriched in each class of a user-classified gene list and calculates statistical significance values for each category. Moreover, GFINDer enables the functional classification of genes according to mined functional categories and the statistical analysis of the classifications obtained, aiding better interpretation of microarray experiment results. GFINDer is available online at http://www.medinfopoli.polimi.it/GFINDer/

    Multiple tests of association with biological annotation metadata

    We propose a general and formal statistical framework for multiple tests of association between known fixed features of a genome and unknown parameters of the distribution of variable features of this genome in a population of interest. The known gene-annotation profiles, corresponding to the fixed features of the genome, may concern Gene Ontology (GO) annotation, pathway membership, regulation by particular transcription factors, nucleotide sequences, or protein sequences. The unknown gene-parameter profiles, corresponding to the variable features of the genome, may be, for example, regression coefficients relating possibly censored biological and clinical outcomes to genome-wide transcript levels, DNA copy numbers, and other covariates. A generic question of great interest in current genomic research regards the detection of associations between biological annotation metadata and genome-wide expression measures. This biological question may be translated as the test of multiple hypotheses concerning association measures between gene-annotation profiles and gene-parameter profiles. A general and rigorous formulation of the statistical inference question allows us to apply the multiple hypothesis testing methodology developed in [Multiple Testing Procedures with Applications to Genomics (2008) Springer, New York] and related articles, to control a broad class of Type I error rates, defined as generalized tail probabilities and expected values for arbitrary functions of the numbers of Type I errors and rejected hypotheses. 
    The resampling-based single-step and stepwise multiple testing procedures of [Multiple Testing Procedures with Applications to Genomics (2008) Springer, New York] take into account the joint distribution of the test statistics and provide Type I error control in testing problems involving general data generating distributions (with arbitrary dependence structures among variables), null hypotheses, and test statistics. Comment: Published at http://dx.doi.org/10.1214/193940307000000446 in the IMS Collections (http://www.imstat.org/publications/imscollections.htm) by the Institute of Mathematical Statistics (http://www.imstat.org).

    Integrative computational biology for cancer research

    Over the past two decades, high-throughput (HTP) technologies such as microarrays and mass spectrometry have fundamentally changed clinical cancer research. They have revealed novel molecular markers of cancer subtypes, metastasis, and drug sensitivity and resistance. Some have been translated into the clinic as tools for early disease diagnosis, prognosis, and individualized treatment and response monitoring. Despite these successes, many challenges remain: HTP platforms are often noisy and suffer from false positives and false negatives; optimal analysis and successful validation require complex workflows; and great volumes of data are accumulating at a rapid pace. Here we discuss these challenges, and show how integrative computational biology can help diminish them by creating new software tools, analytical methods, and data standards

    PolymiRTS Database: linking polymorphisms in microRNA target sites with complex traits

    Polymorphism in microRNA Target Site (PolymiRTS) database is a collection of naturally occurring DNA variations in putative microRNA target sites. PolymiRTSs may affect gene expression and cause variations in complex phenotypes. The database integrates sequence polymorphism, phenotype and expression microarray data, and characterizes PolymiRTSs as potential candidates responsible for the quantitative trait locus (QTL) effects. It is a resource for studying PolymiRTSs and their implications in phenotypic variations. PolymiRTS database can be accessed at

    BABELOMICS: a systems biology perspective in the functional annotation of genome-scale experiments

    We present a new version of Babelomics, a complete suite of web tools for functional analysis of genome-scale experiments, with new and improved tools. New functionally relevant terms have been included such as CisRed motifs or bioentities obtained by text-mining procedures. An improved indexing has considerably speeded up several of the modules. An improved version of the FatiScan method for studying the coordinate behaviour of groups of functionally related genes is presented, along with a similar tool, the Gene Set Enrichment Analysis. Babelomics is now more oriented to test systems biology inspired hypotheses. Babelomics can be found at

    Answering biological questions: querying a systems biology database for nutrigenomics

    The requirement of systems biology for connecting different levels of biological research leads directly to a need for integrating vast amounts of diverse information in general and of omics data in particular. The nutritional phenotype database addresses this challenge for nutrigenomics. A particularly urgent objective in coping with the data avalanche is making biologically meaningful information accessible to the researcher. This contribution describes how we intend to meet this objective with the nutritional phenotype database. We outline relevant parts of the system architecture, describe the kinds of data managed by it, and show how the system can support retrieval of biologically meaningful information by means of ontologies, full-text queries, and structured queries. Our contribution points out critical points and describes several technical hurdles. It demonstrates how pathway analysis can improve queries and comparisons for nutrition studies. Finally, three directions for future research are given

    iGepros: an integrated gene and protein annotation server for biological nature exploration

    Background: In the post-genomic era, transcriptomics and proteomics provide important information for understanding genomes. With the fast development of high-throughput technology, transcriptomics and proteomics data are being generated at an unprecedented rate. This creates a need for software to annotate those omics data and explore their biological nature. In the past decade, some pioneering works were presented to address this issue, but limitations still exist. For example, some of these tools offer a command line only, which is not suitable for users with little or no programming experience. Besides, some tools don’t support large-scale gene and protein analysis. Results: To overcome these limitations, an integrated gene and protein annotation server named iGepros has been developed. The server provides user-friendly interfaces and detailed on-line examples, so most researchers, even those with little or no programming experience, can use it smoothly. Moreover, the server provides many functionalities to compare transcriptomics and proteomics data. Notably, the server is built on a model-view-control framework, which makes it easy to incorporate more functions into the server in the future. Conclusions: In this paper, we present a server with powerful capabilities not only for gene and protein functional annotation, but also for transcriptomics and proteomics data comparison. Researchers can survey the biological characteristics behind gene and protein datasets and accelerate their investigation of the transcriptome and proteome by applying the server. The server is publicly available at http://www.biosino.org/iGepros/.

    Iron-related transcriptomic variations in Caco-2 cells: in silico perspectives.

    Iron absorption by duodenal enterocytes is a key step in iron homeostasis. But the control of this absorption is complex and cannot be fully explained with present knowledge. In a global transcriptome approach, we identified 60 genes over-expressed in hemin (iron) overload in Caco-2 cells, an in vitro model of duodenal enterocytes. The challenge from there was to identify the affected molecular mechanisms and achieve a biological interpretation for that cluster. For that purpose, we built a functional annotation method combining evidence and literature. Our method identified four pathways in the Process hierarchy of the Gene Ontology (GO): lipid metabolism, amino acid and cofactor metabolism, response to stimulus and transport. The accuracy of this functional profile is supported by the identification of known pathways associated with iron overload (response to oxidative stress, glutathione metabolism). But our method also suggests new hypotheses on the regulation of iron uptake in Caco-2 cells. It is hypothesized that plasma membrane remodeling and vesicular recycling could modulate the activities of iron transport proteins. These assumptions still require biological validation and will therefore direct further research. Our functional annotation method is a valuable tool designed to help the biologist understand the biological links between the genes of a cluster, elaborate working hypotheses and direct future work. This work is also a validation 'by hand' of a biomedical text-mining system

    Infectious Disease Ontology

    Technological developments have resulted in tremendous increases in the volume and diversity of the data and information that must be processed in the course of biomedical and clinical research and practice. Researchers are at the same time under ever greater pressure to share data and to take steps to ensure that data resources are interoperable. The use of ontologies to annotate data has proven successful in supporting these goals and in providing new possibilities for the automated processing of data and information. In this chapter, we describe different types of vocabulary resources and emphasize those features of formal ontologies that make them most useful for computational applications. We describe current uses of ontologies and discuss future goals for ontology-based computing, focusing on its use in the field of infectious diseases. We review the largest and most widely used vocabulary resources relevant to the study of infectious diseases and conclude with a description of the Infectious Disease Ontology (IDO) suite of interoperable ontology modules that together cover the entire infectious disease domain