2,195 research outputs found

    RETINOBASE: a web database, data mining and analysis platform for gene expression data on retina

    Background: The retina is a multi-layered sensory tissue that lines the back of the eye and acts at the interface of input light and visual perception. Its main function is to capture photons and convert them into electrical impulses that travel along the optic nerve to the brain where they are turned into images. It consists of neurons, nourishing blood vessels and different cell types, of which neural cells predominate. Defects in any of these cells can lead to a variety of retinal diseases, including age-related macular degeneration, retinitis pigmentosa, Leber congenital amaurosis and glaucoma. Recent progress in genomics and microarray technology provides extensive opportunities to examine alterations in retinal gene expression profiles during development and diseases. However, there is no specific database that deals with retinal gene expression profiling. In this context we have built RETINOBASE, a dedicated microarray database for retina.

    Description: RETINOBASE is a microarray relational database, analysis and visualization system that allows simple yet powerful queries to retrieve information about gene expression in retina. It provides access to gene expression meta-data and offers significant insights into gene networks in retina, resulting in better hypothesis framing for biological problems that can subsequently be tested in the laboratory. Public and proprietary data are automatically analyzed with 3 distinct methods, RMA, dChip and MAS5, then clustered using 2 different K-means and 1 mixture models method. Thus, RETINOBASE provides a framework to compare these methods and to optimize the retinal data analysis. RETINOBASE has three different modules, "Gene Information", "Raw Data System Analysis" and "Fold change system Analysis" that are interconnected in a relational schema, allowing efficient retrieval and cross comparison of data. Currently, RETINOBASE contains datasets from 28 different microarray experiments performed in 5 different model systems: drosophila, zebrafish, rat, mouse and human. The database is supported by a platform that is designed to easily integrate new functionalities and is also frequently updated.

    Conclusion: The results obtained from various biological scenarios can be visualized, compared and downloaded. The results of a case study are presented that highlight the utility of RETINOBASE. Overall, RETINOBASE provides efficient access to the global expression profiling of retinal genes from different organisms under various conditions.
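
The RETINOBASE abstract mentions K-means clustering of normalized expression data. As a rough illustration of that general technique only (not RETINOBASE's own implementation, which the abstract does not describe), the following minimal sketch runs a plain K-means loop over toy expression profiles. The gene names, expression values, and starting centroids are all hypothetical.

```python
import math

def kmeans(profiles, centroids, max_iterations=10):
    """Plain K-means (Lloyd's algorithm) on gene expression profiles.

    profiles:  dict mapping a gene name to a list of expression values
    centroids: list of starting centroid vectors (an assumption of this
               sketch; real tools usually pick them at random)
    Returns a list of clusters, each a list of gene names.
    """
    clusters = []
    for _ in range(max_iterations):
        # Assignment step: attach each gene to its nearest centroid.
        clusters = [[] for _ in centroids]
        for gene, values in profiles.items():
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(values, centroids[i]),
            )
            clusters[nearest].append(gene)

        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if not members:
                new_centroids.append(centroids[i])  # keep empty cluster in place
                continue
            columns = zip(*(profiles[g] for g in members))
            new_centroids.append([sum(c) / len(members) for c in columns])

        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return clusters

# Hypothetical log-scale expression values across three conditions.
profiles = {
    "geneA": [1.0, 1.1, 0.9],
    "geneB": [1.2, 0.9, 1.0],
    "geneC": [5.0, 5.2, 4.9],
    "geneD": [4.8, 5.1, 5.0],
}
clusters = kmeans(profiles, centroids=[profiles["geneA"], profiles["geneC"]])
print(clusters)
```

With these toy values the low-expression and high-expression genes fall into separate clusters. Comparing such groupings across differently normalized inputs is the kind of method comparison the abstract alludes to.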

    Ontology-based knowledge representation of experiment metadata in biological data mining

    According to the PubMed resource from the U.S. National Library of Medicine, over 750,000 scientific articles have been published in the ~5000 biomedical journals worldwide in the year 2007 alone. The vast majority of these publications include results from hypothesis-driven experimentation in overlapping biomedical research domains. Unfortunately, the sheer volume of information being generated by the biomedical research enterprise has made it virtually impossible for investigators to stay aware of the latest findings in their domain of interest, let alone to be able to assimilate and mine data from related investigations for purposes of meta-analysis. While computers have the potential for assisting investigators in the extraction, management and analysis of these data, information contained in the traditional journal publication is still largely unstructured, free-text descriptions of study design, experimental application and results interpretation, making it difficult for computers to gain access to the content of what is being conveyed without significant manual intervention. In order to circumvent these roadblocks and make the most of the output from the biomedical research enterprise, a variety of related standards in knowledge representation are being developed, proposed and adopted in the biomedical community. In this chapter, we will explore the current status of efforts to develop minimum information standards for the representation of a biomedical experiment, ontologies composed of shared vocabularies assembled into subsumption hierarchical structures, and extensible relational data models that link the information components together in a machine-readable and human-useable framework for data mining purposes
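
To make the idea of a "subsumption hierarchical structure" concrete, here is a minimal sketch of an is-a hierarchy with a transitive subsumption check. The terms and parent links are hypothetical illustrations and are not taken from any real ontology or from the chapter itself.

```python
# Hypothetical is-a links: each term maps to its direct parent terms.
IS_A = {
    "photoreceptor cell": ["neuron"],
    "neuron": ["cell"],
    "macrophage": ["leukocyte"],
    "leukocyte": ["cell"],
}

def ancestors(term, is_a=IS_A):
    """Return every term reachable from `term` by following is-a links."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen

def subsumes(general, specific, is_a=IS_A):
    """True if `general` is `specific` itself or one of its ancestors."""
    return general == specific or general in ancestors(specific, is_a)

print(subsumes("cell", "photoreceptor cell"))
print(subsumes("neuron", "macrophage"))
```

A query for experiments annotated with a general term can use such a check to also retrieve experiments annotated with any more specific term, which is one reason shared hierarchies help metadata mining.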

    NCBI GEO: mining millions of expression profiles—database and tools

    The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) is the largest fully public repository for high-throughput molecular abundance data, primarily gene expression data. The database has a flexible and open design that allows the submission, storage and retrieval of many data types. These data include microarray-based experiments measuring the abundance of mRNA, genomic DNA and protein molecules, as well as non-array-based technologies such as serial analysis of gene expression (SAGE) and mass spectrometry proteomic technology. GEO currently holds over 30 000 submissions representing approximately half a billion individual molecular abundance measurements, for over 100 organisms. Here, we describe recent database developments that facilitate effective mining and visualization of these data. Features are provided to examine data from both experiment- and gene-centric perspectives using user-friendly Web-based interfaces accessible to those without computational or microarray-related analytical expertise. The GEO database is publicly accessible through the World Wide Web at http://www.ncbi.nlm.nih.gov/geo

    Comparative evaluation of microarray-based gene expression databases

    Microarrays make it possible to monitor the expression of thousands of genes in parallel, thus generating huge amounts of data. Several databases have been developed for managing and analyzing this kind of data, but the state of the art in this field is still at an early stage. In this paper, we comprehensively analyze the requirements for microarray data management. We consider the various kinds of data involved as well as data preparation, integration and analysis needs. The identified requirements are then used to comparatively evaluate eight existing microarray databases described in the literature. In addition to providing an overview of the current state of the art, we identify problems that should be addressed in the future to obtain better solutions for managing and analyzing microarray data.

    ToxoDB: an integrated Toxoplasma gondii database resource

    ToxoDB (http://ToxoDB.org) is a genome and functional genomic database for the protozoan parasite Toxoplasma gondii. It incorporates the sequence and annotation of the T. gondii ME49 strain, as well as genome sequences for the GT1, VEG and RH (Chr Ia, Chr Ib) strains. Sequence information is integrated with various other genomic-scale data, including community annotation, ESTs, gene expression and proteomics data. ToxoDB has matured significantly since its initial release. Here we outline the numerous updates with respect to the data and increased functionality available on the website

    Emerging model species driven by transcriptomics

    This work focuses on 'emerging model species', i.e. question-driven model species that have sufficient molecular resources to investigate a specific phenomenon in molecular biology, developmental biology, molecular ecology and evolution, or related molecular fields. This thesis shows how transcriptomic data can be generated, analyzed, and used to investigate such phenomena of interest even in species lacking a reference genome. The initial ButterflyBase resource has proven useful to researchers working on species without a reference genome, but it is limited to the Lepidoptera and supports only the older Sanger sequencing technologies. Thanks to Next Generation Sequencing, transcriptome sequencing is more cost-effective, but the bottleneck of transcriptomic projects is now the bioinformatic analysis and data mining/dissemination. This work therefore presents novel approaches that overcome this bottleneck. The est2assembly software produces deeply annotated reference transcriptomes stored in the Chado database. The Drupal Bioinformatic Server Framework and genes4all provide a species-neutral approach to building standardized online databases and associated web services. All public insect mRNA data were analyzed with est2assembly and genes4all to produce InsectaCentral. With InsectaCentral, a powerful resource is now available to assist molecular biology in any question-driven model insect species. The software presented here was developed according to the specifications of the General Model Organism Database (GMOD) community. All software specifications are species-neutral and can be deployed to assist any research community. Furthermore, a chapter of case studies shows that the transcriptomic approach is more cost-effective than a genomic approach, and that sequence-driven evolutionary biology will therefore benefit sooner from it.

    The Genopolis Microarray Database

    Background: Gene expression databases are key resources for microarray data management and analysis, and the importance of proper annotation of their content is well understood. Public repositories exist, as do microarray database systems that can be implemented by single laboratories. However, there is not yet a tool that can easily support a collaborative environment in which users with different access rights can interact to define a common, highly coherent content. The aim of the Genopolis database is to provide a resource that allows different groups performing microarray experiments on a common subject to create a common coherent knowledge base and to analyse it. The Genopolis database has been implemented as a dedicated system for the scientific community studying dendritic cell and macrophage functions and host-parasite interactions.

    Results: The Genopolis Database system allows the community to build an object-based, MIAME-compliant annotation of their experiments and to store images, raw data and processed data from the Affymetrix GeneChip® platform. It supports dynamic definition of controlled vocabularies and provides automated and supervised steps to control the coherence of data and annotations. It allows precise control of the visibility of the database content to different subgroups in the community and facilitates export of its content to public repositories. It provides an interactive user interface for data analysis: this allows users to visualize data matrices based on functional lists and sample characterization, and to navigate to other data matrices defined by similarity of expression values as well as by functional characterization of the genes involved. A collaborative environment is also provided for the definition and sharing of functional annotation by users.

    Conclusion: The Genopolis Database supports a community in building a common coherent knowledge base and analysing it. This fills a gap between a local database and a public repository, where the development of a common coherent annotation is important. In its current implementation, it provides a uniform, coherently annotated dataset on dendritic cell and macrophage differentiation.