
    Two new ArrayTrack libraries for personalized biomedical research

    Background: Recent advances in high-throughput genotyping technology are paving the way for research in personalized medicine and nutrition. However, most of the genetic markers identified from association studies account for a small contribution to the total risk/benefit of the studied phenotypic trait. Testing whether the candidate genes identified by association studies are causal is critically important to the development of personalized medicine and nutrition. An efficient data mining strategy and a set of sophisticated tools are necessary to help better understand and utilize the findings from genetic association studies. Description: SNP (single nucleotide polymorphism) and QTL (quantitative trait locus) libraries were constructed and incorporated into ArrayTrack, with user-friendly interfaces and powerful search features. Data from several public repositories were collected in the SNP and QTL libraries and connected to other domain libraries (genes, proteins, metabolites, and pathways) in ArrayTrack. Linking the data sets within ArrayTrack allows searching of SNP and QTL data as well as their relationships to other biological molecules. The SNP library includes approximately 15 million human SNPs and their annotations, while the QTL library contains publicly available QTLs identified in mouse, rat, and human. The QTL library was developed for finding the overlap between the map position of a candidate or metabolic gene and QTLs from these species. Two use cases were included to demonstrate the utility of these tools. The SNP and QTL libraries are freely available to the public through ArrayTrack at http://www.fda.gov/ArrayTrack. Conclusions: These libraries developed in ArrayTrack contain comprehensive information on SNPs and QTLs and are further cross-linked to other libraries. Connecting domain-specific knowledge is a cornerstone of systems biology strategies and allows for a better understanding of the genetic and biological context of the findings from genetic association studies.

    T1DBase: update 2011, organization and presentation of large-scale data sets for type 1 diabetes research

    T1DBase (http://www.t1dbase.org) is a web platform that supports the type 1 diabetes (T1D) community. It integrates genetic, genomic and expression data relevant to T1D research across mouse, rat and human and presents this to the user as a set of web pages and tools. This update describes the incorporation of new data sets, tools and curation efforts as well as a new website design to simplify site use. New data sets include curated summary data from four genome-wide association studies relevant to T1D; HaemAtlas, a data set and tool to query gene expression levels in haematopoietic cells; and a manually curated table of human T1D susceptibility loci, incorporating genetic overlap with other related diseases. These developments will continue to support T1D research and allow easy access to large and complex T1D-relevant data sets.

    PhenoGO: an integrated resource for the multiscale mining of clinical and biological data

    The evolving complexity of genome-scale experiments has increasingly centralized the role of a highly computable, accurate, and comprehensive resource spanning multiple biological scales and viewpoints. To provide a resource to meet this need, we have significantly extended the PhenoGO database with gene-disease specific annotations and included an additional ten species. This computationally derived resource is primarily intended to provide phenotypic context (cell type, tissue, organ, and disease) for mining existing associations between gene products and GO terms specified in the Gene Ontology databases. Automated natural language processing (BioMedLEE) and computational ontology (PhenOS) methods were used to derive these relationships from the literature, expanding the database with information from ten additional species to include over 600,000 phenotypic contexts spanning eleven species from five GO annotation databases. A comprehensive evaluation of the mappings (n = 300) found precision (positive predictive value) at 85%, and recall (sensitivity) at 76%. Phenotypes are encoded in general purpose ontologies such as Cell Ontology, the Unified Medical Language System, and in specialized ontologies such as the Mouse Anatomy and the Mammalian Phenotype Ontology. A web portal has also been developed, allowing for advanced filtering and querying of the database as well as download of the entire dataset.

    genenames.org: the HGNC resources in 2011

    The HUGO Gene Nomenclature Committee (HGNC) aims to assign a unique gene symbol and name to every human gene. The HGNC database currently contains almost 30 000 approved gene symbols, over 19 000 of which represent protein-coding genes. The public website, www.genenames.org, displays all approved nomenclature within Symbol Reports that contain data curated by HGNC editors and links to related genomic, phenotypic and proteomic information. Here we describe improvements to our resources, including a new Quick Gene Search, a new List Search, an integrated HGNC BioMart and a new Statistics and Downloads facility.

    Disease Ontology: a backbone for disease semantic integration

    The Disease Ontology (DO) database (http://disease-ontology.org) represents a comprehensive knowledge base of 8043 inherited, developmental and acquired human diseases (DO version 3, revision 2510). The DO web browser has been designed for speed, efficiency and robustness through the use of a graph database. Full-text contextual searching functionality using Lucene allows the querying of name, synonym, definition, DOID and cross-reference (xrefs) with complex Boolean search strings. The DO semantically integrates disease and medical vocabularies through extensive cross-mapping and integration of MeSH, ICD, NCI's thesaurus, SNOMED CT and OMIM disease-specific terms and identifiers. The DO is utilized for disease annotation by major biomedical databases (e.g. Array Express, NIF, IEDB), as a standard representation of human disease in biomedical ontologies (e.g. IDO, Cell line ontology, NIFSTD ontology, Experimental Factor Ontology, Influenza Ontology), and as an ontological cross-mapping resource between DO, MeSH and OMIM (e.g. GeneWiki). The DO project (http://diseaseontology.sf.net) has been incorporated into open source tools (e.g. Gene Answers, FunDO) to connect gene and disease biomedical data through the lens of human disease. The next iteration of the DO web browser will integrate DO's extended relations and logical definition representation along with these biomedical resource cross-mappings.

    IUPHAR-DB: new receptors and tools for easy searching and visualization of pharmacological data

    The IUPHAR database is an established online reference resource for several important classes of human drug targets and related proteins. As well as providing recommended nomenclature, the database integrates information on the chemical, genetic, functional and pathophysiological properties of receptors and ion channels, curated and peer-reviewed from the biomedical literature by a network of experts. The database now includes information on 616 gene products from four superfamilies in human and rodent model organisms: G protein-coupled receptors, voltage- and ligand-gated ion channels and, in a recent update, 49 nuclear hormone receptors (NHRs). New data types for NHRs include details on co-regulators, DNA binding motifs, target genes and 3D structures. Other recent developments include curation of the chemical structures of approximately 2000 ligand molecules, providing electronic descriptors, identifiers, link-outs and calculated molecular properties, all available via enhanced ligand pages. The interface now provides intelligent tools for the visualization and exploration of ligand structure-activity relationships and the structural diversity of compounds active at each target. The database is freely available at http://www.iuphar-db.org

    GeneWeaver: a web-based system for integrative functional genomics

    High-throughput genome technologies have produced a wealth of data on the association of genes and gene products with biological functions. Investigators have discovered value in combining their experimental results with published genome-wide association studies, quantitative trait locus, microarray, RNA-sequencing and mutant phenotyping studies to identify gene-function associations across diverse experiments, species, conditions, behaviors or biological processes. These experimental results are typically derived from disparate data repositories, publication supplements or reconstructions from primary data stores. This leaves bench biologists with the complex and unscalable task of integrating data by identifying and gathering relevant studies, reanalyzing primary data, unifying gene identifiers and applying ad hoc computational analysis to the integrated set. The freely available GeneWeaver (http://www.GeneWeaver.org), powered by the Ontological Discovery Environment, is a curated repository of genomic experimental results with an accompanying tool set for dynamic integration of these data sets, enabling users to interactively address questions about sets of biological functions and their relations to sets of genes. Thus, large numbers of independently published genomic results can be organized into new conceptual frameworks driven by the underlying, inferred biological relationships rather than a pre-existing semantic framework. An empirical ‘ontology’ is discovered from the aggregate of experimental knowledge around user-defined areas of biological inquiry.

    A Gene Wiki for Community Annotation of Gene Function

    This manuscript describes the creation of a comprehensive gene wiki, seeded with data from public domain sources, which will enable and encourage community annotation of gene function.

    The UCSC Genome Browser Database: update 2009

    The UCSC Genome Browser Database (GBD, http://genome.ucsc.edu) is a publicly available collection of genome assembly sequence data and integrated annotations for a large number of organisms, including extensive comparative-genomic resources. In the past year, 13 new genome assemblies have been added, including two important primate species, orangutan and marmoset, bringing the total to 46 assemblies for 24 different vertebrates and 39 assemblies for 22 different invertebrate animals. The GBD datasets may be viewed graphically with the UCSC Genome Browser, which uses a coordinate-based display system allowing users to juxtapose a wide variety of data. These data include all mRNAs from GenBank mapped to all organisms, RefSeq alignments, gene predictions, regulatory elements, gene expression data, repeats, SNPs and other variation data, as well as pairwise and multiple-genome alignments. A variety of other bioinformatics tools are also provided, including BLAT, the Table Browser, the Gene Sorter, the Proteome Browser, VisiGene and Genome Graphs.