10 research outputs found

    The COMBREX Project: Design, Methodology, and Initial Results

    © 2013 Brian P. et al. Prior to the “genomic era,” when the acquisition of DNA sequence involved significant labor and expense, the sequencing of genes was strongly linked to the experimental characterization of their products. Sequencing at that time directly resulted from the need to understand an experimentally determined phenotype or biochemical activity. Now that DNA sequencing has become orders of magnitude faster and less expensive, focus has shifted to sequencing entire genomes. Since biochemistry and genetics have not, by and large, enjoyed the same improvement of scale, public sequence repositories now predominantly contain putative protein sequences for which there is no direct experimental evidence of function. Computational approaches attempt to leverage evidence associated with the ever-smaller fraction of experimentally analyzed proteins to predict function for these putative proteins. Maximizing our understanding of function over the universe of proteins in toto requires not only robust computational methods of inference but also a judicious allocation of experimental resources, focusing on proteins whose experimental characterization will maximize the number and accuracy of follow-on predictions. COMBREX is funded by a GO grant from the National Institute of General Medical Sciences (NIGMS) (1RC2GM092602-01). Peer Reviewed

    Genome-related datasets within the E.coli


    Version Management for Scientific Databases

    Scientific databases are used to accession objects representing the results of scientific inquiry, such as genes and DNA sequences. These objects must have stable identifiers that can be used as references in scientific papers and other databases. The requirement for stable object identifiers, however, conflicts with the tendency of scientific data to evolve over time. We present in this paper version management facilities that allow scientific databases to achieve a balance between stable object identifiers and evolving data.
    1 Introduction. Scientific databases are increasingly being used to accession objects representing the results of scientific inquiry, such as genes and DNA sequences. Accessioning involves providing these objects with stable identifiers that can be included as references in publications (e.g., journal papers) and other scientific databases. This requirement for stable object identifiers can conflict with the tendency of scientific data to evolve over time. By t..

    BRCA Share: A Collection of Clinical BRCA Gene Variants

    As next-generation sequencing increases access to human genetic variation, the challenge of determining clinical significance of variants becomes ever more acute. Germline variants in the BRCA1 and BRCA2 genes can confer substantial lifetime risk of breast and ovarian cancer. Assessment of variant pathogenicity is a vital part of clinical genetic testing for these genes. A database of clinical observations of BRCA variants is a critical resource in that process. This article describes BRCA Share™, a database created by a unique international alliance of academic centers and commercial testing laboratories. By integrating the content of the Universal Mutation Database generated by the French Unicancer Genetic Group with the testing results of two large commercial laboratories, Quest Diagnostics and Laboratory Corporation of America (LabCorp), BRCA Share™ has assembled one of the largest publicly accessible collections of BRCA variants currently available. Although access is available to academic researchers without charge, commercial participants in the project are required to pay a support fee and contribute their data. The fees fund the ongoing curation effort, as well as planned experiments to functionally characterize variants of uncertain significance. BRCA Share™ databases can therefore be considered as models of successful data sharing between private companies and the academic world.

    The COMBREX project: design, methodology, and initial results.

    Experimental data exists for only a vanishingly small fraction of sequenced microbial genes. This community page discusses the progress made by the COMBREX project to address this important issue using both computational and experimental resources.

    Schematic overview of the computational and experimental contributions of COMBREX and its users, and the interrelationships of these contributions.

    Data and results specific to COMBREX are shown in boxes. External data imported into COMBREX are also shown, with arrows indicating entry points into the cycle. Methodology employed by COMBREX and its users is shown in blue type, as it is used to generate data. Not shown are two critical contributions to COMBREX: genome and cluster data imported from NCBI RefSeq and ProtClustDB, respectively, and NIH funding, which enables the grants that COMBREX issues to experimental laboratories.

    Definitions of COMBREX functional status symbols and fractions of microbial genes in COMBREX in each status category.

    Experimentally characterized proteins are green. (Those in the green set that have been manually curated by the GSDB are also marked with a gold “G.”) Proteins with functional predictions but no experimental evidence are blue. Proteins with no available functional predictions are black.