The COMBREX Project: Design, Methodology, and Initial Results
© 2013 Brian P. et al. Prior to the “genomic era,” when the acquisition of DNA sequence involved significant labor and expense, the sequencing of genes was strongly linked to the experimental characterization of their products. Sequencing at that time directly resulted from the need to understand an experimentally determined phenotype or biochemical activity. Now that DNA sequencing has become orders of magnitude faster and less expensive, focus has shifted to sequencing entire genomes. Since biochemistry and genetics have not, by and large, enjoyed the same improvement of scale, public sequence repositories now predominantly contain putative protein sequences for which there is no direct experimental evidence of function. Computational approaches attempt to leverage evidence associated with the ever-smaller fraction of experimentally analyzed proteins to predict function for these putative proteins. Maximizing our understanding of function over the universe of proteins in toto requires not only robust computational methods of inference but also a judicious allocation of experimental resources, focusing on proteins whose experimental characterization will maximize the number and accuracy of follow-on predictions. COMBREX is funded by a GO grant from the National Institute of General Medical Sciences (NIGMS) (1RC2GM092602-01). Peer Reviewed
The COMBREX project: design, methodology, and initial results.
Experimental data exist for only a vanishingly small fraction of sequenced microbial genes. This community page discusses the progress made by the COMBREX project in addressing this issue using both computational and experimental resources.
Schematic overview of the computational and experimental contributions of COMBREX and its users, and the interrelationships of these contributions.
<p>Data and results specific to COMBREX are shown in boxes. External data imported into COMBREX are also shown, with arrows indicating entry points into the cycle. Methodology employed by COMBREX and its users is shown in blue type, as it is used to generate data. Not shown are two critical contributions to COMBREX: genome and cluster data imported from NCBI RefSeq and ProtClustDB, respectively, and NIH funding, which enables the grants that COMBREX issues to experimental laboratories.</p>
Definitions of COMBREX functional status symbols and fractions of microbial genes in COMBREX in each status category.
<p>Experimentally characterized proteins are <i>green</i>. (Those in the <i>green</i> set that have been manually curated by the GSDB are also marked with a gold “G.”) Proteins with functional predictions but no experimental evidence are <i>blue</i>. Proteins with no available functional predictions are <i>black</i>.</p>