9 research outputs found

    The COMBREX Project: Design, Methodology, and Initial Results

    © 2013 Brian P. et al. Prior to the “genomic era,” when the acquisition of DNA sequence involved significant labor and expense, the sequencing of genes was strongly linked to the experimental characterization of their products. Sequencing at that time directly resulted from the need to understand an experimentally determined phenotype or biochemical activity. Now that DNA sequencing has become orders of magnitude faster and less expensive, focus has shifted to sequencing entire genomes. Since biochemistry and genetics have not, by and large, enjoyed the same improvement of scale, public sequence repositories now predominantly contain putative protein sequences for which there is no direct experimental evidence of function. Computational approaches attempt to leverage evidence associated with the ever-smaller fraction of experimentally analyzed proteins to predict function for these putative proteins. Maximizing our understanding of function over the universe of proteins in toto requires not only robust computational methods of inference but also a judicious allocation of experimental resources, focusing on proteins whose experimental characterization will maximize the number and accuracy of follow-on predictions. COMBREX is funded by a GO grant from the National Institute of General Medical Sciences (NIGMS) (1RC2GM092602-01). Peer Reviewed.

    Thousands of missed genes found in bacterial genomes and their analysis with COMBREX

    The dramatic reduction in the cost of sequencing has allowed many researchers to join in the effort of sequencing and annotating prokaryotic genomes. Annotation methods vary considerably and may fail to identify some genes. Here we draw attention to a large number of likely genes missing from annotations using common tools such as Glimmer and BLAST. By analyzing 1,474 prokaryotic genome annotations in GenBank, we identify 13,602 likely missed genes that are homologs to non-hypothetical proteins, and 11,792 likely missed genes that are homologs only to hypothetical proteins, yet have supporting evidence of their protein-coding nature from COMBREX, a newly created gene function database. We also estimate the likelihood that each potential missing gene found is a genuine protein-coding gene using COMBREX. Our analysis of the causes of missed genes suggests that larger annotation centers tend to produce annotations with fewer missed genes than smaller centers, and many of the missed genes are short genes (<300 bp). Over 1,000 of the likely missed genes could be associated with phenotype information available in COMBREX. Of these, 359 genes, found in pathogenic organisms, may be potential targets for pharmaceutical research. The newly identified genes are available on COMBREX’s website. https://doi.org/10.1186/1745-6150-7-3

    Aging gene signature of memory CD8+ T cells is associated with neurocognitive functioning in Alzheimer’s disease

    Background: Memory CD8+ T cells expand with age. We previously demonstrated an age-associated expansion of effector memory (EM) CD8+ T cells expressing low levels of IL-7 receptor alpha (IL-7Rαlow) and the presence of its gene signature (i.e., IL-7Rαlow aging genes) in peripheral blood of older adults without Alzheimer’s disease (AD). Considering age as the strongest risk factor for AD and the recent finding of EM CD8+ T cell expansion, mostly IL-7Rαlow cells, in AD, we investigated whether subjects with AD have alterations in IL-7Rαlow aging gene signature, especially in relation to genes possibly associated with AD and disease severity.
    Results: We identified a set of 29 candidate genes (i.e., putative AD genes) which could be differentially expressed in peripheral blood of patients with AD through the systematic search of publicly available datasets. Of the 29 putative AD genes, 9 genes (31%) were IL-7Rαlow aging genes (P < 0.001), suggesting the possible implication of IL-7Rαlow aging genes in AD. These findings were validated by RT-qPCR analysis of 40 genes, including 29 putative AD genes, additional 9 top IL-7Rαlow aging but not the putative AD genes, and 2 inflammatory control genes in peripheral blood of cognitively normal persons (CN, 38 subjects) and patients with AD (40 mild cognitive impairment and 43 dementia subjects). The RT-qPCR results showed 8 differentially expressed genes between AD and CN groups; five (62.5%) of which were top IL-7Rαlow aging genes (FGFBP2, GZMH, NUAK1, PRSS23, TGFBR3) not previously reported to be altered in AD. Unbiased clustering analysis revealed 3 clusters of dementia patients with distinct expression levels of the 40 analyzed genes, including IL-7Rαlow aging genes, which were associated with neurocognitive function as determined by MoCA, CDRsob and neuropsychological testing.
    Conclusions: We report differential expression of “normal” aging genes associated with IL-7Rαlow EM CD8+ T cells in peripheral blood of patients with AD, and the significance of such gene expression in clustering subjects with dementia due to AD into groups with different levels of cognitive functioning. These results provide a platform for studies investigating the possible implications of age-related immune changes, including those associated with CD8+ T cells, in AD.

    The COMBREX project: design, methodology, and initial results.

    Experimental data exists for only a vanishingly small fraction of sequenced microbial genes. This community page discusses the progress made by the COMBREX project to address this important issue using both computational and experimental resources.

    Definitions of COMBREX functional status symbols and fractions of microbial genes in COMBREX in each status category.

    Experimentally characterized proteins are green. (Those in the green set that have been manually curated by the GSDB are also marked with a gold “G.”) Proteins with functional predictions but no experimental evidence are blue. Proteins with no available functional predictions are black.

    Schematic overview of the computational and experimental contributions of COMBREX and its users, and the interrelationships of these contributions.

    Data and results specific to COMBREX are shown in boxes. External data imported into COMBREX are also shown, with arrows indicating entry points into the cycle. Methodology employed by COMBREX and its users is shown in blue type, as it is used to generate data. Not shown are two critical contributions to COMBREX: genome and cluster data imported from NCBI RefSeq and ProtClustDB, respectively, and NIH funding, which enables the grants that COMBREX issues to experimental laboratories.