951 research outputs found

    CDD: specific functional annotation with the Conserved Domain Database

    NCBI's Conserved Domain Database (CDD) is a collection of multiple sequence alignments and derived database search models, which represent protein domains conserved in molecular evolution. The collection can be accessed at http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml, and is also part of NCBI's Entrez query and retrieval system, cross-linked to numerous other resources. CDD provides annotation of domain footprints and conserved functional sites on protein sequences. Precalculated domain annotation can be retrieved for protein sequences tracked in NCBI's Entrez system, and CDD's collection of models can be queried with novel protein sequences via the CD-Search service at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi. Starting with the latest version of CDD, v2.14, information from redundant and homologous domain models is summarized at a superfamily level, and domain annotation on proteins is flagged as either ‘specific’ (identifying molecular function with high confidence) or as ‘non-specific’ (identifying superfamily membership only)

    Genomic and transcriptional analysis of protein heterogeneity of the honeybee venom allergen Api m 6

    Several components of honeybee venom are known to cause allergenic responses in humans and other vertebrates. One such component, the minor allergen Api m 6, has been known to show amino acid variation but the genetic mechanism for this variation is unknown. Here we show that Api m 6 is derived from a single locus, and that substantial protein-level variation has a simple genome-level cause, without the need to invoke multiple loci or alternatively spliced exons. Api m 6 sits near a misassembled section of the honeybee genome sequence, and we propose that a substantial number of indels at and near Api m 6 might be the root cause of this misassembly. We suggest that genes such as Api m 6 with coding-region or untranslated region indels might have had a strong effect on the assembly of this draft of the honeybee genome

    CDD: a Conserved Domain Database for the functional annotation of proteins

    NCBI’s Conserved Domain Database (CDD) is a resource for the annotation of protein sequences with the location of conserved domain footprints, and functional sites inferred from these footprints. CDD includes manually curated domain models that make use of protein 3D structure to refine domain models and provide insights into sequence/structure/function relationships. Manually curated models are organized hierarchically if they describe domain families that are clearly related by common descent. As CDD also imports domain family models from a variety of external sources, it is a partially redundant collection. To simplify protein annotation, redundant models and models describing homologous families are clustered into superfamilies. By default, domain footprints are annotated with the corresponding superfamily designation, on top of which specific annotation may indicate high-confidence assignment of family membership. Pre-computed domain annotation is available for proteins in the Entrez/Protein dataset, and a novel interface, Batch CD-Search, allows the computation and download of annotation for large sets of protein queries. CDD can be accessed via http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml

    The Phyre2 web portal for protein modeling, prediction and analysis

    Phyre2 is a suite of tools available on the web to predict and analyze protein structure, function and mutations. The focus of Phyre2 is to provide biologists with a simple and intuitive interface to state-of-the-art protein bioinformatics tools. Phyre2 replaces Phyre, the original version of the server for which we previously published a paper in Nature Protocols. In this updated protocol, we describe Phyre2, which uses advanced remote homology detection methods to build 3D models, predict ligand binding sites and analyze the effect of amino acid variants (e.g., nonsynonymous SNPs (nsSNPs)) for a user's protein sequence. Users are guided through results by a simple interface at a level of detail they determine. This protocol will guide users from submitting a protein sequence to interpreting the secondary and tertiary structure of their models, their domain composition and model quality. A range of additional available tools is described to find a protein structure in a genome, to submit a large number of sequences at once and to automatically run weekly searches for proteins that are difficult to model. The server is available at http://www.sbg.bio.ic.ac.uk/phyre2. A typical structure prediction will be returned between 30 min and 2 h after submission

    Anaerobic Carbon Monoxide Dehydrogenase Diversity in the Homoacetogenic Hindgut Microbial Communities of Lower Termites and the Wood Roach

    Anaerobic carbon monoxide dehydrogenase (CODH) is a key enzyme in the Wood-Ljungdahl (acetyl-CoA) pathway for acetogenesis performed by homoacetogenic bacteria. Acetate generated by gut bacteria via the acetyl-CoA pathway provides considerable nutrition to wood-feeding dictyopteran insects, making CODH important to the obligate mutualism occurring between termites and their hindgut microbiota. To investigate CODH diversity in insect gut communities, we developed the first degenerate primers designed to amplify cooS genes, which encode the catalytic (β) subunit of anaerobic CODH enzyme complexes. These primers target over 68 million combinations of potential forward and reverse cooS primer-binding sequences. We used the primers to identify cooS genes in bacterial isolates from the hindgut of a phylogenetically lower termite and to sample cooS diversity present in a variety of insect hindgut microbial communities, including those of three phylogenetically lower termites (Zootermopsis nevadensis, Reticulitermes hesperus, and Incisitermes minor), a wood-feeding cockroach (Cryptocercus punctulatus), and an omnivorous cockroach (Periplaneta americana). In total, we sequenced and analyzed 151 different cooS genes. These genes encode proteins that group within one of three highly divergent CODH phylogenetic clades. Each insect gut community contained CODH variants from all three of these clades. The patterns of CODH diversity in these communities likely reflect differences in enzyme or physiological function, and suggest that a diversity of microbial species participate in homoacetogenesis in these communities
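
    The combination count quoted for the degenerate primers can be illustrated with a small sketch. It assumes the count is the product of the forward and reverse primer degeneracies under standard IUPAC nucleotide codes; the abstract does not state how the figure was derived, and the primer strings in the usage note are hypothetical placeholders, not the published cooS primers.

```python
from math import prod

# Standard IUPAC nucleotide codes mapped to the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences a degenerate primer can represent."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def pair_combinations(forward: str, reverse: str) -> int:
    """Forward/reverse sequence combinations (assumed: product of degeneracies)."""
    return degeneracy(forward) * degeneracy(reverse)
```

    For example, a hypothetical forward primer "RYN" has degeneracy 2 x 2 x 4 = 16, and pairing it with a hypothetical reverse primer "NN" (degeneracy 16) gives 256 combinations.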

    Threshold Average Precision (TAP-k): a measure of retrieval designed for bioinformatics

    Motivation: Since database retrieval is a fundamental operation, the measurement of retrieval efficacy is critical to progress in bioinformatics. This article points out some issues with current methods of measuring retrieval efficacy and suggests some improvements. In particular, many studies have used the pooled receiver operating characteristic for n irrelevant records (ROCn) score, the area under the ROC curve (AUC) of a ‘pooled’ ROC curve, truncated at n irrelevant records. Unfortunately, the pooled ROCn score does not faithfully reflect actual usage of retrieval algorithms. Additionally, a pooled ROCn score can be very sensitive to retrieval results from as little as a single query
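
    As a rough illustration of the score discussed in this abstract, the sketch below computes a pooled ROCn value using the commonly cited formulation: the sum, over the first n irrelevant records, of the number of relevant records ranked above each one, divided by n times the total number of relevant records. The function names, the input layout, and the handling of lists with fewer than n irrelevant records are assumptions made for illustration and are not taken from the article.

```python
def rocn(ranked_labels, n, total_relevant=None):
    """ROCn for a ranked list of booleans (True = relevant record).

    Assumption: if fewer than n irrelevant records are present, each missing
    one is credited with all relevant records retrieved so far.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if total_relevant is None:
        total_relevant = sum(1 for label in ranked_labels if label)
    if total_relevant == 0:
        return 0.0

    relevant_seen = 0
    irrelevant_seen = 0
    area = 0
    for is_relevant in ranked_labels:
        if is_relevant:
            relevant_seen += 1
        else:
            irrelevant_seen += 1
            area += relevant_seen  # relevant records ranked above this one
            if irrelevant_seen == n:
                break
    area += (n - irrelevant_seen) * relevant_seen  # pad a short list
    return area / (n * total_relevant)


def pooled_rocn(hits_per_query, n):
    """Pool (score, is_relevant) hits from all queries, rank by score, score once."""
    pooled = [hit for hits in hits_per_query for hit in hits]
    pooled.sort(key=lambda hit: hit[0], reverse=True)
    return rocn([is_relevant for _, is_relevant in pooled], n)
```

    Because all queries share one ranked list, a single query that returns many high-scoring irrelevant records can use up the first n irrelevant slots on its own, which is one way to see the sensitivity to a single query that the abstract mentions.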

    SHARPIN Negatively Associates with TRAF2-Mediated NFκB Activation

    NFκB is an inducible transcription factor controlled by two principal signaling cascades and plays pivotal roles in diverse physiological processes including inflammation, apoptosis, oncogenesis, immunity, and development. Activation of NFκB signaling was detected in skin of SHARPIN-deficient mice and can be diminished by an NFκB inhibitor. However, in vitro studies demonstrated that SHARPIN activates NFκB signaling by forming a linear ubiquitin chain assembly complex with RNF31 (HOIP) and RBCK1 (HOIL1). The inconsistency between in vivo and in vitro findings about SHARPIN's role in NFκB activation could be partially due to SHARPIN's potential interactions with downstream molecules of the NFκB pathway. In this study, 17 anti-flag immunoprecipitated proteins, including TRAF2, were identified by mass spectrometry analysis among Sharpin-Flag transfected mouse fibroblasts, B lymphocytes, and BALB/c LN stroma 12 cells, suggesting their interaction with SHARPIN. Interaction between SHARPIN and TRAF2 confirmed previous yeast two-hybrid reports that SHARPIN was one of TRAF2's partners. Furthermore, luciferase-based NFκB reporter assays demonstrated that SHARPIN negatively associates with NFκB activation, which can be partly compensated by over-expression of TRAF2. These data suggested that, in addition to activating NFκB signaling by forming a ubiquitin ligase complex with RNF31 and RBCK1, SHARPIN may also negatively associate with NFκB activation via interactions with other NFκB members, such as TRAF2

    Large introns in relation to alternative splicing and gene evolution: a case study of Drosophila bruno-3

    Background: Alternative splicing (AS) of maturing mRNA can generate structurally and functionally distinct transcripts from the same gene. Recent bioinformatic analyses of available genome databases inferred a positive correlation between intron length and AS. To study the interplay between intron length and AS empirically and in more detail, we analyzed the diversity of alternatively spliced transcripts (ASTs) in the Drosophila RNA-binding Bruno-3 (Bru-3) gene. This gene was known to encode thirteen exons separated by introns of diverse sizes, ranging from 71 to 41,973 nucleotides in D. melanogaster. Although Bru-3's structure is expected to be conducive to AS, only two ASTs of this gene were previously described. Results: Cloning of RT-PCR products of the entire ORF from four species representing three diverged Drosophila lineages provided an evolutionary perspective, high sensitivity, and long-range contiguity of splice choices currently unattainable by high-throughput methods. Consequently, we identified three new exons, a new exon fragment and thirty-three previously unknown ASTs of Bru-3. All exon-skipping events in the gene were mapped to the exons surrounded by introns of at least 800 nucleotides, whereas exons split by introns of less than 250 nucleotides were always spliced contiguously in mRNA. Cases of exon loss and creation during Bru-3 evolution in Drosophila were also localized within large introns. Notably, we identified a true de novo exon gain: exon 8 was created along the lineage of the obscura group from intronic sequence between cryptic splice sites conserved among all Drosophila species surveyed. Exon 8 was included in mature mRNA by the species representing all the major branches of the obscura group. To our knowledge, the origin of exon 8 is the first documented case of exonization of intronic sequence outside vertebrates. 
    Conclusion: We found that large introns can promote AS via exon-skipping and exon turnover during evolution likely due to frequent errors in their removal from maturing mRNA. Large introns could be a reservoir of genetic diversity, because they have a greater number of mutable sites than short introns. Taken together, gene structure can constrain and/or promote gene evolution