33 research outputs found

    Manual GO annotation of predictive protein signatures: the InterPro approach to GO curation

    InterPro amalgamates predictive protein signatures from a number of well-known partner databases into a single resource. To aid with interpretation of results, InterPro entries are manually annotated with terms from the Gene Ontology (GO). The InterPro2GO mappings comprise the cross-references between these two resources and are the largest source of GO annotation predictions for proteins. Here, we describe the protocol by which InterPro curators integrate GO terms into the InterPro database. We discuss the unique challenges involved in integrating specific GO terms with entries that may describe a diverse set of proteins, and we illustrate, with examples, how InterPro hierarchies reflect GO terms of increasing specificity. We describe a revised protocol for GO mapping that enables us to assign GO terms to domains based on the function of the individual domain, rather than the function of the families in which the domain is found. We also discuss how taxonomic constraints are dealt with and those cases where we are unable to add any appropriate GO terms. Expert manual annotation of InterPro entries with GO terms enables users to infer function, process or subcellular information for uncharacterized sequences based on sequence matches to predictive models.

    The InterPro protein families database: the classification resource after 15 years.

    The InterPro database (http://www.ebi.ac.uk/interpro/) is a freely available resource that can be used to classify sequences into protein families and to predict the presence of important domains and sites. Central to the InterPro database are predictive models, known as signatures, from a range of different protein family databases that have different biological focuses and use different methodological approaches to classify protein families and domains. InterPro integrates these signatures, capitalizing on the respective strengths of the individual databases, to produce a powerful protein classification resource. Here, we report on the status of InterPro as it enters its 15th year of operation, and give an overview of new developments with the database and its associated Web interfaces and software. In particular, the new domain architecture search tool is described and the process of mapping of Gene Ontology terms to InterPro is outlined. We also discuss the challenges faced by the resource given the explosive growth in sequence data in recent years. InterPro (version 48.0) contains 36 766 member database signatures integrated into 26 238 InterPro entries, an increase of over 3993 entries (5081 signatures), since 2012

    Characterization of a small cryptic plasmid from endophytic Pantoea agglomerans and its use in the construction of an expression vector

    A circular cryptic plasmid named pPAGA (2,734 bp) was isolated from Pantoea agglomerans strain EGE6 (an endophytic bacterial isolate from eucalyptus). Sequence analysis revealed that the plasmid has a G+C content of 51% and contains four potential ORFs, 238(A), 250(B), 131(C), and 129(D) amino acids in length without homology to known proteins. The shuttle vector pLGM1 was constructed by combining the pPAGA plasmid with pGFPmut3.0 (which harbors a gene encoding green fluorescent protein, GFP), and the resulting construct was used to over-express GFP in E. coli and P. agglomerans cells. GFP production was used to monitor the colonization of strain EGE6gfp in various plant tissues by fluorescence microscopy. Analysis of EGE6gfp colonization showed that 14 days after inoculation, the strain occupied the inner tissue of Eucalyptus grandis roots, preferentially colonizing the xylem vessels of the host plants

    InterPro in 2017-beyond protein family and domain annotations

    InterPro (http://www.ebi.ac.uk/interpro/) is a freely available database used to classify protein sequences into families and to predict the presence of important domains and sites. InterProScan is the underlying software that allows both protein and nucleic acid sequences to be searched against InterPro's predictive models, which are provided by its member databases. Here, we report recent developments with InterPro and its associated software, including the addition of two new databases (SFLD and CDD), and the functionality to include residue-level annotation and prediction of intrinsic disorder. These developments enrich the annotations provided by InterPro, increase the overall number of residues annotated and allow more specific functional inferences

    InterPro in 2011: new developments in the family and domain prediction database

    InterPro (http://www.ebi.ac.uk/interpro/) is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine their potential function. InterPro has utility in the large-scale analysis of whole genomes and meta-genomes, as well as in characterizing individual protein sequences. Herein we give an overview of new developments in the database and its associated software since 2009, including updates to database content, curation processes and Web and programmatic interfaces.

    Fibroblasts Express Immune Relevant Genes and Are Important Sentinel Cells during Tissue Damage in Rainbow Trout (Oncorhynchus mykiss)

    Fibroblasts have been shown to be an immune-competent cell type in mammals. However, little is known about the immunological functions of this cell type in lower vertebrates. A rainbow trout hypodermal fibroblast cell line (RTHDF) was shown to be responsive to PAMPs and DAMPs after stimulation with LPS from E. coli and with supernatant and debris from sonicated RTHDF cells. LPS was overall the strongest inducer of IL-1β, IL-8, IL-10, TLR-3 and TLR-9. IL-1β and IL-8 were already highly upregulated after 1 hour of LPS stimulation. Supernatant stimuli significantly increased the expression of IL-1β, TLR-3 and TLR-9, whereas the debris stimuli only increased expression of IL-1β. Consequently, an in vivo experiment was set up. By mechanically damaging the muscle tissue of rainbow trout, it was shown that fibroblasts in the muscle tissue contribute to eliciting a highly local inflammatory response following tissue injury. The damaged muscle tissue showed a strong increase in the expression of the immune genes IL-1β, IL-8 and TGF-β as early as 4 hours post injury at the site of injury, while the expression in non-damaged muscle tissue was not influenced. A weaker but significant response was also seen for TLR-9 and TLR-22. Rainbow trout fibroblasts were found to be highly immune competent, with a significant ability to express cytokines and immune receptors. Thus fish fibroblasts are believed to contribute significantly to local inflammatory reactions in concert with the traditional immune cells.

    InterPro in 2019: improving coverage, classification and access to protein sequence annotations

    The InterPro database (http://www.ebi.ac.uk/interpro/) classifies protein sequences into families and predicts the presence of functionally important domains and sites. Here, we report recent developments with InterPro (version 70.0) and its associated software, including an 18% growth in the size of the database in terms of new InterPro entries, updates to content, the inclusion of an additional entry type, refined modelling of discontinuous domains, and the development of a new programmatic interface and website. These developments extend and enrich the information provided by InterPro, and provide greater flexibility in terms of data access. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB, and discuss how our evaluation of residue coverage may help guide future curation activities.