21 research outputs found

    CIPRO 2.5: Ciona intestinalis protein database, a unique integrated repository of large-scale omics data, bioinformatic analyses and curated annotation, with user rating and reviewing functionality

    The Ciona intestinalis protein database (CIPRO) is an integrated protein database for the tunicate species C. intestinalis. The database is unique in two respects: first, because of its phylogenetic position, Ciona is a suitable model for understanding vertebrate evolution; and second, the database includes original large-scale transcriptomic and proteomic data. Ciona intestinalis has also been a favorite of developmental biologists. Therefore, large amounts of data exist on its development and morphology, along with a recent genome sequence and gene expression data. The CIPRO database aims to collect those published data and to provide unique information from unpublished experimental data, such as 3D expression profiling, 2D-PAGE and mass spectrometry-based large-scale analyses at various developmental stages, curated annotation data and various bioinformatic data, to facilitate research in diverse areas, including developmental, comparative and evolutionary biology. For medical and evolutionary research, homologs in humans and major model organisms are intentionally included. The current database is based on a recently developed KH model containing 36 034 unique sequences, but for higher usability it covers all 89 683 known and predicted proteins from all gene models for this species. Of these sequences, more than 10 000 proteins have been manually annotated. Furthermore, to establish a community-supported protein database, these annotations are open to evaluation by users through the CIPRO website. CIPRO 2.5 is freely accessible at http://cipro.ibio.jp/2.5.

    CIPRO 2.5: Ciona intestinalis Protein integrated database with large-scale omics data, bioinformatic analyses and curated annotation, with ability for user rating and comments

    The CIPRO database is an integrated protein database for the tunicate species Ciona intestinalis, which belongs to the Urochordata. Although the CIPRO database deals with proteomic and transcriptomic data of a single species, the animal is considered unique in the evolutionary tree, representing a possible origin of the vertebrates, and is a good model for understanding chordate evolution, including that of humans. Furthermore, C. intestinalis has been one of the favorites of developmental biologists; a huge amount of knowledge has accumulated on its development and morphology, in addition to the recent genome sequence and gene expression data. The CIPRO database aims not only to collect published data, but also to present unique information, including unpublished transcriptomic and proteomic data and human-curated annotation, for use by researchers in broad fields of biology and bioinformatics.

    reconstruction

    doi:10.1093/nar/gkm321 KAAS: an automatic genome annotation and pathwa

    Computational Survey of Sequence Specificity for Protein Terminal Tags Covering Nine Organisms and Its Application to Protein Identification

    In 1998, Wilkins et al. (J. Mol. Biol. 1998, 278, 599–608) reported high specificity in terminal regions (terminal tags) of 15 519 proteins from five organisms and proposed a methodology for identifying proteins by terminal tags. However, their examined sequence data were not based on complete genome sequences. Here, we examined current proteome data (217 249 entries from UniProt 2013_6 complete/reference proteome for nine organisms including human) in terms of the specificity of terminal tags and their computational annotation. One example from the results indicated that the specificity of N-terminal tags plateaued at 28% at a length of six residues for human; even when using both N- and C-terminal tags, specificity was merely 66%. In order to determine the cause of these low specificities, the annotation of proteins sharing terminal tags with other proteins was examined. The results suggested that a large majority were phylogenetically or functionally related, whereas nonrelated proteins sharing terminal tags made up less than 1% of human proteome data. On the basis of these findings, we constructed the terminal tag sequence database ProteinCarta (http://ms3d.jp/software/proteincarta/), which includes all terminal tags of proteomes from the nine organisms analyzed here, in order to confirm the specificity of terminal tags and to identify the parent protein.
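
    The notion of terminal-tag specificity in the abstract above can be illustrated with a small sketch. It assumes that "specificity" means the fraction of proteins whose terminal tag of a given length is shared with no other protein in the set; the abstract does not spell out the exact definition, so this is an illustrative reading rather than the authors' method. The function name `terminal_tag_specificity` and the toy sequences are hypothetical.

```python
from collections import Counter


def terminal_tag_specificity(sequences, length, terminus="N"):
    """Fraction of sequences whose terminal tag is unique within the set.

    Assumption (not from the abstract): a tag counts as "specific" when
    no other sequence in the set carries the same tag.
    terminus: "N" (first residues), "C" (last residues), or "both".
    """
    if length < 1:
        raise ValueError("length must be a positive integer")
    if terminus not in ("N", "C", "both"):
        raise ValueError("terminus must be 'N', 'C' or 'both'")

    def tag(seq):
        n_tag, c_tag = seq[:length], seq[-length:]
        if terminus == "N":
            return n_tag
        if terminus == "C":
            return c_tag
        return (n_tag, c_tag)  # combined N- and C-terminal tag

    tags = [tag(seq) for seq in sequences]
    if not tags:
        return 0.0
    counts = Counter(tags)
    unique = sum(1 for t in tags if counts[t] == 1)
    return unique / len(tags)


# Toy, made-up sequences purely for illustration.
toy_proteome = ["MKTAYI", "MKTAYL", "MSGQQQ", "MALWMR"]
n_specificity = terminal_tag_specificity(toy_proteome, 3, "N")
c_specificity = terminal_tag_specificity(toy_proteome, 3, "C")
both_specificity = terminal_tag_specificity(toy_proteome, 3, "both")
```

    In the toy set, two sequences share the N-terminal tag `MKT`, so only half of the entries are uniquely identified by their N-terminus, while adding the C-terminal tag separates them.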

    KEGG OC: a large-scale automatic construction of taxonomy-based ortholog clusters.

    The identification of orthologous genes in an increasing number of fully sequenced genomes is a challenging issue in recent genome science. Here we present KEGG OC (http://www.genome.jp/tools/oc/), a novel database of ortholog clusters (OCs). The current version of KEGG OC contains 1 176 030 OCs, obtained by clustering 8 357 175 genes in 2112 complete genomes (153 eukaryotes, 1830 bacteria and 129 archaea). The OCs were constructed by applying the quasi-clique-based clustering method to all possible protein-coding genes in all complete genomes, based on their amino acid sequence similarities. The OCs are computationally efficient to calculate, which allows the contents to be updated regularly. KEGG OC has the following two features: (i) it consists of all complete genomes of a wide variety of organisms from three domains of life, and the number of organisms is the largest among the existing databases; and (ii) it is compatible with the KEGG database by sharing the same sets of genes and identifiers, which leads to seamless integration of OCs with useful components in KEGG such as biological pathways, pathway modules, functional hierarchy, diseases and drugs. The KEGG OC resources are accessible via OC Viewer, which provides an interactive visualization of OCs at different taxonomic levels.
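
    As a rough illustration of the quasi-clique idea mentioned in the abstract above, the sketch below checks whether a group of genes is densely connected in a similarity graph. It assumes a density-based definition (a group is a quasi-clique when the fraction of member pairs linked by a similarity edge reaches a threshold `gamma`); the abstract does not give the definition or threshold KEGG OC actually uses, and the names `edge_density`, `is_quasi_clique` and the gene labels are hypothetical.

```python
from itertools import combinations


def edge_density(members, similar_pairs):
    """Fraction of member pairs that are connected by a similarity edge.

    similar_pairs: set of frozenset({gene_a, gene_b}) for pairs judged
    similar (how similarity is scored is outside this sketch).
    """
    members = list(members)
    n = len(members)
    if n < 2:
        return 1.0  # a single gene is trivially fully connected
    possible = n * (n - 1) // 2
    present = sum(
        1 for a, b in combinations(members, 2)
        if frozenset((a, b)) in similar_pairs
    )
    return present / possible


def is_quasi_clique(members, similar_pairs, gamma=0.8):
    """True when the group's edge density is at least gamma (assumed cutoff)."""
    return edge_density(members, similar_pairs) >= gamma


# Made-up similarity edges between four hypothetical genes.
pairs = {frozenset(p) for p in [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]}
dense_group = is_quasi_clique(["a", "b", "c"], pairs)        # all 3 pairs linked
loose_group = is_quasi_clique(["a", "b", "c", "d"], pairs)   # 4 of 6 pairs linked
```

    Relaxing the full-clique requirement in this way tolerates a few missing similarity edges inside a cluster, which is the general motivation for quasi-clique methods.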