43 research outputs found

    MisPred: a resource for identification of erroneous protein sequences in public databases

    Correct prediction of the structure of protein-coding genes of higher eukaryotes is still a difficult task; therefore, public databases are heavily contaminated with mispredicted sequences. The high rate of misprediction has serious consequences because it significantly affects the conclusions that may be drawn from genome-scale sequence analyses of eukaryotic genomes. Here we present the MisPred database and computational pipeline that provide efficient means for the identification of erroneous sequences in public databases. The MisPred database contains a collection of abnormal, incomplete and mispredicted protein sequences from 19 metazoan species identified as erroneous by MisPred quality control tools in the UniProtKB/Swiss-Prot, UniProtKB/TrEMBL, NCBI/RefSeq and EnsEMBL databases. Major releases of the database are automatically generated and updated regularly. The database (http://www.mispred.com) is easily accessible through a simple web interface coupled to a powerful query engine and a standard web service. The content is completely or partially downloadable in a variety of formats.

    FixPred: a resource for correction of erroneous protein sequences.

    Protein databases are heavily contaminated with erroneous (mispredicted, abnormal and incomplete) sequences, and these erroneous data significantly distort the conclusions drawn from genome-scale protein sequence analyses. In our earlier work we described the MisPred resource that serves to identify erroneous sequences; here we present the FixPred computational pipeline that automatically corrects sequences identified by MisPred as erroneous. The current version of the associated FixPred database contains corrected UniProtKB/Swiss-Prot and NCBI/RefSeq sequences from Homo sapiens, Mus musculus, Rattus norvegicus, Monodelphis domestica, Gallus gallus, Xenopus tropicalis, Danio rerio, Fugu rubripes, Ciona intestinalis, Branchiostoma floridae, Drosophila melanogaster and Caenorhabditis elegans; future releases of the FixPred database will include corrected sequences of additional metazoan species. The FixPred computational pipeline and database (http://www.fixpred.com) are easily accessible through a simple web interface coupled to a powerful query engine and a standard web service. The content is completely or partially downloadable in a variety of formats. Database URL: http://www.fixpred.com

    K153R polymorphism in myostatin gene increases the rate of promyostatin activation by furin

    Recent studies demonstrated an association of the K153R polymorphism in the myostatin gene with extreme longevity, lower muscle strength and obesity, but the molecular basis of these associations has not been clarified. Here, we show that the K153R mutation significantly increases the rate of proteolysis of promyostatin by furin, but has no effect on the activity of the latent complex or the cleavage of the latent complex by bone morphogenetic protein 1 (BMP-1). The increased rate of activation of K153R mutant promyostatin may explain why this polymorphism is associated with obesity, lower muscle strength and extension of lifespan.

    Exon skipping-rich transcriptomes of animals reflect the significance of exon-shuffling in metazoan proteome evolution

    Animals are known to have higher rates of exon skipping than other eukaryotes. In a recent study, Grau-Bove et al. (Genome Biology 19:135, 2018) used RNA-seq data across 65 eukaryotic species to investigate when and how this high prevalence of exon skipping evolved. They found that bilaterian Metazoa have significantly increased exon skipping frequencies compared to all other eukaryotic groups and that exon skipping in nearly all animals, including non-bilaterians, is strongly enriched for frame-preserving events. The authors hypothesized that the increase of exon skipping rates in animals followed a two-step process. First, exon skipping in early animals became enriched for frame-preserving events. Second, bilaterian ancestors dramatically increased their exon skipping frequencies, likely driven by the interplay between a shift in their genome architectures towards more exon definition and recruitment of frame-preserving exon skipping events to functionally diversify their cell-specific proteomes. Here we offer a different explanation for the higher frequency of frame-preserving exon skipping in Metazoa than in all other eukaryotes. In our view these observations reflect the fact that the majority of multidomain proteins unique to Metazoa and indispensable for metazoan-type multicellularity were assembled by exon-shuffling from 'symmetrical' modules (i.e. modules flanked by introns of the same phase), whereas this type of protein evolution played a minor role in other groups of eukaryotes, including plants. The higher frequency of 'symmetrical' exons in metazoan genomes provides an explanation for the enrichment for frame-preserving events, since skipping or inclusion of 'symmetrical' modules during alternative splicing does not result in a reading-frame shift.
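
The frame-preservation argument in this abstract comes down to arithmetic modulo 3, which a few lines of Python can illustrate. This is a minimal sketch and not code from the paper. It assumes the usual convention that an intron's phase (0, 1 or 2) is the number of bases of the interrupted codon that lie upstream of the intron. The function names are hypothetical.

```python
def exon_length_mod3(upstream_phase: int, downstream_phase: int) -> int:
    """Length (mod 3) of an internal exon, given the phases of its flanking introns.

    An exon that follows a phase-p intron starts at codon position p, so its
    length L must satisfy (p_up + L) % 3 == p_down.
    """
    return (downstream_phase - upstream_phase) % 3


def skipping_preserves_frame(upstream_phase: int, downstream_phase: int) -> bool:
    """Skipping (or including) an exon keeps the reading frame only if the
    exon length is a multiple of three."""
    return exon_length_mod3(upstream_phase, downstream_phase) == 0


# Enumerate all nine phase combinations for an internal exon.
for up in range(3):
    for down in range(3):
        kind = "symmetrical" if up == down else "asymmetrical"
        verdict = "frame kept" if skipping_preserves_frame(up, down) else "frame shifted"
        print(f"phase {up}-{down} ({kind}): {verdict}")
```

Under this convention only the symmetrical classes (0-0, 1-1 and 2-2) have lengths divisible by three, so skipping or including such an exon never shifts the reading frame.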

    Searching for the biological function of human brain trypsin: new strategy

    Our research team was the first to clone human trypsin 4 and express it in a heterologous system, and our group also reported the first crystal structure of this protease. The goal of the present grant was to explore the biological function of human trypsin 4. Considering that this enzyme occurs only in primates, we studied its biological function by means of modern molecular biology, enzymology and cell biology, rather than by those of physiology. Though we cannot yet unambiguously define the biological function(s) of this protease, our research over the last four years led to several discoveries that may provide a good basis for further exploring the physiological or pathological functions of human trypsin 4. These discoveries are as follows: 1) We provided indirect evidence that myelin basic protein (MBP) might be one of the biological substrates of human trypsin 4. 2) From samples of post mortem human brain we isolated and characterized, for the first time, Isoform B of trypsinogen 4 and established that the translation of human trypsinogen 4 can be initiated at a CUG codon with an N-terminal leucine residue. We proposed that this unconventional translation initiation may be a new mechanism to regulate gene expression. 3) The transport of human trypsinogen 4 was explored in astroglia cells, and the possible intracellular site of its activation was determined.

    Morphological Stasis and Proteome Innovation in Cephalochordates

    Lancelets, extant representatives of basal chordates, are prototypic examples of evolutionary stasis; they preserved a morphology and body-plan most similar to the fossil chordates from the early Cambrian. Such a low level of morphological evolution is in harmony with a low rate of amino acid substitution; cephalochordate proteins were shown to evolve slower than those of the slowest evolving vertebrate, the elephant shark. Surprisingly, a study comparing the predicted proteomes of the Chinese amphioxus, Branchiostoma belcheri, and the Florida amphioxus, Branchiostoma floridae, has led to the conclusion that the rate of creation of novel domain combinations is orders of magnitude greater in lancelets than in any other Metazoa, a finding that contradicts the notion that high rates of protein innovation are usually associated with major evolutionary innovations. Our earlier studies on a representative sample of proteins have provided evidence suggesting that the differences in the domain architectures of predicted proteins of these two lancelet species reflect annotation errors, rather than true innovations. In the present work, we have extended these studies to include a larger sample of genes and two additional lancelet species, Asymmetron lucayanum and Branchiostoma lanceolatum. These analyses have confirmed that the domain architecture differences of orthologous proteins of the four lancelet species are due to errors of gene prediction, the error rate in the given species being inversely related to the quality of the transcriptome dataset that was used to aid gene prediction.

    Identification and correction of abnormal, incomplete and mispredicted proteins in public databases

    Background: Despite significant improvements in computational annotation of genomes, sequences of abnormal, incomplete or incorrectly predicted genes and proteins remain abundant in public databases. Since the majority of incomplete, abnormal or mispredicted entries are not annotated as such, these errors seriously affect the reliability of these databases. Here we describe the MisPred approach that may provide an efficient means for the quality control of databases. The current version of the MisPred approach uses five distinct routines for identifying abnormal, incomplete or mispredicted entries based on the principle that a sequence is likely to be incorrect if some of its features conflict with our current knowledge about protein-coding genes and proteins: (i) conflict between the predicted subcellular localization of proteins and the absence of the corresponding sequence signals; (ii) presence of extracellular and cytoplasmic domains and the absence of transmembrane segments; (iii) co-occurrence of extracellular and nuclear domains; (iv) violation of domain integrity; (v) chimeras encoded by two or more genes located on different chromosomes. Results: Analyses of predicted EnsEMBL protein sequences of nine deuterostome (Homo sapiens, Mus musculus, Rattus norvegicus, Monodelphis domestica, Gallus gallus, Xenopus tropicalis, Fugu rubripes, Danio rerio and Ciona intestinalis) and two protostome species (Caenorhabditis elegans and Drosophila melanogaster) have revealed that the absence of expected signal peptides and violation of domain integrity account for the majority of mispredictions. Analyses of sequences predicted by NCBI's GNOMON annotation pipeline show that the rates of mispredictions are comparable to those of EnsEMBL. Interestingly, even the manually curated UniProtKB/Swiss-Prot dataset is contaminated with mispredicted or abnormal proteins, although to a much lesser extent than UniProtKB/TrEMBL or the EnsEMBL or GNOMON-predicted entries. Conclusion: MisPred works efficiently in identifying errors in predictions generated by the most reliable gene prediction tools such as the EnsEMBL and NCBI's GNOMON pipelines and also guides the correction of errors. We suggest that application of the MisPred approach will significantly improve the quality of gene predictions and the associated databases.
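
The five conflict rules listed in this abstract can be pictured as a simple rule-based filter. The sketch below is only an illustration of that idea, not the MisPred implementation. It assumes each protein has already been annotated with boolean features by external tools, and rule (i) is simplified to "extracellular domain present but no signal peptide or transmembrane segment". `ProteinRecord`, its fields and `find_conflicts` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class ProteinRecord:
    """Hypothetical, pre-computed annotations for one predicted protein."""
    accession: str
    has_signal_peptide: bool = False
    has_transmembrane_segment: bool = False
    has_extracellular_domain: bool = False
    has_cytoplasmic_domain: bool = False
    has_nuclear_domain: bool = False
    has_truncated_domain: bool = False  # a domain deviating from its expected size
    source_chromosomes: set = field(default_factory=set)


def find_conflicts(rec: ProteinRecord) -> list:
    """Return the labels (i-v) of the rules that this record violates."""
    conflicts = []
    # (i) localization implied by the domains, but the matching signal is absent
    #     (simplified assumption: extracellular domain needs a signal peptide or TM segment)
    if rec.has_extracellular_domain and not (
        rec.has_signal_peptide or rec.has_transmembrane_segment
    ):
        conflicts.append("i")
    # (ii) extracellular and cytoplasmic domains without a transmembrane segment
    if (
        rec.has_extracellular_domain
        and rec.has_cytoplasmic_domain
        and not rec.has_transmembrane_segment
    ):
        conflicts.append("ii")
    # (iii) co-occurrence of extracellular and nuclear domains
    if rec.has_extracellular_domain and rec.has_nuclear_domain:
        conflicts.append("iii")
    # (iv) violation of domain integrity
    if rec.has_truncated_domain:
        conflicts.append("iv")
    # (v) chimera encoded by genes on different chromosomes
    if len(rec.source_chromosomes) > 1:
        conflicts.append("v")
    return conflicts


# Made-up example record, for illustration only.
example = ProteinRecord(
    accession="EXAMPLE_1",
    has_extracellular_domain=True,
    has_nuclear_domain=True,
    source_chromosomes={"2", "7"},
)
print(example.accession, find_conflicts(example))
```

A record with no conflicts yields an empty list, so a quality-control pass over a database would keep only the entries for which `find_conflicts` returns something.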

    Interaction of Wnt proteins with receptors and antagonists

    We have localized the Wnt-binding site of the WIF-domain of Wnt inhibitory factor-1 by structure-guided arginine-scanning mutagenesis in combination with surface plasmon resonance assays. We have shown that the subregion of the Wnt-binding site defined by Ile25, Phe27 and Phe42 may bind the palmitoyl group of Wnts, whereas the other subregion, defined by residues Tyr13, Trp15 and Leu32, is critical for interactions with amino acid side-chains of Wnts. Our observation that substitution of Tyr13, Trp15 and Leu32 of WIF resulted in an increased affinity for Wnt5a but a decreased affinity for Wnt3a suggests that these residues may define the specificity spectrum of WIF for Wnts. These results hold promise for the more specific targeting of Wnt family members with WIF variants in various forms of cancer.