21 research outputs found

    Synonymous Constraint Elements Show a Tendency to Encode Intrinsically Disordered Protein Segments

    Synonymous constraint elements (SCEs) are protein-coding genomic regions with very low synonymous mutation rates believed to carry additional, overlapping functions. Thousands of such potentially multi-functional elements were recently discovered by analyzing the levels and patterns of evolutionary conservation in human coding exons. These elements provide a good opportunity to improve our understanding of how the redundant nature of the genetic code is exploited in the cell. Our premise is that the protein segments encoded by such elements might better comply with the increased functional demands if they are structurally less constrained (i.e. intrinsically disordered). To test this idea, we investigated the protein segments encoded by SCEs with computational tools to describe the underlying structural properties. In addition to SCEs, we examined the level of disorder, secondary structure, and sequence complexity of protein regions overlapping with experimentally validated splice regulatory sites. We show that multi-functional gene regions translate into protein segments that are significantly enriched in structural disorder and compositional bias, while they are depleted in secondary structure and domain annotations compared to reference segments of similar lengths. This tendency suggests that relaxed protein structural constraints provide an advantage when accommodating multiple overlapping functions in coding regions. © 2014 Macossay-Castillo et al

    Computational analysis of translational readthrough proteins in Drosophila and yeast reveals parallels to alternative splicing

    In translational readthrough (TR) the ribosome continues extending the nascent protein beyond the first in-frame termination codon. Due to the lack of dedicated analyses of eukaryotic TR cases, the associated functional-evolutionary advantages are still unclear. Here, based on a variety of computational methods, we describe the structural and functional properties of previously proposed D. melanogaster and S. cerevisiae TR proteins and extensions. We found that in D. melanogaster TR affects long proteins in mainly regulatory roles. Their TR-extensions are structurally disordered and rich in binding motifs, which, together with their cell-type- and developmental-stage-dependent inclusion, suggest that, similarly to alternatively spliced exons, they rewire cellular interaction networks in a temporally and spatially controlled manner. In contrast, yeast TR proteins are rather short and fulfil mainly housekeeping functions, like translation. Yeast extensions usually lack disorder and linear motifs, which precludes elucidating their functional relevance with sufficient confidence. Therefore we propose that by being much more restricted and by lacking clear functional hallmarks in yeast as opposed to fruit fly, TR shows remarkable parallels with alternative splicing. Additionally, the lack of conservation of TR extensions among orthologous TR proteins suggests that TR-mediated functions may be generally specific to lower taxonomic levels. © The Author(s) 2016

    The Balancing Act of Intrinsically Disordered Proteins

    Intrinsically disordered proteins (IDPs) or regions (IDRs) perform diverse cellular functions, but are also prone to forming promiscuous and potentially deleterious interactions. We investigate the extent to which the properties of, and content in, IDRs have adapted to enable functional diversity while limiting interference from promiscuous interactions in the crowded cellular environment. Information on protein sequences, their predicted intrinsic disorder, and 3D structure contents is related to data on protein cellular concentrations, gene co-expression, and protein-protein interactions in the well-studied yeast Saccharomyces cerevisiae. Results reveal that both the protein IDR content and the frequency of "sticky" amino acids in IDRs (those more frequently involved in protein interfaces) decrease with increasing protein cellular concentration. This implies that the IDR content and the amino acid composition of IDRs experience negative selection as the protein concentration increases. In the S. cerevisiae protein-protein interaction network, the higher a protein's IDR content, the more frequently it interacts with IDR-containing partners, and the more functionally diverse the partners are. Employing a clustering analysis of Gene Ontology terms, we newly identify ~600 putative multifunctional proteins in S. cerevisiae. Strikingly, these proteins are enriched in IDRs and contribute significantly to all the observed trends. In particular, IDRs of multi-functional proteins feature more sticky amino acids than IDRs of their non-multifunctional counterparts, or the surfaces of structured yeast proteins. This property likely affords sufficient binding affinity for the functional interactions, commonly mediated by short IDR segments, thereby counterbalancing the loss in overall IDR conformational entropy upon binding

    Quantification of Intrinsically Disordered Proteins: A Problem Not Fully Appreciated

    Protein quantification is essential in a great variety of biochemical assays, yet the inherent systematic errors associated with the concentration determination of intrinsically disordered proteins (IDPs) using classical methods are hardly appreciated. Routinely used assays for protein quantification, such as the Bradford assay or ultraviolet absorbance at 280 nm, usually seriously misestimate the concentrations of IDPs due to their distinct and variable amino acid composition. Therefore, dependable method(s) have to be worked out/adopted for this task. By comparison to elemental analysis as the gold standard, we show through the example of four globular proteins and nine IDPs that the ninhydrin assay and the commercial Qubit(TM) Protein Assay provide reliable data on IDP quantity. However, as IDPs can show extreme variation in amino acid composition and physical features not necessarily covered by our examples, even these techniques should only be used for IDPs following standardization. The far-reaching implications of these simple observations are demonstrated through two examples: (i) circular dichroism spectrum deconvolution, and (ii) receptor-ligand affinity determination. These actual comparative examples illustrate the potential errors that can be incorporated into the biophysical parameters of IDPs, due to systematic misestimation of their concentration. This leads to inaccurate description of IDP functions

    DisProt: intrinsic protein disorder annotation in 2020

    The Database of Protein Disorder (DisProt, URL: https://disprot.org) provides manually curated annotations of intrinsically disordered proteins from the literature. Here we report recent developments with DisProt (version 8), including the doubling of protein entries, a new disorder ontology, improvements of the annotation format and a completely new website. The website includes a redesigned graphical interface, a better search engine, a clearer API for programmatic access and a new annotation interface that integrates text mining technologies. The new entry format provides greater flexibility, simplifies maintenance and allows the capture of more information from the literature. The new disorder ontology has been formalized and made interoperable by adopting the OWL format, and its structure and term definitions have been improved. The new annotation interface has made the curation process faster and more effective. We recently showed that new DisProt annotations can be effectively used to train and validate disorder predictors. We believe the growth of DisProt will accelerate, contributing to the improvement of function and disorder predictors and thereby helping to illuminate the ‘dark’ proteome.

    Characterization of dry spells over the Africa domain as simulated by the Canadian Regional Climate Model (MRCC5)

    The effects of climate change on the frequency and intensity of precipitation will have a direct impact on dry spells and, consequently, on economic sectors such as agriculture. In this study, the ability of the Canadian Regional Climate Model (MRCC5) to simulate the characteristics of dry spells is evaluated for four precipitation thresholds: 0.5 mm, 1 mm, 2 mm and 3 mm. These characteristics include the number of dry days, the number of dry spells, and the maximum number of consecutive days without precipitation associated with a 5-year return period. Results are presented as annual and seasonal means. The performance error is evaluated by comparing MRCC5 driven by ERA-Interim with GPCP analysis data for the present climate (1997-2008). The boundary-condition error, that is, the error arising from driving MRCC5 with CanESM2 rather than with ERA-Interim, and the added value of MRCC5 relative to CanESM2 are also analysed. The same characteristics are further analysed in a changing-climate context for two future periods, 2041-2070 and 2071-2100, using MRCC5 driven by the CanESM2 general circulation model, as well as CanESM2 itself, under the RCP 4.5 scenario. The results suggest that MRCC5 driven by ERA-Interim tends to overestimate the annual mean number of dry days and the maximum number of consecutive days without precipitation associated with a 5-year return period over most regions of Africa, and tends to underestimate the number of dry spells. In general, the performance error is larger than the boundary-condition error for the various dry-spell characteristics.
    For the equatorial regions, the changes projected by MRCC5 driven by CanESM2 for the two future periods (2041-2070 and 2071-2100) suggest a significant increase in the number of dry days and in the maximum number of consecutive days without precipitation associated with a 5-year return period. A significant decrease in the number of dry spells is also projected.
    AUTHOR KEYWORDS: Regional climate model, Climate change, Dry days, Number of dry spells, Low-recurrence event, Africa
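    As an illustration of the dry-spell characteristics named in this abstract, the following is a minimal sketch and not the thesis's own code. It assumes that a dry day is a day with precipitation strictly below the threshold and that a dry spell is any run of one or more consecutive dry days; the study may use different conventions (for example a minimum spell length). The function name `dry_spell_stats` and the sample series are hypothetical, and the 5-year return-period estimate is not computed here.

```python
from typing import Dict, List


def dry_spell_stats(daily_precip_mm: List[float], threshold_mm: float) -> Dict[str, int]:
    """Count dry days, dry spells and the longest dry spell in a daily series.

    Assumptions (not taken from the source): a day is dry when its
    precipitation is strictly below threshold_mm, and a dry spell is a run of
    at least one consecutive dry day.
    """
    dry_days = 0   # total number of dry days
    spells = 0     # number of separate runs of dry days
    longest = 0    # length of the longest run
    run = 0        # length of the run currently being scanned
    for precip in daily_precip_mm:
        if precip < threshold_mm:
            dry_days += 1
            run += 1
            if run == 1:          # a new spell starts on its first dry day
                spells += 1
            longest = max(longest, run)
        else:
            run = 0               # a wet day ends the current spell
    return {"dry_days": dry_days, "dry_spells": spells, "longest_spell": longest}


# Hypothetical nine-day precipitation series in mm/day.
series = [0.0, 0.2, 5.0, 0.0, 1.5, 0.0, 0.0, 0.0, 3.0]
for threshold in (0.5, 1.0, 2.0, 3.0):  # the four thresholds listed in the abstract
    print(threshold, dry_spell_stats(series, threshold))
```

    Raising the threshold can merge neighbouring spells, which is one reason the number of dry spells and the number of dry days need not move in the same direction.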

    Critical assessment of protein intrinsic disorder prediction

    Abstract: Intrinsically disordered proteins, defying the traditional protein structure–function paradigm, are a challenge to study experimentally. Because a large part of our knowledge rests on computational predictions, it is crucial that their accuracy is high. The Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment was established as a community-based blind test to determine the state of the art in prediction of intrinsically disordered regions and the subset of residues involved in binding. A total of 43 methods were evaluated on a dataset of 646 proteins from DisProt. The best methods use deep learning techniques and notably outperform physicochemical methods. The top disorder predictor has Fmax = 0.483 on the full dataset and Fmax = 0.792 following filtering out of bona fide structured regions. Disordered binding regions remain hard to predict, with Fmax = 0.231. Interestingly, computing times among methods can vary by up to four orders of magnitude
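    The Fmax values quoted in this abstract can be read as the best F1 score (harmonic mean of precision and recall) obtained over all score thresholds. The sketch below illustrates that idea on pooled per-residue scores. It is an assumption-laden example, not the CAID evaluation code: the function name `f_max`, the pooling of residues, and the toy data are all hypothetical, and the actual assessment protocol may differ in details such as per-protein averaging.

```python
from typing import List, Tuple


def f_max(scores: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (best F1, threshold) over all candidate thresholds.

    A residue is predicted disordered when its score is >= the threshold.
    labels holds 1 for disordered residues and 0 for ordered ones.
    """
    best_f, best_t = 0.0, 0.0
    for t in sorted(set(scores)):  # every observed score is a candidate threshold
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        if tp == 0:
            continue  # precision and recall are both zero or undefined
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f:
            best_f, best_t = f1, t
    return best_f, best_t


# Toy example: six residues with predictor scores and reference labels.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
best_f, best_t = f_max(scores, labels)
print(f"Fmax = {best_f:.3f} at threshold {best_t}")
```

    Because the threshold is chosen after seeing the labels, Fmax describes the best achievable operating point of a predictor rather than its performance at a fixed default cutoff.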

    The alternative translation start site within BRCA1 translates into a mostly disordered protein segment.

    The CDS fragment corresponding to residues 275–310 in the canonical BRCA1 isoform is presented in a light blue box at the top, with a validated alternative translation start site (ATSS) highlighted in yellow. The domain map of the canonical isoform is shown below the CDS with the domains coloured purple (residue boundaries assigned based on the UniProtKB) and the region surrounding the mentioned ATSS marked by darker grey. The protein segment in question is enlarged from the domain map and the identified SCEs and predicted structural properties are indicated below by dark blue bars, as explained for Figure 3 (http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003607#pcbi-1003607-g003).

    DNA-level secondary functions in coding regions: The case of the HOXA2 gene.

    The homeobox protein Hox-A2 is represented by a light grey bar, with its sole domain (homeobox) and antp-type motif colored purple (residue boundaries assigned based on the UniProtKB annotation) and its SCE-overlapping N-terminal region marked by dark grey. The CDS corresponding to this segment is shown above the domain map in a light blue box with the region of multi-functionality (a HOX-PBX responsive element) highlighted in yellow. The corresponding peptide sequence is presented in a purple box with the precise locations of detected SCEs, predicted disordered regions, low sequence complexity segments and secondary structure elements (H – helix, E – extended) represented as dark blue bars below the protein sequence. B) The enhancer-rich region corresponding to residues 261–313 of the same Hox protein is presented in a similar fashion as in panel A.