DisProt: intrinsic protein disorder annotation in 2020
The Database of Protein Disorder (DisProt, URL: https://disprot.org) provides manually curated annotations of intrinsically disordered proteins from the literature. Here we report recent developments with DisProt (version 8), including the doubling of protein entries, a new disorder ontology, improvements of the annotation format and a completely new website. The website includes a redesigned graphical interface, a better search engine, a clearer API for programmatic access and a new annotation interface that integrates text mining technologies. The new entry format provides greater flexibility, simplifies maintenance and allows the capture of more information from the literature. The new disorder ontology has been formalized and made interoperable by adopting the OWL format, and its structure and term definitions have been improved. The new annotation interface has made the curation process faster and more effective. We recently showed that new DisProt annotations can be effectively used to train and validate disorder predictors. We believe the growth of DisProt will accelerate, contributing to the improvement of function and disorder predictors and thereby helping to illuminate the ‘dark’ proteome.
Critical assessment of protein intrinsic disorder prediction
Intrinsically disordered proteins, defying the traditional protein structure–function paradigm, are a challenge to study experimentally. Because a large part of our knowledge rests on computational predictions, it is crucial that their accuracy is high. The Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment was established as a community-based blind test to determine the state of the art in prediction of intrinsically disordered regions and the subset of residues involved in binding. A total of 43 methods were evaluated on a dataset of 646 proteins from DisProt. The best methods use deep learning techniques and notably outperform physicochemical methods. The top disorder predictor has Fmax = 0.483 on the full dataset and Fmax = 0.792 following filtering out of bona fide structured regions. Disordered binding regions remain hard to predict, with Fmax = 0.231. Interestingly, computing times among methods can vary by up to four orders of magnitude.
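For readers unfamiliar with the metric, the Fmax values quoted in this abstract are maximum F-scores over decision thresholds. The following is a minimal sketch of that idea, not the CAID evaluation code: the function name `f_max`, the pooling of all residues into one list, and the convention that a residue is called disordered when its score is at or above the threshold are all assumptions made for illustration.

```python
from typing import Sequence


def f_max(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Return the maximum F1 score over all candidate thresholds.

    scores: per-residue predicted disorder scores (hypothetical input layout).
    labels: per-residue reference annotation, 1 = disordered, 0 = structured.
    A residue is predicted disordered when score >= threshold (assumption).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    best = 0.0
    # Every distinct score is a candidate threshold.
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
        if tp == 0:
            # Precision and recall are both zero (or undefined); skip.
            continue
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        best = max(best, f1)
    return best
```

Under these assumptions, a predictor that ranks every disordered residue above every structured one reaches Fmax = 1, while values such as the 0.231 reported for binding regions mean that no threshold achieves both high precision and high recall.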
Ubiquitin Interacting Motifs: Duality Between Structured and Disordered Motifs
Ubiquitin is a small protein at the heart of many cellular processes, and several different protein domains are known to recognize and bind ubiquitin. A common motif for interaction with ubiquitin is the Ubiquitin Interacting Motif (UIM), characterized by a conserved sequence signature and often found in multi-domain proteins. Multi-domain proteins with intrinsically disordered regions mediate interactions with multiple partners, orchestrating diverse pathways. Short linear motifs for binding are often embedded in these disordered regions and play crucial roles in modulating protein function. In this work, we investigated the structural propensities of UIMs using molecular dynamics simulations and NMR chemical shifts. Although X-ray crystallography depicts UIMs as stable helical structures, we show that UIMs feature both helical and intrinsically disordered conformations. Our results shed light on a new class of disordered UIMs. This group is here exemplified by the C-terminal domain of one isoform of ataxin-3 and a group of ubiquitin-specific proteases. Intriguingly, UIMs do not only bind ubiquitin: they can also be a recruitment point for other interactors, such as parkin and the heat shock protein Hsc70-4. Disordered UIMs can provide versatility and new functions to the client proteins, opening new directions for research on their interactome.
DisProt: intrinsic protein disorder annotation in 2020
Other funding: European Regional Development Fund [POCI-01-0145-FEDER-031173, POCI-01-0145-FEDER-029221]; ICREA-Academia 2015.