1,218 research outputs found

    In-silico prediction of disorder content using hybrid sequence representation

    Background: Intrinsically disordered proteins play important roles in various cellular activities, and their prevalence has been implicated in a number of human diseases. Knowledge of the intrinsic disorder content of proteins is useful for a variety of studies, including estimation of the abundance of disorder in protein families, classes, and complete proteomes, and analysis of disorder-related protein functions. These investigations currently rely on disorder content derived from per-residue disorder predictions. We show that these predictions may over- or under-predict the overall amount of disorder, which motivates the development of novel tools for direct and accurate sequence-based prediction of the disorder content. Results: We hypothesize that sequence-level aggregation of input information may provide more accurate content prediction than the content extracted from local, window-based, residue-level disorder predictors. We propose a novel predictor, DisCon, that takes advantage of a small set of 29 custom-designed descriptors that aggregate and hybridize information concerning sequence, evolutionary profiles, and predicted secondary structure, solvent accessibility, flexibility, and annotation of globular domains. Using these descriptors and a ridge regression model, DisCon predicts the content with a low mean squared error (0.05) and a high Pearson correlation (0.68). This is a statistically significant improvement over the content computed from the outputs of ten modern disorder predictors on a test dataset of proteins that share low sequence identity with the training sequences. The proposed predictive model is analyzed to discuss factors related to the prediction of the disorder content. Conclusions: DisCon is a high-quality alternative for high-throughput annotation of the disorder content. We also empirically demonstrate that DisCon's predictions can be used to improve binary annotations of disordered residues derived from the real-valued disorder propensities generated by current residue-level disorder predictors. The web server that implements DisCon is available at http://biomine.ece.ualberta.ca/DisCon/
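
    The abstract above contrasts direct content prediction with content derived from per-residue predictions. As a rough illustration of the derived approach only (not DisCon itself), the sketch below thresholds per-residue propensities and reports the disordered fraction, plus the mean squared error against known content values. The function names and the 0.5 threshold are assumptions made for this example; they do not come from the paper.

```python
from typing import Sequence


def disorder_content(propensities: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of residues whose predicted propensity is at or above the threshold.

    The 0.5 cutoff is an assumed default; real predictors publish their own cutoffs.
    """
    if not propensities:
        raise ValueError("propensities must not be empty")
    disordered = sum(1 for p in propensities if p >= threshold)
    return disordered / len(propensities)


def mean_squared_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean squared error between predicted and reference content values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


if __name__ == "__main__":
    # Hypothetical per-residue propensities for a four-residue chain.
    print(disorder_content([0.9, 0.8, 0.2, 0.1]))
    # Hypothetical predicted vs. reference content for two proteins.
    print(mean_squared_error([0.5, 0.25], [0.25, 0.25]))
```

    Because the content is a count of thresholded residues, small shifts in propensities near the cutoff can change it noticeably, which is one way the over- or under-prediction described in the abstract can arise.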

    CSpritz: accurate prediction of protein disorder segments with annotation for homology, secondary structure and linear motifs

    CSpritz is a web server for the prediction of intrinsic protein disorder. It combines the previous Spritz with two novel orthogonal systems developed by our group (Punch and ESpritz). Punch is based on sequence and structural templates trained with support vector machines. ESpritz is an efficient single-sequence method based on bidirectional recursive neural networks. Spritz was extended to filter predictions based on structural homologues. After extensive testing, predictions are combined by averaging their probabilities. The CSpritz website can process single or multiple predictions for either short or long disorder. The server provides a global output page for download and simultaneous statistics of all predictions. Links are provided to each individual protein, where the amino acid sequence and disorder prediction are displayed along with statistics for that protein. As a novel feature, CSpritz provides information about structural homologues as well as secondary structure and short functional linear motifs in each disordered segment. Benchmarking was performed on the very recent CASP9 data, where CSpritz would have ranked consistently well with a Sw measure of 49.27 and an AUC of 0.828. The server, together with help and methods pages including examples, is freely available at http://protein.bio.unipd.it/cspritz/
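
    The abstract above states that the component predictions are combined by averaging their probabilities. A minimal sketch of that step follows, assuming each predictor returns one probability per residue for the same sequence and that all predictors are weighted equally. The function name and both assumptions are illustrative and are not taken from the CSpritz server.

```python
from typing import List, Sequence


def average_predictions(per_predictor: Sequence[Sequence[float]]) -> List[float]:
    """Average per-residue disorder probabilities across several predictors.

    Each inner sequence holds one predictor's probabilities for the same chain,
    so all inner sequences must have the same length.
    """
    if not per_predictor:
        raise ValueError("at least one predictor output is required")
    length = len(per_predictor[0])
    if any(len(p) != length for p in per_predictor):
        raise ValueError("all predictors must cover the same residues")
    count = len(per_predictor)
    # zip(*...) groups the probabilities residue by residue.
    return [sum(column) / count for column in zip(*per_predictor)]


if __name__ == "__main__":
    # Hypothetical outputs of two predictors for a two-residue chain.
    print(average_predictions([[0.25, 0.75], [0.75, 0.25]]))
```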

    DisPredict: A Predictor of Disordered Protein Using Optimized RBF Kernel

    Intrinsically disordered proteins or regions perform important biological functions through their dynamic conformations during binding. Accurate identification of these disordered regions therefore has significant implications for proper annotation of function, induced fold prediction, and drug design to combat critical diseases. We introduce DisPredict, a disorder predictor that employs a single support vector machine with an RBF kernel and novel features for reliable characterization of protein structure. DisPredict yields effective performance. In addition to 10-fold cross-validation, training and testing of DisPredict were conducted with independent test datasets. The results were consistent, with both training and test error minimal. The use of multiple data sources makes the predictor generic. The datasets used in developing the model include disordered regions of various lengths, categorized as short and long, with different compositions and different types of disorder, ranging from fully to partially disordered regions, as well as completely ordered regions. Through comparison with other state-of-the-art approaches and case studies, DisPredict is found to be a useful tool with competitive performance. DisPredict is available at https://github.com/tamjidul/DisPredict_v1.0

    Unsupervised Integration of Multiple Protein Disorder Predictors: The Method and Evaluation on CASP7, CASP8 and CASP9 Data

    Background: Studies of intrinsically disordered proteins, which lack a stable tertiary structure but still have important biological functions, rely critically on computational methods that predict this property from sequence information. Although a number of fairly successful models for prediction of protein disorder have been developed over the last decade, the quality of their predictions is limited by the available cases of confirmed disorder. Results: To more reliably estimate protein disorder from protein sequences, an iterative algorithm is proposed that integrates predictions of multiple disorder models without relying on any protein sequences with confirmed disorder annotation. The iterative method alternates between the maximum a posteriori (MAP) estimation of the disorder prediction and the maximum-likelihood (ML) estimation of the quality of the multiple disorder predictors. Experiments on data used at CASP7, CASP8, and CASP9 have shown the effectiveness of the proposed algorithm. Conclusions: The proposed algorithm can potentially be used to predict protein disorder and to provide helpful suggestions on choosing suitable disorder predictors for unknown protein sequences.

    MoRFpred, a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins

    Motivation: Molecular recognition features (MoRFs) are short binding regions located within longer intrinsically disordered regions that bind to protein partners via disorder-to-order transitions. MoRFs are implicated in important processes including signaling and regulation. However, only a limited number of experimentally validated MoRFs is known, which motivates development of computational methods that predict MoRFs from protein chains

    Performance of Protein Disorder Prediction Programs on Amino Acid Substitutions

    Many proteins contain intrinsically disordered regions, which may be crucial for function but may also be related to the pathogenicity of variants. Prediction programs have been developed to detect disordered regions from sequences and have been used to predict the consequences of variants, although their performance for this task has not been assessed. We tested the performance of protein disorder prediction programs in detecting changes to disorder caused by amino acid substitutions. We assessed the performance of 29 protein disorder predictors and versions with 101 amino acid substitutions whose effects have been experimentally validated. For variants, the disorder predictors detected true positives with a success rate of at most 6% and true negatives with a rate of 34%. The corresponding rates for the wild-type forms are 7% and 90%, respectively. The analysis revealed that disorder programs cannot reliably predict the effects of substitutions; consequently, the tested methods, and possibly similar programs, cannot be recommended for variant analysis without other information indicating the relevance of disorder. These results inspired us to develop a new method, PON-Diso (http://structure.bmc.lu.se/PON-Diso), for disorder-related amino acid substitutions. With a 50% success rate on an independent test set and a 70.5% rate in cross-validation, it outperforms the evaluated methods.

    DBC1/CCAR2 and CCAR1 Are Largely Disordered Proteins that Have Evolved from One Common Ancestor


    A creature with a hundred waggly tails: intrinsically disordered proteins in the ribosome

    Intrinsic disorder (i.e., lack of a unique 3-D structure) is a common phenomenon, and many biologically active proteins are disordered as a whole or contain long disordered regions. These intrinsically disordered proteins/regions constitute a significant part of all proteomes, and their functional repertoire is complementary to the functions of ordered proteins. In fact, intrinsic disorder represents an important driving force for many specific functions. An illustrative example of such a disorder-centric functional class is RNA-binding proteins. In this study, we present the results of comprehensive bioinformatics analyses of the abundance and roles of intrinsic disorder in 3,411 ribosomal proteins from 32 species. We show that many ribosomal proteins are intrinsically disordered or hybrid proteins that contain ordered and disordered domains. Predicted globular domains of many ribosomal proteins contain noticeable regions of intrinsic disorder. We also show that disorder in ribosomal proteins differs in its characteristics from that in other proteins that interact with RNA and DNA, including overall abundance, evolutionary conservation, and involvement in protein–protein interactions. Furthermore, we demonstrate that intrinsic disorder is not only abundant in the ribosomal proteins but also absolutely necessary for their various functions.

    Stellar and Interstellar Origins of Meteoritic Nanodiamonds

    In 1987 presolar grains were first isolated from meteorites, opening up a new line of data about the stars that produced them. Based on anomalies in isotopic ratios, identification and classification of presolar grains has borne great fruit in understanding nucleosynthesis, stellar evolution, and mass loss from the stellar objects in which these grains originated: primarily, but not exclusively, supernovae and asymptotic giant branch stars. Meteoritic nanodiamonds were the first type of presolar grain identified, but more than three decades later, their origins remain unclear. Anomalies in the ratios of Xe isotopes carried by the nanodiamonds suggest the nanodiamonds formed from supernova material, but, measured in bulk, the ratios of 12C/13C and 14N/15N are consistent with formation in the solar system. Nanodiamonds are ~3 nm in diameter and contain only a few thousand atoms each, such that it is impossible to measure the isotopic ratios of single grains with traditional techniques. A multi-part experimental approach has allowed me to investigate the origins of meteoritic nanodiamonds. I use statistical studies with nanoscale secondary ion mass spectrometry of thousands of small aggregates of nanodiamonds to put upper limits on the fraction of them that can have non-solar ratios of the stable isotopes 12C and 13C and to detect isotopically anomalous statistical outliers. I also continue a collaborative work to measure the ratio of 12C/13C in individual nanodiamonds. This work adapts the experimental technique of atom-probe tomography from materials science to presolar grain research, and to that end my collaborators and I have worked extensively to mature the experimental procedures. I use focused ion beam sample preparation and correlated secondary and transmission electron microscopy to characterize samples before and after atom-probe isotopic analysis. These studies characterize the likelihood of various origins for individual and small clusters of nanodiamonds and accompanying disordered C, based on ratios of 12C/13C isotopes. The results are consistent with solar system formation for most nanodiamonds, although they do not necessarily rule out a large fraction of supernova grains with isotopic anomalies averaging close to the solar system value. The data suggest that a small subset of nanodiamonds have large isotopic enrichments in 13C relative to 12C. Supernovae are favored due to their production of the Xe isotopes, although J-star or novae could also produce this isotopic anomaly.