
    Length-dependent prediction of protein intrinsic disorder

    BACKGROUND: Due to the functional importance of intrinsically disordered proteins or protein regions, prediction of intrinsic protein disorder from amino acid sequence has become an area of active research as witnessed in the 6th experiment on Critical Assessment of Techniques for Protein Structure Prediction (CASP6). Since the initial work by Romero et al. (Identifying disordered regions in proteins from amino acid sequences, IEEE Int. Conf. Neural Netw., 1997), our group has developed several predictors optimized for long disordered regions (>30 residues) with prediction accuracy exceeding 85%. However, these predictors are less successful on short disordered regions (≤30 residues). A probable cause is that the amino acid compositions and sequence properties of disordered regions depend on region length. RESULTS: We proposed two new predictor models, VSL2-M1 and VSL2-M2, to address this length-dependency problem in prediction of intrinsic protein disorder. These two predictors are similar to the original VSL1 predictor used in the CASP6 experiment. In both models, two specialized predictors were first built and optimized for short (≤30 residues) and long disordered regions (>30 residues), respectively. A meta predictor was then trained to integrate the specialized predictors into the final predictor model. As the 10-fold cross-validation results showed, the VSL2 predictors achieved well-balanced prediction accuracies of 81% on both short and long disordered regions. Comparisons over the VSL2 training dataset via 10-fold cross-validation and a blind-test set of unrelated recent PDB chains indicated that VSL2 predictors were significantly more accurate than several existing predictors of intrinsic protein disorder. CONCLUSION: The VSL2 predictors are applicable to disordered regions of any length and can accurately identify the short disordered regions that are often misclassified by our previous disorder predictors.
The success of the VSL2 predictors further confirmed the previously observed differences in amino acid compositions and sequence properties between short and long disordered regions, and justified our approaches for modelling short and long disordered regions separately. The VSL2 predictors are freely accessible for non-commercial use a
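
    The two-specialist-plus-meta-predictor design described in this abstract can be illustrated with a small sketch. The abstract does not say how the meta predictor integrates the two specialized predictors, so the weighted blend below is an assumption made for illustration, and every function name and the toy predictors are hypothetical rather than part of VSL2.

```python
from typing import Callable, List

# A predictor maps a sequence to one disorder score in [0, 1] per residue.
Predictor = Callable[[str], List[float]]


def combine_with_meta(
    seq: str,
    short_pred: Predictor,
    long_pred: Predictor,
    meta_pred: Predictor,
) -> List[float]:
    """Blend two specialized predictors using a per-residue meta weight.

    ASSUMPTION: the meta score w is read as the estimated chance that a
    residue belongs to a long disordered region, and the final score is
    w * long + (1 - w) * short. The abstract does not specify this rule.
    """
    short_scores = short_pred(seq)
    long_scores = long_pred(seq)
    weights = meta_pred(seq)
    if not (len(short_scores) == len(long_scores) == len(weights) == len(seq)):
        raise ValueError("every predictor must return one score per residue")
    return [
        w * l + (1.0 - w) * s
        for s, l, w in zip(short_scores, long_scores, weights)
    ]


# Hypothetical stand-in predictors that return constant scores.
def toy_short(seq: str) -> List[float]:
    return [0.8] * len(seq)


def toy_long(seq: str) -> List[float]:
    return [0.4] * len(seq)


def toy_meta(seq: str) -> List[float]:
    return [0.25] * len(seq)


if __name__ == "__main__":
    print(combine_with_meta("MKTAYIAK", toy_short, toy_long, toy_meta))
```

    In a real system the three toy functions would be replaced by trained models; only the way their outputs are combined is sketched here.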

    D2P2: database of disordered protein predictions

    We present the Database of Disordered Protein Prediction (D2P2), available at http://d2p2.pro (including website source code). A battery of disorder predictors and their variants, VL-XT, VSL2b, PrDOS, PV2, Espritz and IUPred, were run on all protein sequences from 1765 complete proteomes (to be updated as more genomes are completed). Integrated with these results are all of the predicted (mostly structured) SCOP domains using the SUPERFAMILY predictor. These disorder/structure annotations together enable comparison of the disorder predictors with each other and examination of the overlap between disordered predictions and SCOP domains on a large scale. D2P2 will increase our understanding of the interplay between disorder and structure, the genomic distribution of disorder, and its evolutionary history. The parsed data are made available in a unified format for download as flat files or SQL tables either by genome, by predictor, or for the complete set. An interactive website provides a graphical view of each protein annotated with the SCOP domains and disordered regions from all predictors overlaid (or shown as a consensus). There are statistics and tools for browsing and comparing genomes and their disorder within the context of their position on the tree of life. © The Author(s) 2012. Published by Oxford University Press
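
    The idea of overlaying several predictors and showing a consensus can be sketched as a per-residue vote. This is a minimal illustration only: the strict-majority rule, the function name and the toy calls are assumptions, not the consensus definition actually used by D2P2.

```python
from typing import Dict, List, Tuple


def consensus_regions(
    calls: Dict[str, List[bool]],
    min_fraction: float = 0.5,
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) regions where more than
    `min_fraction` of the predictors call a residue disordered.

    ASSUMPTION: a simple vote across predictors; the threshold is
    hypothetical and not taken from the D2P2 abstract.
    """
    lengths = {len(v) for v in calls.values()}
    if len(lengths) != 1:
        raise ValueError("need at least one predictor, all of equal length")
    n = lengths.pop()

    regions: List[Tuple[int, int]] = []
    start = None  # 0-based index where the current region began
    for i in range(n):
        votes = sum(1 for c in calls.values() if c[i])
        disordered = votes / len(calls) > min_fraction
        if disordered and start is None:
            start = i
        elif not disordered and start is not None:
            regions.append((start + 1, i))  # i is the 1-based end position
            start = None
    if start is not None:
        regions.append((start + 1, n))
    return regions


if __name__ == "__main__":
    # Made-up per-residue calls for a 5-residue sequence.
    toy_calls = {
        "predictor_a": [True, True, False, False, True],
        "predictor_b": [True, False, False, True, True],
        "predictor_c": [True, True, False, False, True],
    }
    print(consensus_regions(toy_calls))
```

    Lowering `min_fraction` makes the consensus more permissive, while raising it towards 1.0 approaches requiring agreement among all predictors.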

    Abundance of intrinsic disorder in SV-IV, a multifunctional androgen-dependent protein secreted from rat seminal vesicle

    The potent immunomodulatory, anti-inflammatory and procoagulant properties of protein no. 4 secreted from the rat seminal vesicle epithelium (SV-IV) have previously been found to be modulated by a supramolecular monomer-trimer equilibrium. More structural details that integrate experimental data into a predictive framework have recently been reported. Unfortunately, homology modelling and fold-recognition strategies were not successful in creating a theoretical model of the structural organization of SV-IV. It was inferred that the global structure of SV-IV is not similar to any protein of known three-dimensional structure. Reversing the classical approach to the sequence-structure-function paradigm, in this paper we report on novel information obtained by comparing physicochemical parameters of SV-IV with two datasets made of intrinsically unfolded and ideally globular proteins. In addition, we have analysed the SV-IV sequence with several publicly available disorder-oriented predictors. Overall, disorder predictions and a re-examination of existing experimental data strongly suggest that SV-IV needs large plasticity to interact efficiently with the different targets that characterize its multifaceted biological function and should therefore be classified as an intrinsically disordered protein.

    Adenovirus type 5 E4 Orf3 protein targets promyelocytic leukaemia (PML) protein nuclear domains for disruption via a sequence in PML isoform II that is predicted as a protein interaction site by bioinformatic analysis

    Human adenovirus type 5 infection causes the disruption of structures in the cell nucleus termed promyelocytic leukaemia (PML) protein nuclear domains or ND10, which contain the PML protein as a critical component. This disruption is achieved through the action of the viral E4 Orf3 protein, which forms track-like nuclear structures that associate with the PML protein. This association is mediated by a direct interaction of Orf3 with a specific PML isoform, PMLII. We show here that the Orf3 interaction properties of PMLII are conferred by a 40 aa residue segment of the unique C-terminal domain of the protein. This segment was sufficient to confer interaction on a heterologous protein. The analysis was informed by prior application of a bioinformatic tool for the prediction of potential protein interaction sites within unstructured protein sequences (predictors of naturally disordered region analysis; PONDR). This tool predicted three potential molecular recognition elements (MoRE) within the C-terminal domain of PMLII, one of which was found to form the core of the Orf3 interaction site, thus demonstrating the utility of this approach. The sequence of the mapped Orf3-binding site on PML protein was found to be relatively poorly conserved across other species; however, the overall organization of MoREs within unstructured sequence was retained, suggesting the potential for conservation of functional interactions

    Self-organization of intrinsically disordered proteins with folded N-termini

    Thousands of human proteins lack recognizable tertiary structure in most of their chains. Here we hypothesize that some use their structured N-terminal domains (SNTDs) to organise the remaining protein chain via intramolecular interactions, generating partially structured proteins. This model has several attractive features: as protein chains emerge, SNTDs form spontaneously and serve as nucleation points, creating more compact shapes. This reduces the risk of protein degradation or aggregation. Moreover, an interspersed pattern of SNTD-docked regions and free loops can coordinate assembly of sub-complexes in defined loop-sections and enables novel regulatory mechanisms, for example through posttranslational modifications of docked regions

    Allosteric Modulators of Steroid Hormone Receptors : Structural Dynamics and Gene Regulation

    Peer reviewed. Publisher PDF

    Casein and casein micelle structures, functions and diversity in 20 species

    Primary structures of caseins from 20 species, including two monotremes and two marsupials, have been compared. Sequences of the mature proteins are very divergent, whereas variation in amino acid composition is mostly restricted to a range of disorder-promoting residues. The number and size of clusters of phosphorylation sites in the caseins is variable, blurring the boundaries between them. Casein polar tract sequences were found in all caseins, though of variable lengths, and are chiefly responsible for weak and dynamic interactions among the tangled web of peptide chains in the matrix of casein micelles. The interactions take the predominant form of backbone-to-backbone contacts rather than the sequence-specific side chain interactions of the hydrophobic effect. It is suggested that the dynamic casein micelle matrix be represented by an ensemble of interchanging structures with different types and degrees of inhomogeneity, influenced by solvent quality and other environmental factors

    The Acidic Domains of the Toc159 Chloroplast Preprotein Receptor Family are Intrinsically Disordered Protein Domains

    Background: The Toc159 family of proteins serve as receptors for chloroplast-destined preproteins. They directly bind to transit peptides, and exhibit preprotein substrate selectivity conferred by an unknown mechanism. The Toc159 receptors each include three domains: C-terminal membrane, central GTPase, and N-terminal acidic (A-) domains. Although the function(s) of the A-domain remains largely unknown, the amino acid sequences are most variable within these domains, suggesting they may contribute to the functional specificity of the receptors. Results: The physicochemical properties of the A-domains are characteristic of intrinsically disordered proteins (IDPs). Using CD spectroscopy we show that the A-domains of two Arabidopsis Toc159 family members (atToc132 and atToc159) are disordered at physiological pH and temperature and undergo conformational changes at temperature and pH extremes that are characteristic of IDPs. Conclusions: Identification of the A-domains as IDPs will be important for determining their precise function(s), and suggests a role in protein-protein interactions, which may explain how these proteins serve as receptors for such a wide variety of preprotein substrates

    Buried and accessible surface area control intrinsic protein flexibility

    Proteins experience a wide variety of conformational dynamics that can be crucial for facilitating their diverse functions. How is the intrinsic flexibility required for these motions encoded in their three-dimensional structures? Here, the overall flexibility of a protein is demonstrated to be tightly coupled to the total amount of surface area buried within its fold. A simple proxy for this, the relative solvent accessible surface area (Arel), therefore shows excellent agreement with independent measures of global protein flexibility derived from various experimental and computational methods. Application of Arel on a large scale demonstrates its utility by revealing unique sequence and structural properties associated with intrinsic flexibility. In particular, flexibility as measured by Arel shows little correspondence with intrinsic disorder, but instead tends to be associated with multiple domains and increased α-helical structure. Furthermore, the apparent flexibility of monomeric proteins is found to be useful for identifying quaternary structure errors in published crystal structures. There is also a strong tendency for the crystal structures of more flexible proteins to be solved to lower resolutions. Finally, local solvent accessibility is shown to be a primary determinant of local residue flexibility. Overall this work provides both fundamental mechanistic insight into the origin of protein flexibility and a simple, practical method for predicting flexibility from protein structures. Comment: 36 pages, 11 figures, author's manuscript, accepted for publication in Journal of Molecular Biology