178 research outputs found

    Improving protein order-disorder classification using charge-hydropathy plots

    BACKGROUND: The earliest whole protein order/disorder predictor (Uversky et al., Proteins, 41: 415-427 (2000)), herein called the charge-hydropathy (C-H) plot, was originally developed using the Kyte-Doolittle (1982) hydropathy scale (Kyte & Doolittle, J. Mol. Biol., 157: 105-132 (1982)). Here the goal is to determine whether the performance of the C-H plot in separating structured and disordered proteins can be improved by using an alternative hydropathy scale. RESULTS: Using the performance of the C-H plot as the metric, we compared 19 alternative hydropathy scales, with the finding that the Guy (1985) hydropathy scale (Guy, Biophys. J., 47: 61-70 (1985)) was the best of the tested hydropathy scales for separating large collections of structured proteins and intrinsically disordered proteins (IDPs) on the C-H plot. Next, we developed a new scale, named IDP-Hydropathy, which further improves the discrimination between structured proteins and IDPs. Applying the C-H plot to a dataset containing 109 IDPs and 563 non-homologous fully structured proteins, the Kyte-Doolittle (1982) hydropathy scale, the Guy (1985) hydropathy scale, and the IDP-Hydropathy scale gave balanced two-state classification accuracies of 79%, 84%, and 90%, respectively, indicating that a substantial overall improvement is obtained by changing the hydropathy scale. A correlation study shows that IDP-Hydropathy is strongly correlated with other hydropathy scales, suggesting that IDP-Hydropathy probably has only minor contributions from amino acid properties other than hydropathy. CONCLUSION: We suggest that IDP-Hydropathy would likely be the best scale to use for any type of algorithm developed to predict protein disorder.
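    The C-H plot described in this abstract can be illustrated with a minimal sketch. It places a sequence on the plot using the Kyte-Doolittle scale and applies the boundary line commonly quoted from Uversky et al. (2000), mean net charge = 2.785 x mean hydropathy - 1.151. Several choices here are assumptions and not taken from the abstract: the hydropathy values are rescaled to the range 0 to 1, charges are assigned at neutral pH with histidine treated as neutral, and the function names `ch_plot_point` and `predict_disordered` are hypothetical. The IDP-Hydropathy scale itself is not reproduced because the abstract does not list its values.

```python
# Kyte-Doolittle (1982) hydropathy values, one per standard amino acid.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumption: formal charges at neutral pH, histidine counted as neutral.
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def ch_plot_point(seq):
    """Return (mean hydropathy in [0, 1], absolute mean net charge)."""
    # Non-standard characters (X, gaps, whitespace) are ignored.
    residues = [aa for aa in seq.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    n = len(residues)
    # Assumption: rescale Kyte-Doolittle from [-4.5, 4.5] to [0, 1].
    mean_h = sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in residues) / n
    mean_r = abs(sum(CHARGE.get(aa, 0) for aa in residues)) / n
    return mean_h, mean_r


def predict_disordered(seq):
    """True if the sequence falls on the disordered side of the boundary."""
    mean_h, mean_r = ch_plot_point(seq)
    # Boundary as commonly quoted from Uversky et al. (2000).
    return mean_r > 2.785 * mean_h - 1.151


if __name__ == "__main__":
    for name, seq in [("charged", "KKKKEEEE"), ("hydrophobic", "IIIILLLLVVVV")]:
        h, r = ch_plot_point(seq)
        print(name, round(h, 3), round(r, 3), predict_disordered(seq))
```

    Swapping in another hydropathy scale only requires replacing the dictionary and its rescaling step, although the boundary line would then need to be refitted for that scale.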

    Letter to the Editor: a response to Horne and Lucey (2017)

    No abstract available

    Potential functions of LEA proteins from the brine shrimp Artemia franciscana - Anhydrobiosis meets bioinformatics.

    Late embryogenesis abundant (LEA) proteins are a large group of anhydrobiosis-associated intrinsically disordered proteins (IDPs), which are commonly found in plants and some animals. The brine shrimp Artemia franciscana is the only known animal that expresses LEA proteins from three different groups, rather than only one, in its anhydrobiotic life stage. The reason for the higher complexity in the A. franciscana LEA proteome (LEAome), compared with other anhydrobiotic animals, remains mostly unknown. To address this issue, we have employed a suite of bioinformatics tools to evaluate the disorder status of the Artemia LEAome and to analyze the roles of intrinsic disorder in the functioning of brine shrimp LEA proteins. We show here that A. franciscana LEA proteins from different groups are more similar to each other than originally expected, while functional differences among members of group 3 are possibly larger than commonly anticipated. Our data show that although these proteins are characterized by a large variety of forms and possible functions, as a general strategy, A. franciscana utilizes glassy matrix forming LEAs concurrently with proteins that more readily interact with binding partners. It is likely that the function(s) of both types, the matrix-forming and partner-binding LEA proteins, are regulated by changing water availability during desiccation.

    The RCSB Protein Data Bank: views of structural biology for basic and applied research and education.

    The RCSB Protein Data Bank (RCSB PDB, http://www.rcsb.org) provides access to 3D structures of biological macromolecules and is one of the leading resources in biology and biomedicine worldwide. Our efforts over the past 2 years focused on enabling a deeper understanding of structural biology and providing new structural views of biology that support both basic and applied research and education. Herein, we describe recently introduced data annotations including integration with external biological resources, such as gene and drug databases, new visualization tools and improved support for the mobile web. We also describe access to data files, web services and open access software components to enable software developers to more effectively mine the PDB archive and related annotations. Our efforts are aimed at expanding the role of 3D structure in understanding biology and medicine

    A study of intrinsic disorder and its role in functional proteomics

    Thesis (Ph.D.) - Indiana University, Informatics, 2009. The last decade has witnessed the emergence of an alternate view on how protein function arises. This view attributes the functionality of many proteins to the presence of an ensemble of flexible regions popularly known as `intrinsically disordered' or `unstructured'. Several proteomic studies have corroborated the existence of either wholly disordered proteins or proteins that contain regions of disorder. The purpose of this dissertation was to investigate the consistency of such regions across experiments, their mechanism of facilitating function via disorder-to-order transitions, their presence and significance in pathogenic versus non-pathogenic organisms, and their promise of applicability towards the computational prediction of peptides involved in the most common class of post-translational modifications, phosphorylation. In addition, a new algorithm exploiting the strong correlation between phosphorylation and intrinsic disorder has been proposed to improve the detection of phosphorylated peptides via high-throughput methods such as tandem mass spectrometry (LC-MS/MS). Results presented in this study guide us in understanding the robustness of unstructured regions in proteins to sequence changes and environment, their role in facilitating molecular recognition, and the improvement of currently available methods for identification of post-translationally modified peptides. The findings and conclusions of this dissertation have the potential to impact ongoing structural genomics initiatives by suggesting alternative methods for determining structure for targets containing regions of disorder. Additional ramifications of results from this work include directing attention towards the possible use of regions of intrinsic disorder by pathogenic organisms for host cell invasion. We believe that, unlike the traditional reductionist approach in a scientific method, this study gathers strength and utility by investigating the role of intrinsic disorder on more than one front in order to provide a novel perspective to the understanding of complex interactions within biological systems. Concluding arguments presented in this study pique one's curiosity regarding the evolution of disordered regions and proteins in general. On the technological side, the findings from this study unequivocally support the viable use of informatics methods in gaining new insights about a relatively young class of proteins known as intrinsically disordered proteins and their applicability to improve our present knowledge of cellular physiology.

    Optimizing hydropathy scale to improve IDP prediction and characterizing IDPs' functions

    Indiana University-Purdue University Indianapolis (IUPUI). Intrinsically disordered proteins (IDPs) are flexible proteins without defined 3D structures. Studies show that IDPs are abundant in nature and actively involved in numerous biological processes. Two crucial subjects in the study of IDPs are analyzing IDPs' functions and identifying them. We thus carried out three projects to better understand IDPs. In the 1st project, we propose a method that separates IDPs into different function groups. We used the CH-CDF plot approach, which is based on the combined use of two predictors and subclassifies proteins into 4 groups: structured, mixed, disordered, and rare. Studies show different structural biases for each group. The mixed class has more order-promoting residues and more ordered regions than the disordered class. In addition, the disordered class is highly active in mitosis-related processes, among others. Meanwhile, the mixed class is highly associated with signaling pathways, where having both ordered and disordered regions could possibly be important. The 2nd project is about identifying whether an unknown protein is entirely disordered. One of the earliest predictors for this purpose, the charge-hydropathy plot (C-H plot), exploited the charge and hydropathy features of the protein. Not only is this algorithm simple yet powerful, but its input parameters, charge and hydropathy, are also informative and readily interpretable. We found that using different hydropathy scales significantly affects the prediction accuracy. Therefore, we sought to identify a new hydropathy scale that optimizes the prediction. This new scale achieves an accuracy of 91%, a significant improvement over the original 79%. In our 3rd project, we developed a per-residue C-H IDP predictor, in which three hydropathy scales are optimized individually. This is to account for the amino acid composition differences in three regions of a protein sequence (N terminus, C terminus, and internal). We then combined them into a single per-residue predictor that achieves an accuracy of 74% for per-residue predictions for proteins containing long IDP regions.

    Glutenin and Gliadin, a Piece in the Puzzle of their Structural Properties in the Cell Described through Monte Carlo Simulations

    Gluten protein crosslinking is a predetermined process where specific intra- and intermolecular disulfide bonds differ depending on the protein and cysteine motif. In this article, all-atom Monte Carlo simulations were used to understand the formation of disulfide bonds in gliadins and low molecular weight glutenin subunits (LMW-GS). The two intrinsically disordered proteins appeared to contain mostly turns and loops and showed "self-avoiding walk" behavior in water. Cysteine residues involved in intramolecular disulfide bonds were located next to hydrophobic peptide sections in the primary sequence. Hydrophobicity of neighboring peptide sections, synthesis chronology, and amino acid chain flexibility were identified as important factors in securing the specificity of intramolecular disulfide bonds formed directly after synthesis. The two LMW-GS cysteine residues that form intermolecular disulfide bonds were positioned next to peptide sections of lower hydrophobicity, and these cysteine residues are more exposed to the cytosolic conditions, which influence the crosslinking behavior. In addition, coarse-grained Monte Carlo simulations revealed that the protein folding is independent of ionic strength. The potential molecular behavior associated with disulfide bonds, as reported here, increases the biological understanding of seed storage protein function and provides opportunities to tailor their functional properties for different applications

    Inherent Structural Disorder and Dimerisation of Murine Norovirus NS1-2 Protein

    Human noroviruses are highly infectious viruses that cause the majority of acute, non-bacterial epidemic gastroenteritis cases worldwide. The first open reading frame of the norovirus RNA genome encodes for a polyprotein that is cleaved by the viral protease into six non-structural proteins. The first non-structural protein, NS1-2, lacks any significant sequence similarity to other viral or cellular proteins and limited information is available about the function and biophysical characteristics of this protein. Bioinformatic analyses identified an inherently disordered region (residues 1–142) in the highly divergent N-terminal region of the norovirus NS1-2 protein. Expression and purification of the NS1-2 protein of Murine norovirus confirmed these predictions by identifying several features typical of an inherently disordered protein. These were a biased amino acid composition with enrichment in the disorder promoting residues serine and proline, a lack of predicted secondary structure, a hydrophilic nature, an aberrant electrophoretic migration, an increased Stokes radius similar to that predicted for a protein from the pre-molten globule family, a high sensitivity to thermolysin proteolysis and a circular dichroism spectrum typical of an inherently disordered protein. The purification of the NS1-2 protein also identified the presence of an NS1-2 dimer in Escherichia coli and transfected HEK293T cells. Inherent disorder provides significant advantages including structural flexibility and the ability to bind to numerous targets allowing a single protein to have multiple functions. These advantages combined with the potential functional advantages of multimerisation suggest a multi-functional role for the NS1-2 protein

    Computational and biochemical characterizations of anhydrobiosis-related intrinsically disordered proteins.

    Anhydrobiosis is the remarkable phenomenon of “life without water”. It is a common technique found in plant seeds, and a rare technique utilized by some animals to temporarily stop the clock of life and enter a stasis for up to several millennia by removing all of their cellular water. If this phenomenon can be replicated, then biological and medical materials could be stored at ambient temperatures for centuries, which would address research challenges as well as enhance the availability of medicine in areas of the world where refrigeration, freezing, and cold-chain infrastructure are not developed or infeasible. Furthermore, modifying crop tissues could make them resistant to droughts, addressing one of the greatest threats to food stability around the world. This work utilizes a combination of computational techniques and novel approaches to performing biochemistry without water to elucidate the mechanisms of function of specialized proteins that are responsible for anhydrobiosis in animals, particularly the anhydrobiotic cysts of the brine shrimp Artemia franciscana. A detailed evaluation of the chemical properties of anhydrobiosis-related, intrinsically disordered proteins indicates that there are multiple protein-based strategies to achieve anhydrobiosis, but that late embryogenesis abundant (LEA) proteins are the most well understood. However, the mechanisms of LEA protein function have never been demonstrated, resulting in a wide variety of hypotheses regarding their ability to confer desiccation tolerance. This work demonstrates that a group 1 LEA protein, AfLEA1.1, and a group 6 LEA protein, AfrLEA6, undergo liquid-liquid phase separations during desiccation and thereby transiently form novel protective membraneless organelles which partition specific proteins and nucleic acids. These desiccation-induced cellular compartments are a novel mechanism to explain how LEA proteins confer desiccation tolerance, and the drivers of this behavior have been linked to the consensus sequences that define these LEA proteins. Therefore, the separation of aqueous proteins into a specialized compartment during drying is unlikely to be a function of AfLEA1.1 and AfrLEA6 only, but is rather the mechanism by which group 1 and group 3 LEA proteins function in plant seeds and anhydrobiotic animals. These results indicate that when water is unavailable, anhydrobiotic organisms substitute it with their own solvents.

    Pervasive, conserved secondary structure in highly charged protein regions

    Understanding how protein sequences confer function remains a defining challenge in molecular biology. Two approaches have yielded enormous insight yet are often pursued separately: structure-based, where sequence-encoded structures mediate function, and disorder-based, where sequences dictate physicochemical and dynamical properties which determine function in the absence of stable structure. Here we study highly charged protein regions (>40% charged residues), which are routinely presumed to be disordered. Using recent advances in structure prediction and experimental structures, we show that roughly 40% of these regions form well-structured helices. Features often used to predict disorder—high charge density, low hydrophobicity, low sequence complexity, and evolutionarily varying length—are also compatible with solvated, variable-length helices. We show that a simple composition classifier predicts the existence of structure far better than well-established heuristics based on charge and hydropathy. We show that helical structure is more prevalent than previously appreciated in highly charged regions of diverse proteomes and characterize the conservation of highly charged regions. Our results underscore the importance of integrating, rather than choosing between, structure- and disorder-based approaches
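    The ">40% charged residues" criterion in this abstract can be sketched as a simple sliding-window scan. The abstract gives only the threshold, so the rest is assumed for illustration: D, E, K, and R are counted as charged (histidine is not), the window length is a free parameter, and the function names `charged_fraction` and `highly_charged_windows` are hypothetical. This is not the authors' region-calling procedure or their composition classifier.

```python
# Assumption: aspartate, glutamate, lysine, and arginine count as charged.
CHARGED = frozenset("DEKR")


def charged_fraction(segment):
    """Fraction of residues in `segment` that are charged."""
    if not segment:
        raise ValueError("segment must not be empty")
    return sum(aa in CHARGED for aa in segment.upper()) / len(segment)


def highly_charged_windows(seq, window, threshold=0.40):
    """Start indices of windows whose charged fraction exceeds `threshold`."""
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    return [
        start
        for start in range(len(seq) - window + 1)
        if charged_fraction(seq[start:start + window]) > threshold
    ]


if __name__ == "__main__":
    # Toy sequence, not taken from any real protein.
    print(highly_charged_windows("AAAAAEEKKE", window=5))
```

    The comparison is strict, so a window with exactly 40% charged residues is not reported, matching the ">40%" wording.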