90 research outputs found
Do Natural Proteins Differ from Random Sequences Polypeptides? Natural vs. Random Proteins Classification Using an Evolutionary Neural Network
Are extant proteins the exquisite result of natural selection or are they random sequences slightly edited by evolution? This question has puzzled biochemists for a long time, and several groups have addressed it by comparing natural protein sequences to completely random ones, coming to contradictory conclusions. Previous works in the literature focused on the analysis of primary structure in an attempt to identify possible signatures of evolutionary editing. Conversely, in this work we compare a set of 762 natural proteins with an average length of 70 amino acids and an equal number of completely random ones of comparable length on the basis of their structural features. We use an ad hoc Evolutionary Neural Network Algorithm (ENNA) to assess whether and to what extent natural proteins are edited from random polypeptides, employing 11 different structure-related variables (i.e. net charge, volume, surface area, coil, alpha helix, beta sheet, percentage of coil, percentage of alpha helix, percentage of beta sheet, percentage of secondary structure and surface hydrophobicity). The ENNA is capable of correctly distinguishing natural proteins from random ones with an accuracy of 94.36%. Furthermore, we study the structural features of 32 random polypeptides misclassified as natural ones to unveil any structural similarity to natural proteins. Results show that random proteins misclassified by the ENNA exhibit a significant fold similarity to portions or subdomains of extant proteins at atomic resolution. Altogether, our results suggest that natural proteins are significantly edited from random polypeptides and that evolutionary editing can be readily detected by analyzing structural features. Furthermore, we also show that the ENNA, employing simple structural descriptors, can predict whether a protein chain is natural or random.
Decoding the Folding of Burkholderia glumae Lipase: Folding Intermediates En Route to Kinetic Stability
The lipase produced by Burkholderia glumae folds spontaneously into an inactive near-native state and requires a periplasmic chaperone to reach its final active and secretion-competent fold. The B. glumae lipase-specific foldase (Lif) is classified as a member of the steric-chaperone family, of which the propeptides of α-lytic protease and subtilisin are the best-known representatives. Steric chaperones play a key role in conferring kinetic stability to proteins. However, until now there was no solid experimental evidence that Lif-dependent lipases are kinetically trapped enzymes. By combining thermal denaturation studies with proteolytic resistance experiments and the description of distinct folding intermediates, we demonstrate that the native lipase has a kinetically stable conformation. We show that a newly discovered molten globule-like conformation has distinct properties that clearly differ from those of the near-native intermediate state. The folding fingerprint of Lif-dependent lipases is put in the context of the protease-prodomain system, and the comparison reveals clear differences that render the lipase-Lif systems unique. Limited proteolysis unveils structural differences between the near-native intermediate and the native conformation and sets the stage to shed light on the nature of the kinetic barrier.
Differences in the Pathways of Proteins Unfolding Induced by Urea and Guanidine Hydrochloride: Molten Globule State and Aggregates
It was shown that at low concentrations guanidine hydrochloride (GdnHCl) can cause aggregation of proteins in a partially folded state, and that the fluorescent dye 1-anilinonaphthalene-8-sulfonic acid (ANS) binds to these aggregates rather than to hydrophobic clusters on the surface of the protein in the molten globule state. This is why an increase in ANS fluorescence intensity is often recorded in the pathway of protein denaturation by GdnHCl, but not by urea. So what was previously believed to be the molten globule state in the pathway of protein denaturation by GdnHCl in reality represents, for some proteins, aggregates of partially folded molecules.
Spatial Extent of Charge Repulsion Regulates Assembly Pathways for Lysozyme Amyloid Fibrils
Formation of large protein fibrils with a characteristic cross β-sheet architecture is the key indicator for a wide variety of systemic and neurodegenerative amyloid diseases. Recent experiments have strongly implicated oligomeric intermediates, transiently formed during fibril assembly, as critical contributors to cellular toxicity in amyloid diseases. At the same time, amyloid fibril assembly can proceed along different assembly pathways that might or might not involve such oligomeric intermediates. Elucidating the mechanisms that determine whether fibril formation proceeds along non-oligomeric or oligomeric pathways, therefore, is important not just for understanding amyloid fibril assembly at the molecular level but also for developing new targets for intervening in fibril formation. We have investigated fibril formation by hen egg white lysozyme, an enzyme for which human variants underlie non-neuropathic amyloidosis. Using a combination of static and dynamic light scattering, atomic force microscopy and circular dichroism, we find that amyloidogenic lysozyme monomers switch between three different assembly pathways: from monomeric to oligomeric fibril assembly and, eventually, disordered precipitation as the ionic strength of the solution increases. Fibril assembly only occurred under conditions of net repulsion among the amyloidogenic monomers, while net attraction caused precipitation. The transition from monomeric to oligomeric fibril assembly, in turn, occurred as salt-mediated charge screening reduced repulsion among individual charged residues on the same monomer. We suggest a model of amyloid fibril formation in which repulsive charge interactions are a prerequisite for ordered fibril assembly. Furthermore, the spatial extent of non-specific charge screening selects between monomeric and oligomeric assembly pathways by affecting which subset of denatured states can form suitable intermolecular bonds and by altering the energetic and entropic requirements for the initial intermediates emerging along the monomeric vs. oligomeric assembly path.
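As a rough illustration of how the spatial extent of charge screening shrinks with increasing ionic strength, the sketch below computes the Debye screening length from the standard Debye-Hückel expression. This is not taken from the abstract: the function name `debye_length_nm`, the choice of water at 25 °C (relative permittivity of about 78.4) and the use of the Debye length as the measure of screening range are all assumptions made here for illustration.

```python
import math

# Physical constants (SI units)
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
KB = 1.380649e-23         # Boltzmann constant, J/K
NA = 6.02214076e23        # Avogadro constant, 1/mol
E_CHARGE = 1.602176634e-19  # elementary charge, C


def debye_length_nm(ionic_strength_molar, temperature_k=298.15, eps_r=78.4):
    """Debye screening length in nm for a given ionic strength in mol/L.

    Assumes an aqueous solution (eps_r ~ 78.4 at 25 degrees C); this is a
    hypothetical helper, not part of the study summarized above.
    """
    if ionic_strength_molar <= 0:
        raise ValueError("ionic strength must be positive")
    ionic_strength_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    numerator = eps_r * EPS0 * KB * temperature_k
    denominator = 2.0 * NA * E_CHARGE**2 * ionic_strength_si
    return math.sqrt(numerator / denominator) * 1e9  # m -> nm


if __name__ == "__main__":
    for ionic_strength in (0.01, 0.05, 0.1, 0.4):
        print(f"I = {ionic_strength:.2f} M -> {debye_length_nm(ionic_strength):.2f} nm")
```

Because the screening length scales as the inverse square root of the ionic strength, a fourfold increase in salt halves the range over which charges repel each other.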
An analysis of single amino acid repeats as use case for application specific background models
Background
Sequence analysis aims to identify biologically relevant signals against a backdrop of functionally meaningless variation. Increasingly, it is recognized that the quality of the background model directly affects the performance of analyses. State-of-the-art approaches rely on classical sequence models that are adapted to the studied dataset. Although performing well in the analysis of globular protein domains, these models break down in regions of stronger compositional bias or low complexity. While these regions are typically filtered, there is increasing anecdotal evidence of functional roles. This motivates an exploration of more complex sequence models and application-specific approaches for the investigation of biased regions.
Results
Traditional Markov-chains and application-specific regression models are compared using the example of predicting runs of single amino acids, a particularly simple class of biased regions. Cross-fold validation experiments reveal that the alternative regression models capture the multi-variate trends well, despite their low dimensionality and in contrast even to higher-order Markov-predictors. We show how the significance of unusual observations can be computed for such empirical models. The power of a dedicated model in the detection of biologically interesting signals is then demonstrated in an analysis identifying the unexpected enrichment of contiguous leucine-repeats in signal-peptides. Considering different reference sets, we show how the question examined actually defines what constitutes the 'background'. Results can thus be highly sensitive to the choice of appropriate model training sets. Conversely, the choice of reference data determines the questions that can be investigated in an analysis.
Conclusions
Using a specific case of studying biased regions as an example, we have demonstrated that the construction of application-specific background models is both necessary and feasible in a challenging sequence analysis situation.
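To make the notion of a background model for single amino acid runs concrete, the sketch below locates runs in a sequence and computes how many runs of a given residue would be expected under the simplest possible background, an independent (zeroth-order) model. It is a minimal illustration only: the function names `find_runs` and `expected_runs_iid` are hypothetical, and the model shown is neither the higher-order Markov chains nor the regression models compared in the abstract.

```python
from itertools import groupby


def find_runs(seq, min_len=3):
    """Return (residue, start, length) for each maximal run of at least min_len."""
    runs = []
    pos = 0
    for residue, group in groupby(seq):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((residue, pos, length))
        pos += length
    return runs


def expected_runs_iid(p, k, n):
    """Expected number of maximal runs of length >= k in n independent draws.

    p is the background frequency of the residue. A run starts at position 0
    with probability p**k, and at each of the n - k later positions with
    probability (1 - p) * p**k (the preceding residue must differ).
    """
    if n < k:
        return 0.0
    return p**k + (n - k) * (1.0 - p) * p**k


if __name__ == "__main__":
    toy_sequence = "MKLLLLLLAVLLSG"  # made-up sequence, for illustration only
    print(find_runs(toy_sequence, min_len=3))
    # Leucine frequency of 0.1 is an assumed value, not a measured one.
    print(expected_runs_iid(p=0.1, k=6, n=len(toy_sequence)))
```

Comparing the observed run count in a dataset with such an expectation gives a first impression of enrichment, although the abstract's point is precisely that this kind of simple background can be inadequate for biased regions.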
Probing the urea dependence of residual structure in denatured human α-lactalbumin
Backbone ¹⁵N relaxation parameters and ¹⁵N–¹HN residual dipolar couplings (RDCs) have been measured for a variant of human α-lactalbumin (α-LA) in 4, 6, 8 and 10 M urea. In the α-LA variant, the eight cysteine residues in the protein have been replaced by alanines (all-Ala α-LA). This protein is a partially folded molten globule at pH 2 and has been shown previously to unfold in a stepwise non-cooperative manner on the addition of urea. ¹⁵N R2 values in some regions of all-Ala α-LA show significant exchange broadening which is reduced as the urea concentration is increased. Experimental RDC data are compared with RDCs predicted from a statistical coil model and with bulkiness, average area buried upon folding and hydrophobicity profiles in order to identify regions of non-random structure. Residues in the regions corresponding to the B, D and C-terminal 3₁₀ helices in native α-LA show R2 values and RDC data consistent with some non-random structural propensities even at high urea concentrations. Indeed, for residues 101–106 the residual structure persists in 10 M urea and the RDC data suggest that this might include the formation of a turn-like structure. The data presented here allow a detailed characterization of the non-cooperative unfolding of all-Ala α-LA at higher concentrations of denaturant and complement previous studies which focused on structural features of the molten globule which is populated at lower concentrations of denaturant.
- …