313 research outputs found

    Nanostructure and molecular mechanics of spider dragline silk protein assemblies

    Spider silk is a self-assembling biopolymer that outperforms most known materials in mechanical performance, despite its underlying weak chemical bonding based on H-bonds. While experimental studies have shown that the molecular structure of silk proteins has a direct influence on the stiffness, toughness and failure strength of silk, no molecular-level analysis of the nanostructure and associated mechanical properties of silk assemblies has been reported. Here, we report atomic-level structures of MaSp1 and MaSp2 proteins from the Nephila clavipes spider dragline silk sequence, obtained using replica exchange molecular dynamics, and subject these structures to mechanical loading for a detailed nanomechanical analysis. The structural analysis reveals that poly-alanine regions in silk predominantly form distinct and orderly beta-sheet crystal domains, while disorderly regions are formed by glycine-rich repeats that consist of 3₁-helix type structures and beta-turns. Our structural predictions are validated against experimental data based on dihedral angle pair calculations presented in Ramachandran plots, alpha-carbon atomic distances, and secondary structure content. Mechanical shearing simulations on selected structures illustrate that the nanoscale behaviour of silk protein assemblies is controlled by the distinctly different secondary structure content and hydrogen bonding in the crystalline and semi-amorphous regions. Both structural and mechanical characterization results show excellent agreement with available experimental evidence. Our findings set the stage for extensive atomistic investigations of silk, which may contribute towards an improved understanding of the source of the strength and toughness of this biological superfibre. Funding: United States Office of Naval Research (N00014-08-1-00844, N00014-10-1-0562); National Science Foundation (U.S.) (TeraGrid, grant no. TG-MSS080030).
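
    The abstract above names replica exchange molecular dynamics but does not spell out the exchange step. As a minimal sketch only, assuming the standard temperature-exchange Metropolis criterion (the abstract does not state which variant or parameters were used), the acceptance probability for swapping two neighbouring replicas could be computed as follows. The function name, the units, and the example energies are illustrative assumptions.

```python
import math

# Boltzmann constant in kcal/(mol*K); the unit system is an assumption for this sketch.
K_B = 0.0019872041


def exchange_probability(e_i: float, e_j: float, t_i: float, t_j: float) -> float:
    """Metropolis acceptance probability for swapping replicas i and j.

    e_i, e_j: potential energies of the configurations currently held at
              temperatures t_i and t_j (kcal/mol).
    Uses P = min(1, exp[(beta_i - beta_j) * (e_i - e_j)]).
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    if delta >= 0.0:
        return 1.0  # always accept; also avoids overflow in exp()
    return math.exp(delta)


# Hypothetical energies for two neighbouring replicas at 300 K and 310 K.
p_swap = exchange_probability(e_i=-120.0, e_j=-100.0, t_i=300.0, t_j=310.0)
```

    A swap that moves the lower-energy configuration to the colder replica is always accepted; the opposite move is accepted only with the Boltzmann-weighted probability, which is what lets configurations diffuse across the temperature ladder.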

    A Multiobjective Approach Applied to the Protein Structure Prediction Problem

    Interest in discovering a methodology for solving the Protein Structure Prediction problem extends into many fields of study, including biochemistry, medicine, biology, and numerous engineering and science disciplines. Approaches to this problem range from experimental methods, such as X-ray crystallography or solution nuclear magnetic resonance spectroscopy, to mathematical modeling, such as minimum-energy models. Recent evolutionary algorithm studies at the Air Force Institute of Technology have examined the simple genetic algorithm (GA), messy GA, fast messy GA (fmGA), and Linkage Learning GA as approaches to protein potential energy minimization. Prepackaged software such as GENOCOP, GENESIS, and mGA is used to facilitate experimentation with these techniques. In addition to this software, a parallelized version of the fmGA, the so-called parallel fast messy GA, has been found to be good at finding semi-optimal answers in reasonable wall-clock time. The aim of this work is to apply a multiobjective approach to solving this problem using a modified fast messy GA. By dividing the CHARMm energy model into separate objectives, it should be possible to find structural configurations of a protein that yield lower energy values and ultimately more correct conformations.
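
    To illustrate what treating energy terms as separate objectives means in practice, the sketch below filters candidate conformations down to the non-dominated (Pareto-optimal) set. It is not the modified fast messy GA from the abstract; the split into two objectives (for example bonded and non-bonded energy) and all numeric values are hypothetical.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse than `b` in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (bonded, non-bonded) energies for five candidate conformations.
candidates = [(10.0, 5.0), (8.0, 7.0), (12.0, 4.0), (9.0, 9.0), (11.0, 6.0)]
front = pareto_front(candidates)
```

    Here (9.0, 9.0) is dominated by (8.0, 7.0) and (11.0, 6.0) by (10.0, 5.0), so three trade-off solutions remain. A multiobjective GA would keep such a front across generations instead of ranking candidates by a single summed energy.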

    Studying DNA opening during transcription by the RNA polymerase II with molecular dynamics simulations, a sampling challenge

    RNA polymerase II (RNAP II) is a macromolecular complex that synthesizes RNA by reading the DNA code, a process called transcription. During the initiation step of transcription, RNAP II opens double-stranded DNA in order to read the DNA code. Since formation of the DNA transcription bubble remains poorly understood, we used molecular dynamics (MD) simulations to provide atomic-level insights into this process. Because DNA opening occurs on time scales that are not accessible to plain MD simulations, we explored different enhanced sampling methods to accelerate the simulations and make the DNA opening process tractable to study. Ultimately, by steering simulations with a combination of (i) guided DNA rotation and (ii) path collective variables, we obtained continuous atomic trajectories of the complete DNA opening process. The simulations provided qualitative insights into the role of loop dynamics and protein-DNA interactions during DNA opening. With the aim of obtaining a more quantitative description of DNA opening, we further explored alternative enhanced sampling techniques on a process that is simpler yet still challenging from a sampling perspective: permeation of the drug fosmidomycin through the OprO porin. This study showed that replica-exchange umbrella sampling (REUS) drastically increases the precision of free energy profiles compared to standard umbrella sampling.

    Disordered Proteins: Connecting Sequences to Emergent Properties

    Many IDPs participate in coupled folding and binding reactions and form alpha helical structures in their bound complexes. Alanine, glycine, or proline scanning mutagenesis approaches are often used to dissect the contributions of intrinsic helicities to coupled folding and binding. These experiments can yield confounding results because the mutagenesis strategy changes the amino acid compositions of IDPs. Therefore, an important next step in mutagenesis-based approaches to mechanistic studies of coupled folding and binding is the design of sequences that satisfy three major constraints. These are (i) achieving a target intrinsic alpha helicity profile; (ii) fixing the positions of residues corresponding to the binding interface; and (iii) maintaining the native amino acid composition. Here, we report the development of a Genetic Algorithm for Design of Intrinsic secondary Structure (GADIS) for designing sequences that satisfy the specified constraints. We describe the algorithm and present results to demonstrate the applicability of GADIS by designing sequence variants of the intrinsically disordered PUMA system that undergoes coupled folding and binding to Mcl-1. Our sequence designs span a range of intrinsic helicity profiles. The predicted variations in sequence-encoded mean helicities are tested against experimental measurements. There is a significant collection of proteins with repeating blocks of oppositely charged residues where the consensus sequence is a block of four Glu residues followed by a block of four Lys or Arg residues, (Glu4(Lys/Arg)4)n. These proteins have been experimentally shown to form long single alpha helices (SAHs) under biologically relevant conditions. However, these results are confounding to disorder predictors and to certain atomistic simulations in that both predict these sequences to be strongly disordered. 
The current working hypothesis is that SAHs are stabilized by i:i+4 salt bridges between opposite charges in consecutive helical turns. We test the merits of this hypothesis to understand the sequence-encoded preference for SAHs and the logic behind the failure of certain atomistic simulations in anticipating the preference for stable SAHs. In simulations with fixed charges, the favorable free energy of solvation of charged residues and the associated loss of sidechain entropy hinder the formation of SAHs. We proposed that alterations to charge states induced by sequence context might play an important role in stabilizing SAHs. We tested this hypothesis using a (Glu4Lys4)n repeat protein and a simulation strategy that permits the substitution of charged residues with neutralized protonated or deprotonated variants of Glu / Lys. Our results predict that stable SAH structures derive from the neutralization of approximately half the Glu residues. These findings explain experimental observations and also provide a coherent rationale for the failure of simulations based on fixed charge models. Large-scale sequence analysis reveals that naturally occurring sequences often include defects in charge patterns such as Gln or Ala substitutions. This sequence-encoded incorporation of uncharged residues combined with neutralization of charged residues might tilt the balance toward alpha helical conformations. Micron-sized, non-membrane bound cellular bodies can form as the result of collective interactions between modules of distinct multidomain proteins. Li et al. have examined the phase diagrams that result for polymers of SH3 domains and proline-rich modules (PRMs) while varying the number of interacting domains. It is noteworthy that flexible, intrinsically disordered linkers connect the interacting units within each polymer. Conventional wisdom holds that linkers play a passive role in determining the phase behavior of multidomain proteins that undergo phase separations. 
Here, we ask if this view is accurate. The motivation for our work comes from recent studies that have uncovered a rich diversity of composition-to-conformation and sequence-to-conformation relationships for intrinsically disordered proteins. The central finding is that disordered regions of proteins have distinct sequence-encoded conformational preferences. Accordingly, we reasoned that the conformational properties of linkers might be a contributing factor, in addition to polyvalency, to the phase behavior of multidomain proteins. We have developed and deployed a three-dimensional lattice model to arrive at a predictive framework to query the effects of linkers on the phase diagrams of polyvalent systems. We find that the critical concentration for phase transition can be influenced by the conformational properties of linkers. Specifically, our results show that linkers modulate the cooperative binding between domains of polymers that are already bound together. Depending on their conformational properties, linkers can also block access to the binding domains via excluded volume effects. Additionally, we find that the properties of the linkers can lead to controls over the mixing of proteins in these bodies. Specifically, we find that there are large ranges of parameters for three protein systems where the bodies isolate specific proteins to different regions of the bodies instead of uniformly mixing them. This result is validated by recent findings of organization inside some observed bodies.
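
    The i:i+4 salt-bridge hypothesis for single alpha helices mentioned in this abstract can be made concrete at the sequence level. The sketch below counts residue pairs four positions apart that carry opposite formal charges. It is an illustrative assumption, not the authors' analysis: it treats Glu/Asp as -1 and Lys/Arg as +1 at fixed charge and says nothing about whether a given pair actually forms a salt bridge in three dimensions.

```python
# Formal side-chain charges under a fixed-charge assumption (illustrative only).
CHARGE = {"E": -1, "D": -1, "K": +1, "R": +1}


def count_i_i4_pairs(seq: str, offset: int = 4) -> int:
    """Count positions i where residues i and i+offset have opposite charges."""
    return sum(
        1
        for i in range(len(seq) - offset)
        if CHARGE.get(seq[i], 0) * CHARGE.get(seq[i + offset], 0) == -1
    )


# Two repeats of the (Glu4Lys4)n consensus block.
repeat_protein = "EEEEKKKK" * 2
n_pairs = count_i_i4_pairs(repeat_protein)
```

    Replacing a charged residue with Gln or Ala in the input string, as in the natural "defects" the abstract describes, lowers the count, which gives a quick way to compare charge patterning between sequence variants.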

    Exploring the Universe of Protein Structures beyond the Protein Data Bank

    It is currently believed that the atlas of existing protein structures is faithfully represented in the Protein Data Bank. However, whether this atlas covers the full universe of all possible protein structures is still a highly debated issue. By using a sophisticated numerical approach, we performed an exhaustive exploration of the conformational space of a 60 amino acid polypeptide chain described with an accurate all-atom interaction potential. We generated a database of around 30,000 compact folds with at least of secondary structure corresponding to local minima of the potential energy. This ensemble plausibly represents the universe of protein folds of similar length; indeed, all the known folds are represented in the set with good accuracy. However, we discover that the known folds form a rather small subset, which cannot be reproduced by choosing random structures in the database. Rather, natural and possible folds differ in their contact order, which is on average significantly smaller in the former. This suggests the presence of an evolutionary bias, possibly related to kinetic accessibility, towards structures with shorter loops between contacting residues. Besides their conceptual relevance, the new structures open a range of practical applications such as the development of accurate structure prediction strategies, the optimization of force fields, and the identification and design of novel folds.
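
    The abstract compares folds by contact order without defining it. A minimal sketch follows, assuming the commonly used relative contact order, i.e. the mean sequence separation of contacting residue pairs divided by chain length. The distance cutoff, the use of one coordinate per residue, and the toy coordinates are assumptions for illustration and may differ from the definition used in the paper.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def relative_contact_order(
    coords: Sequence[Coord], cutoff: float = 8.0, min_separation: int = 1
) -> float:
    """Relative contact order from one representative coordinate per residue.

    Two residues i < j are in contact when their distance is <= cutoff and
    j - i >= min_separation. Returns sum(j - i) / (n_contacts * n_residues).
    """
    n = len(coords)
    total_separation = 0
    n_contacts = 0
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                total_separation += j - i
                n_contacts += 1
    if n_contacts == 0:
        return 0.0
    return total_separation / (n_contacts * n)


# Toy example: a fully extended 5-residue chain with 3.8 Angstrom spacing.
extended_chain = [(3.8 * i, 0.0, 0.0) for i in range(5)]
rco = relative_contact_order(extended_chain, cutoff=4.0)
```

    In the extended chain only sequence neighbours are in contact, so the value is low; folds whose contacts join residues far apart in sequence score higher, which is the property the abstract reports as less common among natural folds.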

    Sequence-structure correlations in the MaSp1 protein of N. clavipes dragline silk

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 77-86). Silk is a hierarchically structured protein fiber with exceptional tensile strength and extensibility, making it one of the toughest and most versatile biocompatible materials. While experimental studies have shown that the molecular structure of silk has a direct influence on the stiffness, toughness, and failure strength of silk, few molecular-level analyses of the nanostructure of silk assemblies, in particular under variations of genetic sequences, have been published. Here, atomistic-level structures of wildtype as well as modified MaSp1 protein from the N. clavipes spider dragline silk sequences are reported, obtained using an in silico approach based on replica exchange molecular dynamics (REMD) and explicit water molecular dynamics. In particular, the atomistic simulations discussed in this parametric study explore the effects of the poly-alanine length of the N. clavipes MaSp1 peptide sequence, solvent conditions, and nanomechanical loading conditions on secondary and tertiary structure predictions as well as the nanomechanical behavior of a unit cell of 15 strands with 900-1000 total residues used to represent a cross-linking [beta]-sheet crystal node in the network within a fibril of the dragline silk thread. Understanding the behavior of this node at the molecular scale is critical for potentially bypassing strength limits at this length scale and vastly improving silk for medical and textile purposes as well as synthetic elastomers and polymer or aramid fiber composites with a similar molecular structure and noncovalent bonding for aerospace, armor, and medical applications. The main hypothesis tested is that there exists a critical minimum length of the poly-alanine repeat that ensures the formation of a robust cross-linking [beta]-sheet crystal. 
Confirming earlier experimental and computational work, a structural analysis reveals that poly-alanine regions in silk predominantly form distinct and orderly [beta]-sheet crystal domains while disorderly regions are formed by glycine-rich repeats that consist of 310-helix type structures and [beta]-turns. These predictions are directly validated against experimental data based on dihedral angle pair calculations presented in Ramachandran plots combined with an analysis of the secondary structure content. The key results of this study are as follows. First, the resulting silk nanostructure depends strongly on the poly-alanine length. The wildtype poly-alanine repeat length of six residues defines a critical minimum length that consistently results in clearly defined [beta]-sheet nanocrystals allowing for misalignment. For poly-alanine lengths below six residues, the [beta]-sheet nanocrystals are not well-defined or not visible at all, while for poly-alanine lengths above six the characteristic nanocomposite structure of silk emerges with no significant improvement of the quality of the sheet nanocrystal geometry. Second, a simple biophysical model is presented that explains the minimum length scale based on the mechanistic insight gained from the molecular simulations. The efficient stacking of the [beta]-sheets of a well-defined crystal reinforces local hydrophobicity and prevents water diffusion into a crystal above a critical size. Third, nanomechanical testing reveals that the combination of the 12-alanine length case and central pull-out loading conditions results in delayed failure by employing a hierarchy of strong [beta]-sheets and soft, extensible semi-amorphous regions to overcome a predicted H-bond saturation. This work constitutes the most comprehensive study to date of the molecular structure prediction and nanomechanical behavior of dragline silk. Building upon previous computational studies that used similar methods for structure prediction and mechanical analysis, e.g. 
REMD and force-control loading, this work presents: the first results of the near-native structures determined by REMD after equilibration in TIP3P explicit solvent; the first parametric study of the effects of modifying the wildtype poly-alanine segment length to values outside the range naturally observed for MaSp1 on structure prediction and nanomechanical behavior; and the first comparison between previously published loading conditions, i.e. the Stretch test, and the novel Pull-out loading conditions that are hypothesized to be more appropriate for modeling the in situ loading of the cross-linking [beta]-sheet crystal. Further parametric studies in peptide sequence to optimize bulk fiber properties must involve changes in simulated nanomechanical loading conditions to properly assess the effects of the changes in peptide sequence. These findings set the stage for understanding how variations in the spidroin sequence can be used to engineer the structure and thereby functional properties of this biological superfiber, and present a design strategy for the genetic optimization of spidroins for enhanced mechanical properties. The approach used here may also find application in the design of other self-assembled molecular structures and fibers and in particular biologically inspired or completely synthetic systems. by Graham Hayden Bratzel. S.M.

    Protein physics by advanced computational techniques: conformational sampling and folded state discrimination

    Proteins are essential parts of organisms and participate in virtually every process within cells. Many proteins are enzymes that catalyze biochemical reactions and are vital to metabolism. Proteins also have structural or mechanical functions, such as actin and myosin in muscle, which are in charge of motion and locomotion of cells and organisms. Other proteins are important for transporting materials, cell signaling, immune response, and several other functions. Proteins are the main building blocks of life. A protein is a polymer chain of amino acids whose sequence is defined in a gene: three nucleotide bases specify one out of the 20 natural amino acids. All amino acids possess common structural features. They have an α-carbon to which an amino group, a carboxyl group, a hydrogen atom and a variable side chain are attached. In a protein, the amino acids are linked together by peptide bonds between the carboxyl and amino groups of adjacent residues.

    Computational Modeling and Design of Protein and Polymeric Nano-Assemblies

    Advances in nanotechnology have the potential to utilize biological and polymeric systems to address fundamental scientific and societal issues, including molecular electronics and sensors, energy-relevant light harvesting, "green" catalysis, and environmental cleanup. In many cases, synthesis and fabrication are well within grasp, but designing such systems requires simultaneous consideration of large numbers of degrees of freedom including structure, sequence, and functional properties. In the case of protein design, even simply considering amino acid identity scales exponentially with the protein length. This work utilizes computational techniques to develop a fundamental, molecularly detailed chemical and physical understanding to investigate and design such nano-assemblies. Throughout, we leverage a probabilistic computational design approach to guide the identification of protein sequences that fold to predetermined structures with targeted function. The statistical methodology is encapsulated in a computational design platform, recently reconstructed with improvements in speed and versatility, to estimate site-specific probabilities of residues through the optimization of an effective sequence free energy. This provides an information-rich perspective on the space of possible sequences which is able to harness the incorporation of new constraints that fit design objectives. The approach is applied to the design and modeling of protein systems incorporating non-biological cofactors, namely (i) an aggregation prone peptide assembly to bind uranyl and (ii) a protein construct to encapsulate a zinc porphyrin derivative with unique photo-physical properties. Additionally, molecular dynamics simulations are used to investigate purely synthetic assemblies of (iii) highly charged semiconducting polymers that wrap and disperse carbon nanotubes. 
Free energy calculations are used to explore the factors that lead to observed polymer-SWNT super-structures, elucidating well-defined helical structures; for chiral derivatives, the simulations corroborate a preference for helical handedness observed in TEM and AFM data. The techniques detailed herein demonstrate how advances in computational chemistry afford greater control and specificity in the engineering of novel nano-materials and offer the potential to greatly advance applications of these systems.

    Quantification of Conformational Heterogeneity and its Role in Protein Aggregation and Unfolding

    Proteins can exhibit significant conformational heterogeneity either under denaturing conditions or in aqueous solutions. The latter is true for a class of proteins whose sequences predispose them to form heterogeneous ensembles of conformations. Characterization of conformational heterogeneity in a protein ensemble requires the quantification of the amplitudes of spontaneous fluctuations in conjunction with information regarding coarse grain measures that report on the average sizes, shapes, and densities. This often demands multiplexed experimental approaches whose readouts are interpreted or annotated using ensembles drawn from atomistic or coarse grain computational simulations. Efforts to characterize conformational heterogeneity contribute directly to our understanding of disorder-to-order transitions in protein folding and self-assembly. These efforts are also crucial to our understanding of the heterotypic interactions involving intrinsically disordered proteins and non-native states of well-folded proteins. These heterotypic interactions are important in signal transduction and the regulation of protein homeostasis. The onset and progression of several systemic and neurodegenerative conformational diseases are linked to the nature and degree of conformational heterogeneity in specific proteins or proteolytic products of proteins. This thesis work focuses on the quantitative characterization of conformational heterogeneity in simulated ensembles of inducibly unfolded and intrinsically disordered proteins. Advances in nuclear magnetic resonance spectroscopy afford the possibility of detailed measurements of inter-residue distances and modulations to the relaxation dynamics of paramagnetic spins that are inserted as probes into a protein. These state-of-the-art measurements show interesting features within denatured state ensembles that cannot be explained using canonical random coil models. 
Here, we use computer simulations to generate plausible facsimiles of denatured state ensembles that reproduce experimental data and demonstrate that the ensembles that are consistent with the data are characterized by the presence of low-likelihood, long-range intra-chain contacts between hydrophobic groups. When placed in the context of sequence conservation information, it appears that these contacts act as gatekeepers that protect proteins from the deleterious consequences of protein aggregation by sequestering hydrophobic groups in an assortment of intra-chain long-range contacts. We also characterize the nature and degree of conformational heterogeneity in glutamine- and asparagine-rich systems. These efforts lead to insights regarding the role of conformational heterogeneity in mediating intermolecular associations that are implicated in aggregation and self-assembly of these systems. Analysis of results from atomistic simulations leads to a phenomenological model for the modulation of conformational heterogeneity and degeneracies of intermolecular interactions by naturally occurring sequences that flank polyglutamine domains. Finally, we develop a formal order parameter to quantify the conformational heterogeneity in simulated ensembles of proteins. When combined with measures of density and fluctuations thereof, it can be used to provide a complete description of the degree and nature of conformational heterogeneity in different ensembles, thus affording the ability to compare different ensembles to each other while also providing a way to categorize conformational transitions.