254 research outputs found

    Protein Structure Determination Using Chemical Shifts

    In this PhD thesis, a novel method to determine protein structures using chemical shifts is presented. Comment: Univ Copenhagen PhD thesis (2014) in Biochemistry

    How reliably can we predict the reliability of protein structure predictions?

    Background: Comparative methods have been the standard techniques for in silico protein structure prediction. The prediction is based on a multiple alignment that contains both reference sequences with known structures and the sequence whose unknown structure is to be predicted. Intensive research has gone into improving the quality of multiple alignments, since misaligned parts of the multiple alignment yield misleading predictions. However, sometimes all methods fail to predict the correct alignment, because the evolutionary signal is too weak to find the homologous parts due to the large number of mutations that separate the sequences. Results: Stochastic sequence alignment methods define a posterior distribution of possible multiple alignments. They can highlight the most likely alignment and, beyond that, give posterior probabilities for each alignment column. We made a comprehensive study on the HOMSTRAD database of structural alignments, predicting secondary structures in four different ways. We showed that alignment posterior probabilities correlate with the reliability of secondary structure predictions, though the strength of the correlation differs between protocols. The correspondence between the reliability of secondary structure predictions and alignment posterior probabilities is closest to the identity function when the secondary structure posterior probabilities are calculated from the posterior distribution of multiple alignments. The largest deviation from the identity function was obtained when predicting secondary structures from a single optimal pairwise alignment. We also showed that alignment posterior probabilities correlate with the 3D distances between the Cα atoms of amino acids in superimposed tertiary structures. Conclusion: Alignment posterior probabilities can be used to detect errors in comparative models a priori, at the sequence alignment level.
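The comparison against the identity function described in this abstract can be illustrated with a simple reliability curve. The sketch below is not the authors' protocol: the function names, the equal-width binning, and the toy inputs are assumptions made for illustration only. It bins predictions by their posterior probability and compares the mean posterior in each bin with the observed fraction of correct predictions.

```python
def reliability_curve(posteriors, correct, n_bins=5):
    """Return (mean posterior, observed accuracy, count) for each non-empty bin.

    posteriors: probabilities in [0, 1] attached to each prediction.
    correct:    booleans, True where the prediction matched the reference.
    The equal-width binning is an illustrative assumption.
    """
    bins = [[] for _ in range(n_bins)]
    for p, ok in zip(posteriors, correct):
        idx = min(int(p * n_bins), n_bins - 1)  # clamp p == 1.0 into the last bin
        bins[idx].append((p, ok))
    curve = []
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        curve.append((mean_p, accuracy, len(members)))
    return curve


def mean_abs_deviation_from_identity(curve):
    """Count-weighted mean of |accuracy - mean posterior| over the bins."""
    total = sum(n for _, _, n in curve)
    return sum(n * abs(acc - mean_p) for mean_p, acc, n in curve) / total


# Toy data (hypothetical): low-posterior predictions wrong, high-posterior ones right.
curve = reliability_curve([0.1, 0.1, 0.9, 0.9], [False, False, True, True], n_bins=2)
deviation = mean_abs_deviation_from_identity(curve)
```

A deviation near zero means the posteriors behave like calibrated reliabilities; larger values correspond to curves further from the identity function.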

    Computational Methods for Conformational Sampling of Biomolecules


    All-Atom Modeling of Protein Folding and Aggregation

    Theoretical investigations of biorelevant processes in life-science research require highly optimized simulation methods. Therefore, massively parallel Monte Carlo algorithms, namely MTM, were successfully developed and applied to reversible protein folding, allowing the thermodynamic characterization of proteins at the atomistic level. Further, the formation process of trans-membrane pores in the TatA system could be elucidated and the structure of the complex could be predicted.

    Learning substitution parameters on phylogenetic trees

    The evolution of molecular sequences can be modeled by a Markov process on a phylogenetic tree, a common representation of the evolutionary relationships among biological entities. Under theoretical conditions, that is, using sequences of infinite length, the evolutionary parameters of nucleotide substitution can be recovered exactly, as J.T. Chang proved. In this project, we adapt Chang's results to real-world scenarios in order to estimate the parameters on a phylogenetic tree using finite sequences. To do so, we implement the FullRecon algorithm proposed by E. Mossel and S. Roch, which allows us to recover the transition matrices that model the evolution of the sequences.
Moreover, we develop an alignment simulator that, given a tree, generates its transition matrices with a preset number of substitutions and an alignment with the leaf sequences. The simulator allows us to study the performance of FullRecon and identify the factors that most influence it. Finally, we compare the FullRecon estimates against those produced by another phylogenetic software package, IQ-TREE.
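The idea of a Markov process on a tree, with one transition matrix per edge, can be sketched as follows. This is not the simulator from the project: the tree representation, the function names, and the example matrix are hypothetical and chosen only to show the mechanism. Each site of the parent sequence is mutated independently using the row of the edge matrix indexed by the current nucleotide, and the sequences reaching the leaves form the alignment.

```python
import random

NUCLEOTIDES = "ACGT"


def evolve(seq, matrix, rng):
    """Mutate each site independently using a row-stochastic 4x4 matrix.

    matrix[i][j] is the probability that NUCLEOTIDES[i] becomes NUCLEOTIDES[j]
    along one edge.
    """
    out = []
    for base in seq:
        row = matrix[NUCLEOTIDES.index(base)]
        out.append(rng.choices(NUCLEOTIDES, weights=row, k=1)[0])
    return "".join(out)


def simulate_leaves(tree, seq, rng):
    """Propagate seq from the root down to the leaves.

    tree is a dict {"name": str, "children": [(edge_matrix, subtree), ...]};
    this layout is an assumption made for the sketch.
    Returns {leaf name: leaf sequence}.
    """
    if not tree["children"]:
        return {tree["name"]: seq}
    leaves = {}
    for matrix, child in tree["children"]:
        leaves.update(simulate_leaves(child, evolve(seq, matrix, rng), rng))
    return leaves


# Illustrative edge matrix: each base is kept with probability 0.91.
NOISY = [[0.91 if i == j else 0.03 for j in range(4)] for i in range(4)]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

tree = {
    "name": "root",
    "children": [
        (NOISY, {"name": "leaf_a", "children": []}),
        (IDENTITY, {"name": "leaf_b", "children": []}),
    ],
}
alignment = simulate_leaves(tree, "ACGTACGTAC", random.Random(0))
```

An estimator such as FullRecon would then be given only the leaf sequences and asked to recover the edge matrices, which is why a simulator with known matrices is useful for benchmarking.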

    Exploration of the Disambiguation of Amino Acid Types to Chi-1 Rotamer Types in Protein Structure Prediction and Design

    A protein’s global fold provides insight into function; however, functional specificity is often encoded in sidechain orientation. Thus, determining rotamer conformations is often crucial in protein structure/function prediction and design. For all non-glycine and non-alanine types, chi-1 rotamers occupy a small number of discrete states. Herein, we explore the possibility of describing evolution from the perspective of sidechain structure rather than the traditional twenty amino acid types. To test our hypothesis that this perspective is more crucial to our understanding of evolutionary relationships, we investigate its use in evolutionary substitution matrices for sequence alignment in fold recognition, and in computational protein design with a specific focus on designing beta sheet environments, where previous studies have considered amino acid types alone. Throughout this study, we also propose the concept of the “chi-1 rotamer sequence”, which describes the chi-1 rotamer composition of a protein. We also present attempts to predict these sequences and real-value torsion angles from amino acid sequence information. First, we describe our development of log-odds scoring matrices for sequence alignments. Log-odds substitution matrices are widely used in sequence alignments for their ability to determine evolutionary relationships between proteins. Traditionally, databases of sequence information guide the construction of these matrices, which illustrates their power in discovering distant or weak homologs. Weak homologs, typically those that share low sequence identity (< 30%), are often difficult to identify using basic amino acid sequence alignment alone. While protein threading approaches have addressed this issue, many of these approaches include sequence-based information or profiles guided by amino acid-based substitution matrices, namely BLOSUM62. 
Here, we generated a structure-based substitution matrix derived from TM-align structural alignments that captures both the sequence mutation rate within folds of the same protein family and the chi-1 rotamer that represents each amino acid. These rotamer substitution matrices (ROTSUMs) discover new homologs and improved alignments in the PDB that traditional substitution matrices, based solely on sequence information, cannot identify. Certain tools and algorithms to estimate rotamer torsion angles have been developed, but they typically require knowledge of backbone coordinates and/or experimental data to guide the prediction. Herein, we developed a fragment-based algorithm, Rot1Pred, to determine the chi-1 state at each position of a given amino acid sequence, yielding a chi-1 rotamer sequence. This approach employs fragment matching of the query sequence to sequence-structure fragment pairs in the PDB to predict the query’s sidechain structure information. Real-value torsion angles were also predicted and compared against SCWRL4. Results show that overall, and for most amino acid types, Rot1Pred can calculate chi-1 torsion angles significantly closer to native angles than SCWRL4 when evaluated on I-TASSER generated model backbones. Finally, we developed and explored chi-1-rotamer-based statistical potentials and evolutionary profiles constructed for de novo computational protein design. Previous analyses that aim to energetically describe the preference of amino acid types in beta sheet environments (parallel vs antiparallel packing or N- and C-terminal beta strand capping) have been performed with amino acid types, with no explicit rotamer representation in their scoring functions. In our study, we construct statistical functions that describe chi-1 rotamer preferences in these environments and illustrate their improvement over previous methods. 
These specialized knowledge-based energy functions have generated sequences whose I-TASSER predicted models are structurally similar to their input structures yet have low sequence identity. PhD, Chemical Biology, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/145951/1/jarrettj_1.pd
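The log-odds construction mentioned in this abstract can be sketched generically. The code below is not the ROTSUM procedure from the thesis: it is a minimal sketch that assumes a list of aligned symbol pairs and uses no pseudocounts or sequence clustering, and the rotamer-labelled symbols such as `"V:t"` are hypothetical. A score is the base-2 logarithm of the observed pair frequency divided by the frequency expected if the two symbols occurred independently.

```python
import math
from collections import Counter


def log_odds_matrix(aligned_pairs):
    """Build log-odds scores s(a, b) = log2(q_ab / (p_a * p_b)).

    aligned_pairs: iterable of (symbol, symbol) taken from alignment columns.
    Each pair is counted in both orders so the result is symmetric.
    Pairs never observed get no entry (no pseudocounts in this sketch).
    """
    pair_counts = Counter()
    for a, b in aligned_pairs:
        pair_counts[(a, b)] += 1
        pair_counts[(b, a)] += 1
    total = sum(pair_counts.values())

    # Background frequency of each symbol is the marginal of the pair frequencies.
    background = Counter()
    for (a, _), n in pair_counts.items():
        background[a] += n / total

    return {
        (a, b): math.log2((n / total) / (background[a] * background[b]))
        for (a, b), n in pair_counts.items()
    }


# Hypothetical symbols: amino acid type plus a chi-1 state label.
pairs = [("V:t", "V:t"), ("V:t", "V:t"), ("I:m", "I:m"), ("V:t", "I:m")]
scores = log_odds_matrix(pairs)
```

Positive scores mark pairs seen more often than chance and negative scores mark pairs seen less often, which is the property that makes such matrices usable as alignment scoring schemes.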

    Primary Structure and Solution Conditions Determine Conformational Ensemble Properties of Intrinsically Disordered Proteins

    Intrinsically disordered proteins (IDPs) are a class of proteins that do not exhibit well-defined three-dimensional structures. The absence of structure is intrinsic to their amino acid sequences, which are characterized by low hydrophobicity and high net charge per residue compared to folded proteins. Contradicting the classic structure-function paradigm, IDPs are capable of interacting with high specificity and affinity, often acquiring order in complex with protein and nucleic acid binding partners. This phenomenon is evident during cellular activities involving IDPs, which include transcriptional and translational regulation, cell cycle control, signal transduction, molecular assembly, and molecular recognition. Although approximately 30% of eukaryotic proteomes are intrinsically disordered, the nature of IDP conformational ensembles remains unclear. In this dissertation, we describe relationships connecting characteristics of IDP conformational ensembles to their primary structures and solution conditions. Using molecular simulations and fluorescence experiments on a set of base-rich IDPs, we find that net charge per residue segregates conformational ensembles along a globule-to-coil transition. Speculatively generalizing this result, we propose a phase diagram that predicts an IDP's average size and shape based on sequence composition and use it to generate hypotheses for a broad set of intrinsically disordered regions (IDRs). Simulations reveal that acid-rich IDRs, unlike their oppositely charged base-rich counterparts, exhibit disordered globular ensembles despite intra-chain repulsive electrostatic interactions. This apparent asymmetry is sensitive to simulation parameters for representing alkali and halide salt ions, suggesting that solution conditions modulate IDP conformational ensembles. We refine the ion parameters using a calibration procedure that relies exclusively on crystal lattice properties. 
Simulations with these parameters recover swollen coil behavior for acid-rich IDRs, but also uncover a dependence on sequence patterning for polyampholytic IDPs. These contributions initiate an endeavor to elucidate general principles that enable prediction of an IDP's conformational ensemble based on primary structure and solution conditions, a goal analogous to structure prediction for folded proteins. Such principles would provide a molecular basis for understanding the roles of IDPs in physiology and pathophysiology, guide development of agents that modulate their behavior, and enable their rational design from chosen specifications
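Net charge per residue, the sequence descriptor this abstract relies on, is straightforward to compute from a one-letter amino acid sequence. The sketch below makes simplifying assumptions that are not taken from the dissertation: lysine and arginine count as +1, aspartate and glutamate as -1, and histidine and the chain termini are treated as neutral. The function name is hypothetical.

```python
def net_charge_per_residue(sequence, positive="KR", negative="DE"):
    """Return (number of positive - number of negative residues) / length.

    Histidine and terminal charges are ignored; this is a simplifying
    assumption of the sketch, roughly corresponding to neutral pH.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    n_pos = sum(1 for residue in seq if residue in positive)
    n_neg = sum(1 for residue in seq if residue in negative)
    return (n_pos - n_neg) / len(seq)


# Toy sequences (hypothetical): base-rich, acid-rich, and a balanced polyampholyte.
base_rich = net_charge_per_residue("KKRRGG")
acid_rich = net_charge_per_residue("DEDE")
balanced = net_charge_per_residue("KDGG")
```

Note that a balanced polyampholyte has a net charge per residue of zero regardless of how its charges are arranged, so this single number cannot capture the dependence on sequence patterning that the abstract reports.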