39 research outputs found

    Improving Structural Features Prediction in Protein Structure Modeling

    Proteins play a vital role in the biological activities of all living species. In nature, a protein folds into a specific and energetically favorable three-dimensional structure which is critical to its biological function. Hence, there has been a great effort by researchers in both experimentally determining and computationally predicting the structures of proteins. The current experimental methods of protein structure determination are complicated, time-consuming, and expensive. On the other hand, the sequencing of proteins is fast, simple, and relatively inexpensive. Thus, the gap between the number of known sequences and the number of determined structures is growing, and is expected to keep expanding. Computational approaches that can generate high-resolution three-dimensional protein models are therefore attractive, due to their broad economic and scientific impacts. Accurately predicting protein structural features, such as secondary structures, disulfide bonds, and solvent accessibility, is a critical intermediate stepping stone toward correct three-dimensional models. In this dissertation, we report a set of approaches for improving the accuracy of structural features prediction in protein structure modeling. First, we derive a statistical model to generate context-based scores characterizing the favorability of segments of residues in adopting certain structural features. Then, together with other information such as evolutionary and sequence information, we incorporate the context-based scores in machine learning approaches to predict secondary structures, disulfide bonds, and solvent accessibility. Furthermore, we take advantage of emerging high-performance GPU computing architectures to accelerate the calculation of pairwise and high-order interactions in context-based scores. Finally, we make these prediction methods available to the public via web services and software packages.

    Structural Exploration and Conformational Transitions in MDM2 upon DHFR Interaction from Homo sapiens

    Trends in template/fragment-free protein structure prediction

    Predicting the structure of a protein from its amino acid sequence is a long-standing unsolved problem in computational biology. Its solution would be of both fundamental and practical importance as the gap between the number of known sequences and the number of experimentally solved structures widens rapidly. Currently, the most successful approaches are based on fragment/template reassembly. The lack of progress in template-free structure prediction calls for novel ideas and approaches. This article reviews trends in the development of physical and specific knowledge-based energy functions as well as sampling techniques for fragment-free structure prediction. Recent physical- and knowledge-based studies demonstrated that it is possible to sample and predict highly accurate protein structures without borrowing native fragments from known protein structures. These emerging approaches with fully flexible sampling have the potential to move the field forward.

    A Set of Computationally Designed Orthogonal Antiparallel Homodimers that Expands the Synthetic Coiled-Coil Toolkit

    Molecular engineering of protein assemblies, including the fabrication of nanostructures and synthetic signaling pathways, relies on the availability of modular parts that can be combined to give different structures and functions. Currently, a limited number of well-characterized protein interaction components are available. Coiled-coil interaction modules have been demonstrated to be useful for biomolecular design, and many parallel homodimers and heterodimers are available in the coiled-coil toolkit. In this work, we sought to design a set of orthogonal antiparallel homodimeric coiled coils using a computational approach. There are very few antiparallel homodimers described in the literature, and none have been measured for cross-reactivity. We tested the ability of the distance-dependent statistical potential DFIRE to predict orientation preferences for coiled-coil dimers of known structure. The DFIRE model was then combined with the CLASSY multistate protein design framework to engineer sets of three orthogonal antiparallel homodimeric coiled coils. Experimental measurements confirmed the successful design of three peptides that preferentially formed antiparallel homodimers that, furthermore, did not interact with one additional previously reported antiparallel homodimer. Two designed peptides that formed higher-order structures suggest how future design protocols could be improved. The successful designs represent a significant expansion of the existing protein-interaction toolbox for molecular engineers. Funding: National Science Foundation (U.S.) (MCB-0950233); United States. National Institutes of Health (GM67681); National Science Foundation (U.S.) (DBI-0821391)

    A global optimization approach for searching low energy conformations of proteins

    De novo protein structure prediction and understanding the protein folding mechanism are outstanding challenges of biological physics. Relying on the thermodynamic hypothesis of protein folding, it is expected that the native state of a protein can be identified if the global minimum of the free energy surface is found. Understanding the energy landscape, or free energy surface, is challenging. The structure and dynamics of proteins are manifestations of the underlying potential energy surface. Here the potential energy function uses an all-atom representation and purely physics-based interactions. For solvated proteins, the effective free energy is defined by combining an implicit solvation model, which supplies the solvation free energy, with a standard all-atom biomolecular force field. A major challenge is to search for the global minimum on this effective free energy surface. In this work, the Minima Hopping Algorithm (MHOP), which finds global minima on potential energy surfaces, has been used for protein structure prediction or, more generally, for finding the lowest-energy conformations of proteins. Proteins have been studied both in vacuo and in an aqueous medium. For short peptides, starting from a completely extended conformation, we could find conformational minima that are very close to the experimentally observed structures.

    The aqueous environment as an active participant in the protein folding process

    © 2018 The Authors. Existing computational models applied in the protein structure prediction process do not sufficiently account for the presence of the aqueous solvent. The solvent is usually represented by a predetermined number of H2O molecules in the bounding box which contains the target chain. The fuzzy oil drop (FOD) model, presented in this paper, follows an alternative approach, with the solvent assuming the form of a continuous external hydrophobic force field, with a Gaussian distribution. The effect of this force field is to guide hydrophobic residues towards the center of the protein body, while promoting exposure of hydrophilic residues on its surface. This work focuses on the following sample proteins: Engrailed homeodomain (RCSB: 1enh), Chicken villin subdomain hp-35, n68h (RCSB: 1yrf), Chicken villin subdomain hp-35, k65(nle), n68h, k70(nle) (RCSB: 2f4k), Thermostable subdomain from chicken villin headpiece (RCSB: 1vii), de novo designed single chain three-helix bundle (a3d) (RCSB: 2a3d), albumin-binding domain (RCSB: 1prb) and lambda repressor-operator complex (RCSB: 1lmb).
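
    The Gaussian hydrophobic field described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the choice of the centroid as the field center, and the rule that sets each axis width to one third of the largest extent along that axis are all assumptions made here for illustration. Each residue position receives an idealized hydrophobicity value that is highest at the center and decays towards the surface, normalized so that the values sum to 1.

```python
import math

def fod_idealized_hydrophobicity(coords, sigmas=None):
    """Idealized per-residue hydrophobicity from a 3D Gaussian (illustrative sketch).

    coords: list of (x, y, z) positions, one per residue (hypothetical input).
    sigmas: optional (sx, sy, sz) widths; if omitted, an assumed rule is used.
    """
    n = len(coords)
    # Center the field on the centroid of the supplied positions.
    centroid = [sum(c[a] for c in coords) / n for a in range(3)]
    centered = [[c[a] - centroid[a] for a in range(3)] for c in coords]

    if sigmas is None:
        # Assumption: sigma = one third of the largest extent along each axis;
        # fall back to 1.0 for a degenerate (flat) axis to avoid dividing by zero.
        sigmas = []
        for a in range(3):
            extent = max(abs(p[a]) for p in centered)
            sigmas.append(extent / 3.0 if extent > 0 else 1.0)

    # Unnormalized Gaussian value at each residue position.
    raw = [
        math.exp(-sum(p[a] ** 2 / (2.0 * sigmas[a] ** 2) for a in range(3)))
        for p in centered
    ]
    total = sum(raw)
    # Normalize so the profile sums to 1 and can be compared across proteins.
    return [r / total for r in raw]


if __name__ == "__main__":
    toy = [(0, 0, 0), (1, 0, 0), (-1, 0, 0), (0, 5, 0), (0, -5, 0)]
    for xyz, h in zip(toy, fod_idealized_hydrophobicity(toy)):
        print(xyz, round(h, 4))
```

    In a comparison of this kind, the idealized profile would be set against a profile computed from the actual residue hydrophobicities; how that observed profile is obtained is not stated in the abstract and is left out of the sketch.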

    De Novo Protein Structure Modeling and Energy Function Design

    The two major challenges in protein structure prediction problems are (1) the lack of an accurate energy function and (2) the lack of an efficient search algorithm. A protein energy function that accurately describes the interaction between residues can guide the optimization of a protein conformation, as well as select native or native-like structures from numerous possible conformations. An efficient search algorithm must be able to reduce a conformational space to a reasonable size without missing the native conformation. My PhD research studies focused on these two directions. A protein energy function, the distance and orientation dependent energy function of amino acid key blocks (DOKB), was proposed to evaluate the stability of proteins; it contains a distance term, an orientation term, and a highly packed term. In this energy function, key blocks of each amino acid were used to represent each residue; a novel reference state was used to normalize block distributions. The dependent relationship between the orientation term and the distance term was revealed, representing the preference of different orientations at different distances between key blocks. Compared with four widely used energy functions using six general benchmark decoy sets, the DOKB appeared to perform very well in recognizing native conformations. Additionally, the highly packed term in the DOKB played an important role in stabilizing protein structures containing highly packed residues. The cluster potential adjusted the reference state of highly packed areas and significantly improved the recognition of the native conformations in the ig_structal data set. The DOKB is not only an alternative protein energy function for protein structure prediction, but it also provides a different view of the interaction between residues. The top-k search algorithm was optimized to be used for proteins containing both α-helices and β-sheets.
    Secondary structure elements (SSEs) are visible in cryo-electron microscopy (cryo-EM) density maps. Combined with the SSEs predicted in a protein sequence, it is feasible to determine the topologies, that is, the order and direction of the SSEs in the cryo-EM density map with respect to the SSEs in the protein sequence. Our group member Dr. Al Nasr proposed the top-k search algorithm, which searches for the top-k possible topologies for a target protein. It was the most effective algorithm so far. However, this algorithm only works well for pure α-helix proteins due to the complexity of the topologies of β-sheets. Based on the known protein structures in the Protein Data Bank (PDB), we noticed that some topologies in β-sheets had a high preference, whereas some topologies never appeared. The preference of different topologies of β-sheets was introduced into the optimized top-k search algorithm to adjust the edge weight between nodes. Compared with the previous results, this optimization significantly improved the performance of the top-k algorithm on proteins containing both α-helices and β-sheets.

    Scoring predictive models using a reduced representation of proteins: model and energy definition

    BACKGROUND: Reduced representations of proteins have been playing a key role in the study of protein folding. Many such models are available, with different levels of representation detail. Although the usefulness of many such models for structural bioinformatics applications has been demonstrated in recent years, there are few intermediate-resolution models endowed with an energy model capable, for instance, of detecting native or native-like structures among decoy sets. The aim of the present work is to provide a discrete empirical potential, suitable for protein model quality assessment, for a reduced protein model termed here PC2CA, because it employs a PseudoCovalent structure with only 2 Centers of interactions per Amino acid. RESULTS: All protein structures in the set top500H have been converted to reduced form. The distributions of pseudobonds, pseudoangles, pseudodihedrals and distances between centers of interactions have been converted into potentials of mean force. A suitable reference distribution has been defined for non-bonded interactions which takes into account excluded volume effects and protein finite size. The correlation between adjacent main chain pseudodihedrals has been converted into an additional energetic term which is able to account for cooperative effects in secondary structure elements. Local energy surface exploration is performed in order to increase the robustness of the energy function. CONCLUSION: The model and the energy definition proposed have been tested on all the multiple decoy sets in the Decoys'R'us database. The energetic model is able to recognize, for almost all sets, native-like structures (RMSD less than 2.0 Å). These results and those obtained in the blind CASP7 quality assessment experiment suggest that the model compares well with scoring potentials with finer granularity and could be useful for fast exploration of conformational space. Parameters are available at the url:
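
    The conversion of an observed distribution into a potential of mean force, mentioned in the RESULTS part of this abstract, is commonly done with the inverse Boltzmann relation E = -kT ln(p_obs / p_ref). The sketch below illustrates that general relation only. It is not the PC2CA parameterization: the function name, the bin width, the pseudocount smoothing, and the use of a plain sample as the reference distribution are assumptions made here, whereas the paper defines its own reference state that accounts for excluded volume and finite protein size.

```python
import math
from collections import Counter

KT = 0.593  # kcal/mol, approximately kT at 298 K

def potential_of_mean_force(observed, reference, bin_width=1.0, pseudocount=1.0):
    """Return {bin_index: energy} using E = -kT * ln(p_obs / p_ref).

    observed:  values (e.g. distances in Angstrom) sampled from known structures.
    reference: values sampled from an assumed reference state.
    """
    obs_counts = Counter(int(v // bin_width) for v in observed)
    ref_counts = Counter(int(v // bin_width) for v in reference)
    bins = sorted(set(obs_counts) | set(ref_counts))

    # Pseudocounts keep empty bins from producing log(0) or division by zero.
    n_obs = len(observed) + pseudocount * len(bins)
    n_ref = len(reference) + pseudocount * len(bins)

    energies = {}
    for b in bins:
        p_obs = (obs_counts[b] + pseudocount) / n_obs
        p_ref = (ref_counts[b] + pseudocount) / n_ref
        # Bins populated more often than the reference get negative (favorable) energy.
        energies[b] = -KT * math.log(p_obs / p_ref)
    return energies


if __name__ == "__main__":
    obs = [4.2, 4.5, 4.8, 4.1, 7.5]   # toy data, not taken from any structure set
    ref = [4.5, 5.5, 6.5, 7.5, 7.2]
    for b, e in potential_of_mean_force(obs, ref).items():
        print(f"bin {b}: {e:+.3f} kcal/mol")
```

    With the toy data, the 4 to 5 bin is over-represented relative to the reference and receives a negative energy, while the 5 to 6 bin is under-represented and receives a positive one.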

    The aqueous environment as an active participant in the protein folding process

    Existing computational models applied in the protein structure prediction process do not sufficiently account for the presence of the aqueous solvent. The solvent is usually represented by a predetermined number of H2O molecules in the bounding box which contains the target chain. The fuzzy oil drop (FOD) model, presented in this paper, follows an alternative approach, with the solvent assuming the form of a continuous external hydrophobic force field, with a Gaussian distribution. The effect of this force field is to guide hydrophobic residues towards the center of the protein body, while promoting exposure of hydrophilic residues on its surface. This work focuses on the following sample proteins: Engrailed homeodomain (RCSB: 1enh), Chicken villin subdomain hp-35, n68h (RCSB: 1yrf), Chicken villin subdomain hp-35, k65(nle), n68h, k70(nle) (RCSB: 2f4k), Thermostable subdomain from chicken villin headpiece (RCSB: 1vii), de novo designed single chain three-helix bundle (a3d) (RCSB: 2a3d), albumin-binding domain (RCSB: 1prb) and lambda repressor-operator complex (RCSB: 1lmb). © 2018 The Authors. Published by Elsevier Inc.