16 research outputs found

    GalaxyGPCRloop: Template-Based and Ab Initio Structure Sampling of the Extracellular Loops of G-Protein-Coupled Receptors

    No full text
    The second extracellular loops (ECL2s) of G-protein-coupled receptors (GPCRs) are often involved in GPCR functions, and their structures have important implications in drug discovery. However, structure prediction of ECL2 is difficult because of its long length and the structural diversity among different GPCRs. In this study, a new ECL2 conformational sampling method involving both template-based and ab initio sampling was developed. Inspired by the observation of similar ECL2 structures of closely related GPCRs, a template-based sampling method employing loop structure templates selected from the structure database was developed. A new metric for evaluating similarity of the target loop to templates was introduced for template selection. An ab initio loop sampling method was also developed to treat cases without highly similar templates. The ab initio method is based on the previously developed fragment assembly and loop closure method. A new sampling component that takes advantage of secondary structure prediction was added. In addition, a conserved disulfide bridge restraining ECL2 conformation was predicted and analytically incorporated into sampling, reducing the effective dimension of the conformational search space. The sampling method was combined with an existing energy function for comparison with previously reported loop structure prediction methods, and the benchmark test demonstrated outstanding performance.

    Protein Loop Modeling Using a New Hybrid Energy Function and Its Application to Modeling in Inaccurate Structural Environments

    No full text
    Protein loop modeling is a tool for predicting protein local structures of particular interest, providing opportunities for applications involving protein structure prediction and de novo protein design. Until recently, the majority of loop modeling methods have been developed and tested by reconstructing loops in frameworks of experimentally resolved structures. In many practical applications, however, the protein loops to be modeled are located in inaccurate structural environments. These include loops in model structures, low-resolution experimental structures, or experimental structures of different functional forms. Accordingly, discrepancies in the accuracy of the structural environment assumed in development of the method and that in practical applications present additional challenges to modern loop modeling methods. This study demonstrates a new strategy for employing a hybrid energy function combining physics-based and knowledge-based components to help tackle this challenge. The hybrid energy function is designed to combine the strengths of each energy component, simultaneously maintaining accurate loop structure prediction in a high-resolution framework structure and tolerating minor environmental errors in low-resolution structures. A loop modeling method based on global optimization of this new energy function is tested on loop targets situated in different levels of environmental errors, ranging from experimental structures to structures perturbed in backbone as well as side chains and template-based model structures. The new method performs comparably to force field-based approaches in loop reconstruction in crystal structures and better in loop prediction in inaccurate framework structures. This result suggests that higher-accuracy predictions would be possible for a broader range of applications. The web server for this method is available at http://galaxy.seoklab.org/loop with the PS2 option for the scoring function.

    Examples of loops modeled in inaccurate environmental structures.

    No full text
    In all panels, the crystal structures are colored in green and the models in magenta. Framework structures are shown transparent for clarity. (A) Two examples of tolerating errors in surrounding side chains, 1oyc (left; RMSD = 0.4 Å) and 1c5e (right; RMSD = 0.5 Å). The loop-framework salt bridges in the crystal structures are indicated with black dotted lines. High-accuracy modeling is possible even though the salt bridges cannot be recovered owing to the perturbed arginine orientations in the framework. (B) An example of unsuccessful modeling in the framework of perturbed side-chains, 1oth (RMSD = 2.3 Å), showing the necessity of additional sampling. The perturbed Arg66 and Tyr345 side chains (magenta) would clash with the two leucine residues in the loop if the crystal loop structure were to be placed. (C) Two examples of tolerating additional backbone errors, 1my7 (left; RMSD = 1.0 Å) and 1cb0 (right; RMSD = 0.9 Å). The overall backbone trace and key side-chain interactions are well reproduced.

    A successful example of loop modeling in the framework of a template-based model.

    No full text
    The crystal structure is colored in green and the model in magenta (1avk, RMSD = 1.5 Å). Framework structures are shown transparent for clarity. Loops of three templates (used for template-based modeling) are shown with yellow transparent ribbons for comparison.

    Distributions of environmental errors for the three types of test sets employed in the study.

    No full text
    (A) for the test set of crystal structures with perturbed side chains, (B) for the crystal structures with both backbone and side chains perturbed, and (C) for the template-based models. The gray curve behind the histogram represents an interpolation. The average E-RMSD values are 0.9 Å, 2.1 Å, and 2.8 Å for the side chain-perturbed set (A), the backbone-perturbed set (B), and the template-based model set (C), respectively. E-RMSD is the all-atom RMSD of the environment residues, defined as residues with any atom within 10 Å of any loop Cβ atom.
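The E-RMSD definition in this caption can be illustrated with a minimal Python sketch. Several details are assumptions, not taken from the source: the function names and data layout are hypothetical, the environment is selected on the reference (crystal) structure, the cutoff is inclusive, the caller passes only non-loop residues, and both structures are assumed to be already in a common frame with identical atom ordering (no superposition is performed).

```python
import math

def environment_residues(residue_atoms, loop_cb_coords, cutoff=10.0):
    """Return IDs of residues with any atom within `cutoff` Å of any loop Cβ.

    residue_atoms: dict mapping residue ID -> list of (x, y, z) atom coordinates
                   (assumed to contain non-loop residues only).
    loop_cb_coords: list of (x, y, z) coordinates of the loop Cβ atoms.
    """
    return [
        res_id
        for res_id, atoms in residue_atoms.items()
        if any(math.dist(atom, cb) <= cutoff
               for atom in atoms for cb in loop_cb_coords)
    ]

def rmsd(coords_a, coords_b):
    """Plain RMSD between two equally ordered coordinate lists (no fitting)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

def e_rmsd(ref_atoms, model_atoms, loop_cb_coords, cutoff=10.0):
    """All-atom RMSD over environment residues selected on the reference."""
    env = environment_residues(ref_atoms, loop_cb_coords, cutoff)
    ref = [xyz for res_id in env for xyz in ref_atoms[res_id]]
    mod = [xyz for res_id in env for xyz in model_atoms[res_id]]
    return rmsd(ref, mod)
```

With toy coordinates, a residue 5 Å from the loop Cβ is counted as environment while one about 20 Å away is ignored, so only the former contributes to the value.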

    Sampling results of GalaxyLoop-PS2 on the three test sets.

    No full text
    1) Number of loop targets for which at least one structure among the 30 loop conformations (or 50 conformations for 12-residue loops) in the final CSA bank is within a given RMSD value.
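The counting rule in this table note can be sketched as follows. This is an illustration only: the function name, the dictionary layout, and the inclusive comparison are assumptions, and the target labels in the usage example are placeholders, not results from the study.

```python
def count_sampled_within(bank_rmsds, cutoff):
    """Count targets whose bank holds at least one conformation within `cutoff` Å.

    bank_rmsds: dict mapping a target label -> list of loop RMSD values (Å),
                one per conformation in that target's final bank.
    """
    return sum(
        1 for rmsds in bank_rmsds.values()
        if rmsds and min(rmsds) <= cutoff
    )

# Placeholder values for three hypothetical targets.
toy_bank = {"target_a": [0.8, 2.5], "target_b": [1.6, 3.0], "target_c": [2.2]}
counts = {c: count_sampled_within(toy_bank, c) for c in (1.0, 2.0, 3.0)}
```

Because only the best conformation per target matters, the count measures sampling coverage, independent of whether the energy function ranks that conformation first.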

    Comparison of loop modeling results on the test set of template-based models.

    No full text
    The average RMSD and its standard deviation are reported in Å. The Loop RMSD is calculated as the root-mean-square deviation of the main-chain atoms N, Cα, C, and O.
    1) Loop conformations generated by MODELLER [30].
    2) Loop conformations generated by loop refinement using ModLoop of MODELLER [1], [27].
    3) Results of the best-score models sampled by Next-generation KIC (NGK) using the protocol provided by Stein et al. [18]. 500 models were generated for each target as in Stein et al. The Rosetta program v3.5 was used.
    4) Loop set constructed in this study. See Table S7 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0113811#pone.0113811.s008) for the list of loops.
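The Loop RMSD over the main-chain atoms N, Cα, C, and O can be written out as a short Python sketch. The data layout and function name are hypothetical, and the sketch assumes the model and reference are already in the same frame (for example, superimposed on the framework beforehand), which the caption does not state.

```python
import math

BACKBONE_ATOMS = ("N", "CA", "C", "O")  # PDB-style names for N, Cα, C, O

def loop_rmsd(ref_loop, model_loop):
    """Main-chain RMSD between two loops with matching residue order.

    Each loop is a list of residues; each residue is a dict mapping an
    atom name to its (x, y, z) coordinates. Side-chain atoms are ignored.
    """
    if len(ref_loop) != len(model_loop) or not ref_loop:
        raise ValueError("loops must be non-empty and equal in length")
    squared, count = 0.0, 0
    for ref_res, model_res in zip(ref_loop, model_loop):
        for name in BACKBONE_ATOMS:
            squared += math.dist(ref_res[name], model_res[name]) ** 2
            count += 1
    return math.sqrt(squared / count)
```

Restricting the sum to the four backbone atoms means that side-chain placement errors do not affect the reported value.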

    Protein Ensemble Generation Through Variational Autoencoder Latent Space Sampling

    No full text
    Mapping the ensemble of protein conformations that contribute to function and can be targeted by small-molecule drugs remains an outstanding challenge. Here, we explore the use of variational autoencoders for reducing the challenge of dimensionality in the protein structure ensemble generation problem. We convert high-dimensional protein structural data into a continuous, low-dimensional representation, carry out a search in this space guided by a structure quality metric, and then use RoseTTAFold guided by the sampled structural information to generate 3D structures. We use this approach to generate ensembles for the cancer-relevant protein K-Ras, train the VAE on a subset of the available K-Ras crystal structures and MD simulation snapshots, and assess the extent of sampling close to crystal structures withheld from training. We find that our latent space sampling procedure rapidly generates ensembles with high structural quality and is able to sample within 1 Å of held-out crystal structures, with a consistency higher than that of MD simulation or AlphaFold2 prediction. The sampled structures sufficiently recapitulate the cryptic pockets in the held-out K-Ras structures to allow for small-molecule docking.

    Comparison of loop modeling results by the average RMSD of main chain atoms (N, Cα, C, and O) of loops in angstroms (Å) on test sets of varying environmental accuracies measured by E-RMSD.

    No full text
    Standard deviations are also reported.
    1) Loop sampling methods sample only the loop region, while extended sampling methods sample surrounding side chains in addition to the loop.
    2) Taken from Sellers et al. [15].
    3) Taken from Mandell et al. [16].
    4) Results of the best-score models out of 500 models sampled for each target following the protocol provided by Stein et al. [18] with Rosetta v3.5. The results for the crystal structure set and the side chain-perturbed set are the same for NGK because extended sampling of loop environment was used for both sets.
    5) Loop sets taken from Jacobson et al. [10]. See Tables S1 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0113811#pone.0113811.s002) and S2 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0113811#pone.0113811.s003) for the list of loops.
    6) Loop sets from Zhu et al. [34]. See Tables S1 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0113811#pone.0113811.s002) and S2 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0113811#pone.0113811.s003) for the list of loops.
    7) Loop set from Fiser et al. [1]. See Table S3 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0113811#pone.0113811.s004) for the list of loops.

    Simultaneous Optimization of Biomolecular Energy Functions on Features from Small Molecules and Macromolecules

    No full text
    Most biomolecular modeling energy functions for structure prediction, sequence design, and molecular docking have been parametrized using existing macromolecular structural data; this contrasts with molecular mechanics force fields, which are largely optimized using small-molecule data. In this study, we describe an integrated method that enables optimization of a biomolecular modeling energy function simultaneously against small-molecule thermodynamic data and high-resolution macromolecular structural data. We use this approach to develop a next-generation Rosetta energy function that utilizes a new anisotropic implicit solvation model, and an improved electrostatics and Lennard-Jones model, illustrating how energy functions can be considerably improved in their ability to describe large-scale energy landscapes by incorporating both small-molecule and macromolecule data. The energy function improves performance in a wide range of protein structure prediction challenges, including monomeric structure prediction, protein–protein and protein–ligand docking, protein sequence design, and prediction of the free energy changes by mutation, while reasonably recapitulating small-molecule thermodynamic properties.