
16 research outputs found

GalaxyGPCRloop: Template-Based and Ab Initio Structure Sampling of the Extracellular Loops of G‑Protein-Coupled Receptors

Author: Chaok Seok (665007)
Gyu Rie Lee (665005)
Hahnbeom Park (665004)
Jonghun Won (5346785)

The second extracellular loops (ECL2s) of G-protein-coupled receptors (GPCRs) are often involved in GPCR functions, and their structures have important implications in drug discovery. However, structure prediction of ECL2 is difficult because of its long length and the structural diversity among different GPCRs. In this study, a new ECL2 conformational sampling method involving both template-based and ab initio sampling was developed. Inspired by the observation of similar ECL2 structures of closely related GPCRs, a template-based sampling method employing loop structure templates selected from the structure database was developed. A new metric for evaluating similarity of the target loop to templates was introduced for template selection. An ab initio loop sampling method was also developed to treat cases without highly similar templates. The ab initio method is based on the previously developed fragment assembly and loop closure method. A new sampling component that takes advantage of secondary structure prediction was added. In addition, a conserved disulfide bridge restraining ECL2 conformation was predicted and analytically incorporated into sampling, reducing the effective dimension of the conformational search space. The sampling method was combined with an existing energy function for comparison with previously reported loop structure prediction methods, and the benchmark test demonstrated outstanding performance.

FigShare

Protein Loop Modeling Using a New Hybrid Energy Function and Its Application to Modeling in Inaccurate Structural Environments

Author: Chaok Seok (665007)
Gyu Rie Lee (665005)
Hahnbeom Park (665004)
Lim Heo (665006)
Publication date: 24/11/2014

Protein loop modeling is a tool for predicting protein local structures of particular interest, providing opportunities for applications involving protein structure prediction and de novo protein design. Until recently, the majority of loop modeling methods have been developed and tested by reconstructing loops in frameworks of experimentally resolved structures. In many practical applications, however, the protein loops to be modeled are located in inaccurate structural environments. These include loops in model structures, low-resolution experimental structures, or experimental structures of different functional forms. Accordingly, discrepancies in the accuracy of the structural environment assumed in development of the method and that in practical applications present additional challenges to modern loop modeling methods. This study demonstrates a new strategy for employing a hybrid energy function combining physics-based and knowledge-based components to help tackle this challenge. The hybrid energy function is designed to combine the strengths of each energy component, simultaneously maintaining accurate loop structure prediction in a high-resolution framework structure and tolerating minor environmental errors in low-resolution structures. A loop modeling method based on global optimization of this new energy function is tested on loop targets situated in different levels of environmental errors, ranging from experimental structures to structures perturbed in backbone as well as side chains and template-based model structures. The new method performs comparably to force field-based approaches in loop reconstruction in crystal structures and better in loop prediction in inaccurate framework structures. This result suggests that higher-accuracy predictions would be possible for a broader range of applications.
The web server for this method is available at http://galaxy.seoklab.org/loop with the PS2 option for the scoring function.

Directory of Open Access Journals

PubMed Central

FigShare

Examples of loops modeled in inaccurate environmental structures.

Author: Chaok Seok (665007)
Gyu Rie Lee (665005)
Hahnbeom Park (665004)
Lim Heo (665006)

In all panels, the crystal structures are colored in green and the models in magenta. Framework structures are shown transparent for clarity. (A) Two examples of tolerating errors in surrounding side chains, 1oyc (left; RMSD = 0.4 Å) and 1c5e (right; RMSD = 0.5 Å). The loop-framework salt bridges in the crystal structures are indicated with black dotted lines. High-accuracy modeling is possible even though the salt bridges cannot be recovered owing to the perturbed arginine orientations in the framework. (B) An example of unsuccessful modeling in the framework of perturbed side chains, 1oth (RMSD = 2.3 Å), showing the necessity of additional sampling. The perturbed Arg66 and Tyr345 side chains (magenta) would clash with the two leucine residues in the loop if the crystal loop structure were to be placed. (C) Two examples of tolerating additional backbone errors, 1my7 (left; RMSD = 1.0 Å) and 1cb0 (right; RMSD = 0.9 Å). The overall backbone trace and key side-chain interactions are well reproduced.

FigShare

A successful example of loop modeling in the framework of a template-based model.

Author: Chaok Seok (665007)
Gyu Rie Lee (665005)
Hahnbeom Park (665004)
Lim Heo (665006)

The crystal structure is colored in green and the model in magenta (1avk, RMSD = 1.5 Å). Framework structures are shown transparent for clarity. Loops of three templates (used for template-based modeling) are shown with yellow transparent ribbons for comparison.

FigShare

Distributions of environmental errors for the three types of test sets employed in the study.

Author: Chaok Seok (665007)
Gyu Rie Lee (665005)
Hahnbeom Park (665004)
Lim Heo (665006)

(A) for the test set of crystal structures with perturbed side chains, (B) for the crystal structures with both backbone and side chains perturbed, and (C) for the template-based models. The gray curve behind the histogram represents an interpolation. The average E-RMSD values are 0.9 Å, 2.1 Å, and 2.8 Å for the side chain-perturbed set (A), the backbone-perturbed set (B), and the template-based model set (C), respectively. E-RMSD represents the all-atom RMSD of environment residues for which any atoms are within 10 Å from any loop Cβ atoms.
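The E-RMSD definition in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `e_rmsd`, the `(residue_id, atom_name, (x, y, z))` atom layout, the choice to define the environment on the reference structure, the exclusion of loop residues from the environment, and the assumption that both structures are already superimposed are all illustrative choices that the caption does not specify.

```python
import math

def e_rmsd(ref_atoms, model_atoms, loop_res_ids, cutoff=10.0):
    """All-atom RMSD over environment residues (hypothetical helper).

    ref_atoms / model_atoms: lists of (residue_id, atom_name, (x, y, z)).
    Assumes both structures share one coordinate frame (no superposition here).
    """
    loop = set(loop_res_ids)
    # Cβ atoms of the loop; glycine has no Cβ and simply contributes none.
    loop_cb = [xyz for res, name, xyz in ref_atoms if res in loop and name == "CB"]
    # A non-loop residue belongs to the environment if ANY of its atoms
    # lies within the cutoff of ANY loop Cβ atom.
    env = {
        res
        for res, name, xyz in ref_atoms
        if res not in loop and any(math.dist(xyz, cb) <= cutoff for cb in loop_cb)
    }
    model_xyz = {(res, name): xyz for res, name, xyz in model_atoms}
    # Squared deviations over every atom of the environment residues
    # that is present in both structures.
    sq_devs = [
        math.dist(xyz, model_xyz[(res, name)]) ** 2
        for res, name, xyz in ref_atoms
        if res in env and (res, name) in model_xyz
    ]
    if not sq_devs:
        raise ValueError("no environment atoms found within the cutoff")
    return math.sqrt(sum(sq_devs) / len(sq_devs))
```

Note that a residue is selected as a whole: once one of its atoms falls inside the 10 Å cutoff, all of its atoms enter the RMSD, including those farther away.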

FigShare

Sampling results of GalaxyLoop-PS2 on the three test sets.

Author: Chaok Seok (665007)
Gyu Rie Lee (665005)
Hahnbeom Park (665004)
Lim Heo (665006)

1) Number of loop targets for which at least one structure among the 30 loop conformations (or 50 conformations for 12-residue loops) in the final CSA bank is within a given RMSD value.
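The counting rule in this footnote can be sketched in a few lines of Python. This is an illustration only: the function name `count_sampled_within`, the dictionary layout, and the use of an inclusive comparison (`<=`) for "within" are assumptions, not details given in the source.

```python
def count_sampled_within(bank_rmsds, thresholds):
    """Count targets whose bank holds at least one conformation within each cutoff.

    bank_rmsds: dict mapping a target name to the loop RMSD values (in Å)
                of the conformations in its final bank (hypothetical layout).
    thresholds: iterable of RMSD cutoffs in Å.
    """
    counts = {}
    for cutoff in thresholds:
        # A target counts if its best (lowest-RMSD) bank member is within the cutoff.
        counts[cutoff] = sum(
            1 for rmsds in bank_rmsds.values() if rmsds and min(rmsds) <= cutoff
        )
    return counts
```

This measures sampling quality rather than scoring quality: a target is counted whenever a near-native conformation exists anywhere in the bank, whether or not the energy function ranks it first.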

FigShare

Comparison of loop modeling results on the test set of template-based models.

Author: Chaok Seok (665007)
Gyu Rie Lee (665005)
Hahnbeom Park (665004)
Lim Heo (665006)

The average RMSD and its standard deviation are reported in Å. The loop RMSD is calculated as the root-mean-square deviation of the main-chain atoms N, Cα, C, and O.
1) Loop conformations generated by MODELLER [30].
2) Loop conformations generated by loop refinement using ModLoop of MODELLER [1], [27].
3) Results of the best-score models sampled by next-generation KIC (NGK) using the protocol provided by Stein et al. [18]. 500 models were generated for each target, as in Stein et al. The Rosetta program v3.5 was used.
4) Loop set constructed in this study. See Table S7 for the list of loops.
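Written out, the loop RMSD described in this caption takes the following standard form. The symbols are notation introduced here for illustration: L is the number of loop residues and r denotes an atom position in the model or crystal structure. The caption does not state how the two structures are superimposed before the calculation, so that step is left out.

```latex
\mathrm{RMSD}_{\mathrm{loop}}
  = \sqrt{\frac{1}{4L}\sum_{i=1}^{L}\;
    \sum_{a \in \{\mathrm{N},\,\mathrm{C}_{\alpha},\,\mathrm{C},\,\mathrm{O}\}}
    \left\lVert \mathbf{r}^{\mathrm{model}}_{i,a} - \mathbf{r}^{\mathrm{crystal}}_{i,a} \right\rVert^{2}}
```

The factor 4L is the total number of atoms compared, four main-chain atoms for each of the L loop residues.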

FigShare

Protein Ensemble Generation Through Variational Autoencoder Latent Space Sampling

Author: David Baker (11642)
Gyu Rie Lee (9252371)
Hahnbeom Park (665004)
Minkyung Baek (6115892)
Sanaa Mansoor (5175041)
Publication date: 09/04/2024

Mapping the ensemble of protein conformations that contribute to function and can be targeted by small molecule drugs remains an outstanding challenge. Here, we explore the use of variational autoencoders for reducing the challenge of dimensionality in the protein structure ensemble generation problem. We convert high-dimensional protein structural data into a continuous, low-dimensional representation, carry out a search in this space guided by a structure quality metric, and then use RoseTTAFold guided by the sampled structural information to generate 3D structures. We use this approach to generate ensembles for the cancer relevant protein K-Ras, train the VAE on a subset of the available K-Ras crystal structures and MD simulation snapshots, and assess the extent of sampling close to crystal structures withheld from training. We find that our latent space sampling procedure rapidly generates ensembles with high structural quality and is able to sample within 1 Å of held-out crystal structures, with a consistency higher than that of MD simulation or AlphaFold2 prediction. The sampled structures sufficiently recapitulate the cryptic pockets in the held-out K-Ras structures to allow for small molecule docking.

FigShare

Comparison of loop modeling results by the average RMSD of main chain atoms (N, Cα, C, and O) of loops in angstroms (Å) on test sets of varying environmental accuracies measured by E-RMSD.

Author: Chaok Seok (665007)
Gyu Rie Lee (665005)
Hahnbeom Park (665004)
Lim Heo (665006)

Standard deviations are also reported.
1) Loop sampling methods sample only the loop region, while extended sampling methods sample surrounding side chains in addition to the loop.
2) Taken from Sellers et al. [15].
3) Taken from Mandell et al. [16].
4) Results of the best-score models out of 500 models sampled for each target following the protocol provided by Stein et al. [18] with Rosetta v3.5. The results for the crystal structure set and the side chain-perturbed set are the same for NGK because extended sampling of loop environment was used for both sets.
5) Loop sets taken from Jacobson et al. [10]. See Tables S1 and S2 for the list of loops.
6) Loop sets from Zhu et al. [34]. See Tables S1 and S2 for the list of loops.
7) Loop set from Fiser et al. [1]. See Table S3 for the list of loops.

FigShare

Simultaneous Optimization of Biomolecular Energy Functions on Features from Small Molecules and Macromolecules

Author: David Baker (11642)
David E. Kim (2723062)
Frank DiMaio (216772)
Hahnbeom Park (665004)
Per Greisen (1849972)
Philip Bradley (134084)
Vikram Khipple Mulligan (3333285)
Yuan Liu (88411)

Most biomolecular modeling energy functions for structure prediction, sequence design, and molecular docking have been parametrized using existing macromolecular structural data; this contrasts with molecular mechanics force fields, which are largely optimized using small-molecule data. In this study, we describe an integrated method that enables optimization of a biomolecular modeling energy function simultaneously against small-molecule thermodynamic data and high-resolution macromolecular structural data. We use this approach to develop a next-generation Rosetta energy function that utilizes a new anisotropic implicit solvation model, and an improved electrostatics and Lennard-Jones model, illustrating how energy functions can be considerably improved in their ability to describe large-scale energy landscapes by incorporating both small-molecule and macromolecule data. The energy function improves performance in a wide range of protein structure prediction challenges, including monomeric structure prediction, protein–protein and protein–ligand docking, protein sequence design, and prediction of the free energy changes by mutation, while reasonably recapitulating small-molecule thermodynamic properties.

FigShare
