4 research outputs found

    A Study on Methods for Predicting Water Molecule Positions near Protein Structures (단백질 구조 인근의 물 분자 위치 예측 방법에 대한 연구)

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Natural Sciences, Department of Chemistry, August 2022. 석차옥 (Chaok Seok).
    Most proteins in the living cell function in an aqueous solution, and protein molecules interact closely with water molecules. These interactions play critical roles in determining the structure and physiological function of proteins. Methods for predicting the structure or interactions of proteins consider the interaction between protein and water either implicitly or explicitly. Typical implicit solvent models treat the solvent as a continuous dielectric medium. Such models can evaluate the important electrostatic interactions at a much lower computational cost than simulating proteins in explicit water by molecular dynamics simulation. For this reason, implicit water models, rather than molecular dynamics simulations, are employed for protein structure prediction and docking. However, implicit models do not consider specific, short-range, orientation-dependent hydrogen bonds between water and protein molecules. Specific hydrogen bond interactions with water molecules are known to be involved in the structure and function of some proteins. It is therefore essential to consider such water molecules explicitly for a detailed description and accurate prediction of protein structure and function, even in the framework of implicit solvent models. 3D-RISM is an elegant statistical mechanical method that can predict essential water molecules making specific interactions with a given protein structure using a molecular mechanics force field and an integral equation theory. In this thesis, two methods for predicting water positions on a given protein structure are introduced. The first method is based on a new statistical potential that describes interactions between protein atoms and water molecules. The potential was derived from protein structures experimentally resolved with water molecules.
    A crucial feature that distinguishes this potential from conventional potentials is that the solvation environment of protein atoms is taken into account during the statistical derivation. This method is about 180 times faster than the method based on 3D-RISM and has similar or higher performance. Further performance improvement was achieved by adopting a machine learning approach: a convolutional neural network (CNN) was trained on experimentally resolved structures to recognize structural patterns that favor water binding on protein surfaces. This method is about 44 times faster than 3D-RISM when a GPGPU is used. Furthermore, its performance in locating water molecules at protein-protein interfaces and protein-ligand binding sites is also improved compared to other existing methods.
    Korean abstract (translated): Most proteins in living organisms exist in aqueous solution, and protein molecules interact extensively with water molecules. These interactions play an important role in the structure and function of proteins. Methods for predicting protein structure and function therefore account for protein-water interactions either directly or indirectly. Indirect methods treat water as a kind of dielectric. Because they do not need to consider the position of each water molecule, their computational cost is relatively low, and they can model the electrostatic interactions that make up a large part of protein-water interactions. However, they have difficulty modeling short-range interactions such as hydrogen bonds between water and protein, which can vary greatly with the position of the water molecule.
    In particular, because short-range interactions between water and protein molecules affect protein function, it can be important for function prediction methods to predict the positions of water molecules that are likely to interact with the protein at short range, along with their interactions with the protein. To account for these short-range interactions, the positions of water molecules are represented directly when modeling the water-protein interaction, mainly through molecular dynamics simulation or 3D-RISM. These methods can model water-protein interactions in more detail, but they are computationally expensive, and they also do not predict well the positions of protein-bound waters, which contribute substantially to the protein-water interaction. This thesis therefore presents two methods for predicting the positions of water molecules around proteins. The first predicts water positions around a protein using a statistics-based potential function between water and protein that takes into account the solvation state of the atoms making up the protein. On average this method was 180 times faster than the 3D-RISM method, and its performance in predicting the positions of protein-bound water molecules was similar to or higher than that of 3D-RISM. However, its prediction performance was limited because potential wells also formed between water molecules and protein atoms that do not participate directly in hydrogen bonding.
    For this reason, a water-position prediction method was built using a convolutional neural network that can recognize structural patterns of proteins capable of accommodating water molecules, and it showed higher prediction performance than the method based on the statistical potential function. With a GPGPU, this method was 44 times faster than the 3D-RISM-based method, and even on CPU alone it showed a 58% speed improvement. In terms of prediction performance, when three times as many water positions were predicted as there are water molecules in the crystal structure of the protein, the probability that a predicted position lay within 1 Å of a water position in the crystal structure was 75% or higher. The methods presented in this thesis allow the positions of water around proteins to be predicted more accurately. Furthermore, taking the positions of protein-bound water molecules into account is expected to improve protein-ligand docking.
    Table of contents:
    1. INTRODUCTION
    2. Prediction of Water Positions on Protein Structure Using wKGB Statistical Potential
        2.1. Methods
            2.1.1. Derivation of wKGB potential
            2.1.2. Prediction of bound water positions with wKGB
        2.2. Results and Discussion
            2.2.1. Characteristics of wKGB potential
            2.2.2. Results of water site prediction
    3. Prediction of Water Positions on Protein Structure using 3D-CNN
        3.1. Methods
            3.1.1. Overview of the overall method
            3.1.2. The CNN architecture
            3.1.3. Training of the neural network
                3.1.3.1. Training set proteins and complexes
                3.1.3.2. Training method
            3.1.4. Placement of water molecules from the water map
            3.1.5. Methods for performance comparison
                3.1.5.1. Evaluation measures
                3.1.5.2. Test sets
                3.1.5.3. Running other methods for comparison
        3.2. Results and Discussion
            3.2.1. Results of network training
            3.2.2. Results on the single-protein test set
            3.2.3. Results on the protein-protein complex test set
            3.2.4. Result on the protein-compound complex set
    4. CONCLUSION
    SUPPLEMENTARY INFORMATION
    BIBLIOGRAPHY
    Abstract in Korean
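
    The 1 Å criterion quoted in the Korean abstract suggests a distance-cutoff score for predicted water sites. The following is a minimal Python sketch of such a score, not the thesis's actual evaluation code. The function names, the toy coordinates, and the idea of ranking sites by a generic score before truncating to a multiple of the crystal water count are all assumptions made for illustration.

```python
import math

def top_ranked(scored_sites, n_reference, multiple=3):
    """Keep the highest-scoring predicted sites, at most `multiple` times
    the number of reference (crystallographic) waters.

    `scored_sites` is a list of (score, (x, y, z)) pairs; higher score = better.
    """
    ranked = sorted(scored_sites, key=lambda s: s[0], reverse=True)
    return [xyz for _, xyz in ranked[: multiple * n_reference]]

def fraction_within_cutoff(predicted, reference, cutoff=1.0):
    """Fraction of predicted sites lying within `cutoff` angstroms of any
    reference water position."""
    if not predicted:
        return 0.0
    hits = sum(
        1 for p in predicted
        if any(math.dist(p, r) <= cutoff for r in reference)
    )
    return hits / len(predicted)

# Toy data (hypothetical coordinates in angstroms).
crystal_waters = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
scored = [
    (0.9, (0.3, 0.0, 0.0)),
    (0.8, (5.0, 0.5, 0.0)),
    (0.4, (9.0, 9.0, 9.0)),
    (0.2, (2.5, 0.0, 0.0)),
]

kept = top_ranked(scored, n_reference=len(crystal_waters), multiple=1)
print(fraction_within_cutoff(kept, crystal_waters))
```

    Raising `multiple` admits lower-ranked sites, which typically lowers this fraction while increasing the chance that every crystal water has a nearby prediction, so any reported percentage depends on how many sites are predicted.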

    Computational Modeling of (De)-Solvation Effects and Protein Flexibility in Protein-Ligand Binding using Molecular Dynamics Simulations

    Water is a crucial participant in virtually all cellular functions. Water molecules in the binding site contribute significantly to the strength of intermolecular interactions in the aqueous phase by mediating protein-ligand interactions and by solvating and desolvating both ligand and protein upon protein-ligand dissociation and association. Many recently published studies use water distributions in the binding site to retrospectively explain and rationalize unexpected trends in structure-activity relationships (SAR). However, traditional approaches cannot quantitatively predict the thermodynamic properties of water molecules in binding sites or their associated contribution to the binding free energy of a ligand. We have developed and validated a computational method named WATsite that exploits high-resolution solvation maps and thermodynamic profiles to elucidate the water molecules' potential contribution to protein-ligand and protein-protein binding. We have also demonstrated the utility of WATsite in helping to direct medicinal chemistry efforts by using explicit water desolvation. In addition, the ligand-binding process typically involves protein conformational change, which may completely change the positions and thermodynamic properties of the water molecules in the binding site before or upon ligand binding. We have shown the interplay between protein flexibility and solvent reorganization, and we provide a quantitative estimate of the influence of protein flexibility on desolvation free energy and, therefore, on protein-ligand binding. Different ligands binding to the same target protein can induce different conformational adaptations. In order to apply WATsite to an ensemble of different protein conformations, a more efficient implementation of WATsite based on GPU acceleration and system truncation has been developed.
    Lastly, by extending the simulation protocol from pure water to mixed water-organic probe simulations, accurate modeling of halogen atom-protein interactions has been achieved.
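
    To make the idea of a per-site thermodynamic profile concrete, here is a hedged Python sketch. It assumes each hydration site carries an enthalpy term and an entropic term relative to bulk water, and that the contribution of a ligand is approximated by summing the free energies of the sites it displaces. The additive scheme, the 1 Å displacement cutoff, the class and function names, and all numbers are illustrative assumptions; none of them is taken from the abstract or from the WATsite code.

```python
import math
from dataclasses import dataclass

@dataclass
class HydrationSite:
    xyz: tuple               # site centre in angstroms (hypothetical)
    delta_h: float           # enthalpy relative to bulk water, kcal/mol
    minus_t_delta_s: float   # entropic term -T*dS relative to bulk, kcal/mol

    @property
    def delta_g(self):
        # Standard relation: dG = dH - T*dS
        return self.delta_h + self.minus_t_delta_s

def displaced_sites(sites, ligand_atoms, cutoff=1.0):
    """Sites whose centre lies within `cutoff` angstroms of any ligand atom."""
    return [
        s for s in sites
        if any(math.dist(s.xyz, a) <= cutoff for a in ligand_atoms)
    ]

def summed_site_free_energy(sites, ligand_atoms, cutoff=1.0):
    """Sum of dG over displaced sites (assumed additive, for illustration only)."""
    return sum(s.delta_g for s in displaced_sites(sites, ligand_atoms, cutoff))

# Toy example with made-up values.
sites = [
    HydrationSite((0.0, 0.0, 0.0), delta_h=1.5, minus_t_delta_s=1.0),
    HydrationSite((4.0, 0.0, 0.0), delta_h=-2.0, minus_t_delta_s=0.5),
    HydrationSite((10.0, 0.0, 0.0), delta_h=0.5, minus_t_delta_s=0.5),
]
ligand_atoms = [(0.5, 0.0, 0.0), (4.0, 0.8, 0.0)]

print(summed_site_free_energy(sites, ligand_atoms))
```

    Under this sign convention a positive sum means the displaced waters were less stable than bulk water, so releasing them would favor binding. Other conventions are equally possible.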

    Analysis of Factors Influencing Hydration Site Prediction Based on Molecular Dynamics Simulations

    Water contributes significantly to the binding of small molecules to proteins in biochemical systems. Molecular dynamics (MD) simulation-based programs such as WaterMap and WATsite have been used to probe the locations and thermodynamic properties of hydration sites at the surface or in the binding site of proteins, generating important information for structure-based drug design. However, questions remain about the influence of the simulation protocol on hydration site analysis. In this study, we use WATsite to investigate the influence of factors such as simulation length and variations in initial protein conformations on hydration site prediction. We find that a 4 ns MD simulation is appropriate for obtaining a reliable prediction of the locations and thermodynamic properties of hydration sites. In addition, hydration site prediction can be strongly affected by the initial protein conformations used for the MD simulations. Here, we provide a first quantification of this effect and further indicate that similar conformations of binding site residues (RMSD < 0.5 Å) are required to obtain consistent hydration site predictions.
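
    The RMSD < 0.5 Å condition can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the two conformations are already superimposed, their atoms are listed in the same order, and the function names and coordinates are hypothetical. The abstract does not say how the study aligned structures or which atoms it included.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) coordinates, without any superposition step."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and the same length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

def similar_binding_site(coords_a, coords_b, threshold=0.5):
    """True if the binding-site RMSD is below `threshold` angstroms."""
    return rmsd(coords_a, coords_b) < threshold

# Toy binding-site atoms (angstroms).
conf_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
conf_b = [(0.1, 0.0, 0.0), (1.0, 0.1, 0.0), (0.0, 1.0, 0.1)]
conf_c = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 1.0, 0.0)]

print(similar_binding_site(conf_a, conf_b))
print(similar_binding_site(conf_a, conf_c))
```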