634 research outputs found

    DrugScorePPI webserver: fast and accurate in silico alanine scanning for scoring protein–protein interactions

    Protein–protein complexes play key roles in all cellular signal transduction processes. We have developed a fast and accurate computational approach to predict changes in the binding free energy upon alanine mutations in protein–protein interfaces. The approach is based on a knowledge-based scoring function, DrugScorePPI, for which pair potentials were derived from 851 complex structures and adapted against 309 experimental alanine scanning results. Based on this approach, we developed the DrugScorePPI webserver. The input consists of a protein–protein complex structure; the output is a summary table and bar plot of binding free energy differences for wild-type residue-to-Ala mutations. The results of the analysis are mapped onto the protein–protein complex structure and visualized using Jmol. A single interface can be analyzed within a few minutes. Our approach has been successfully validated by application to an external test set of 22 alanine mutations in the interface of Ras/RalGDS. The DrugScorePPI webserver is primarily intended for identifying hotspot residues in protein–protein interfaces, which provides valuable information for guiding biological experiments and for developing protein–protein interaction modulators. The DrugScorePPI webserver, accessible at http://cpclab.uni-duesseldorf.de/dsppi, is free and open to all users with no login requirement.
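
    The per-residue table of binding free energy differences described above lends itself to a simple post-processing step for flagging hotspot candidates. The sketch below is not the DrugScorePPI implementation: the function name `find_hotspots`, the residue labels, the ΔΔG values, and the 2.0 kcal/mol cutoff are illustrative assumptions (the abstract states no threshold), and it assumes the sign convention in which a positive ΔΔG means the alanine mutation weakens binding.

```python
def find_hotspots(ddg_by_residue, threshold=2.0):
    """Return (residue, ddG) pairs at or above the cutoff, largest first.

    ddg_by_residue maps a residue label to its predicted change in binding
    free energy (kcal/mol) on mutation to alanine. The default cutoff is an
    assumed convention, not a value taken from the abstract.
    """
    hits = [(res, ddg) for res, ddg in ddg_by_residue.items() if ddg >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical scan results for three interface residues.
example_scan = {"A:TYR32": 2.5, "A:ASP33": 0.4, "B:ARG20": 3.1}
hotspots = find_hotspots(example_scan)
print(hotspots)
```

    Keeping the cutoff as a parameter makes it easy to test how sensitive the hotspot list is to that choice.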

    Rigorous assessment and integration of the sequence and structure based features to predict hot spots

    Background: Systematic mutagenesis studies have shown that only a few interface residues, termed hot spots, contribute significantly to the binding free energy of protein-protein interactions. Hot spot prediction is therefore increasingly important for understanding the essence of protein interactions and for narrowing the search space in drug design. Many computational methods have been developed, each proposing different features. However, a comparative assessment of these features, as well as more effective and accurate methods, is still urgently needed. Results: In this study, we first comprehensively collect features that discriminate hot spots from non-hot spots and analyze their distributions. We find that hot spots have lower relASA and a larger relative change in ASA, suggesting that hot spots tend to be protected from bulk solvent. In addition, hot spots have more contacts, including hydrogen bonds, salt bridges, and atomic contacts, which favor complex formation. Interestingly, conservation score and sequence entropy are not significantly different between hot spots and non-hot spots in the Ab+ dataset (all complexes), whereas in the Ab- dataset (antigen-antibody complexes excluded) these two features differ significantly between hot spots and non-hot spots. Second, we explore the predictive ability of each feature and of feature combinations using support vector machines (SVMs). The results indicate that the sequence-based feature outperforms other combinations of features with reasonable accuracy, with a precision of 0.69, a recall of 0.68, an F1 score of 0.68, and an AUC of 0.68 on an independent test set. Compared with other machine learning methods and two energy-based approaches, our approach achieves the best performance. Moreover, we demonstrate the applicability of our method by predicting hot spots in two protein complexes. Conclusion: Experimental results show that support vector machine classifiers are quite effective in predicting hot spots based on sequence features. Hot spots cannot be fully predicted through simple analysis of physicochemical characteristics, but there is reason to believe that integrating features with machine learning methods can remarkably improve the predictive performance for hot spots.
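
    The reported precision, recall, and F1 score are related by a fixed formula, which makes the figures easy to cross-check. The sketch below is a minimal illustration of that relation, not the authors' evaluation code; the function names and the confusion-matrix counts in the second example are hypothetical.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive and
    false-negative counts; returns 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Cross-check of the figures quoted in the abstract.
print(round(f1_score(0.69, 0.68), 2))  # → 0.68

# Hypothetical counts: 30 hot spots found, 10 false alarms, 20 missed.
p, r = precision_recall(tp=30, fp=10, fn=20)
print(p, r, round(f1_score(p, r), 3))
```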

    Molecular dynamics study of SARS-CoV-2 neutralizing antibodies and kynureninase

    Thesis (Master's) -- Seoul National University Graduate School: College of Natural Sciences, Department of Chemistry, August 2023. Chaok Seok.

    The dynamic role of proteins in vivo is increasingly recognized as important in the field of therapeutics. Therapeutic proteins possess specific functions that contribute to disease alleviation when they interact with specific disease target molecules: invoking immune responses, catalyzing biochemical reactions, transporting molecules, and assembling into membranes without interfering with other biological pathways. However, limited understanding of protein-protein and protein-ligand interactions still hinders the effective development of protein therapeutics. In this thesis, molecular interactions occurring in functional proteins, such as antibodies and enzymes, are investigated with a focus on binding thermodynamics and kinetics, respectively. This is achieved through atomic-level molecular dynamics simulations and statistical analysis of residue-wise binding free energies and residence times. First, the simultaneous formation of multiple contacts between antibodies and their target protein (SARS-CoV-2 RBD) was observed to contribute to favorable binding affinity. Second, π-interactions, facilitated by hydrogen bonds formed between residues in the enzyme and its substrate, were found to stabilize the binding pose and to improve binding kinetics upon substrate binding, resulting in longer residence times. These studies provide an enhanced understanding of detailed atomic contributions to the molecular interactions of therapeutic proteins and hence new strategies for the improved design of protein therapeutics.

    Table of contents:
        Abstract
        Table of contents
        List of figures
        List of tables
        1. Introduction
        2. Atomic-level thermodynamic analysis of the binding free energy of SARS-CoV-2 neutralizing antibodies
            2.1. Introduction
            2.2. Methods
                2.2.1. System preparation
                2.2.2. Molecular dynamics simulations
                2.2.3. Effective binding energy calculations
            2.3. Results and Discussion
                2.3.1. Overall trends in effective binding energy
                2.3.2. Connection between effective binding energy and molecular interactions
                2.3.3. Why simultaneous multiple interactions are thermodynamically crucial?
            2.4. Conclusions
        3. Substrate Selectivity of Human and Pseudomonas Kynureninase: Mechanistic Insight from Molecular Dynamics Simulation
            3.1. Introduction
            3.2. Methods
                3.2.1. System preparation
                3.2.2. Molecular dynamics simulations
                3.2.3. Trajectory analysis
            3.3. Results and Discussion
                3.3.1. Hydrogen bond analysis of the human and Pseudomonas kynureninase complexes
                3.3.2. Impact of hydrogen bonds for mediating π-interactions
                3.3.3. Role of residence time in different catalytic activities of human and Pseudomonas kynureninases
            3.4. Conclusions
        4. Conclusion
        Bibliography
        Abstract in Korean

    SKEMPI 2.0: an updated benchmark of changes in protein–protein binding energy, kinetics and thermodynamics upon mutation

    Motivation: Understanding the relationship between the sequence, structure, binding energy, binding kinetics and binding thermodynamics of protein–protein interactions is crucial to understanding cellular signaling, the assembly and regulation of molecular complexes, the mechanisms through which mutations lead to disease, and protein engineering. Results: We present SKEMPI 2.0, a major update to our database of binding free energy changes upon mutation for structurally resolved protein–protein interactions. This version now contains manually curated binding data for 7085 mutations, an increase of 133%, including changes in kinetics for 1844 mutations, enthalpy and entropy changes for 443 mutations, and 440 mutations that abolish detectable binding. This work has been supported by the European Molecular Biology Laboratory [I.H.M.]; Biotechnology and Biological Sciences Research Council [Future Leader Fellowship BB/N011600/1 to I.H.M.]; Spanish Ministry of Economy and Competitiveness (MINECO) [BIO2016-79930-R to J.F.R.]; Interreg POCTEFA [EFA086/15 to J.F.R.]; European Commission [H2020 grant 676556 (MuG)].
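
    Binding free energy changes upon mutation are commonly derived from measured dissociation constants through the standard relation ΔΔG = RT ln(Kd_mut / Kd_wt). The sketch below illustrates that relation only; the function name, the default temperature of 298.15 K, and the example Kd values are assumptions for illustration and are not taken from SKEMPI.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_kd(kd_wt, kd_mut, temperature=298.15):
    """Change in binding free energy (kcal/mol) on mutation.

    Both dissociation constants must use the same units. A positive
    result means the mutant binds more weakly than the wild type.
    """
    if kd_wt <= 0 or kd_mut <= 0:
        raise ValueError("dissociation constants must be positive")
    return GAS_CONSTANT_KCAL * temperature * math.log(kd_mut / kd_wt)


# Hypothetical example: a mutation that weakens binding tenfold (1 nM to 10 nM).
print(round(ddg_from_kd(1e-9, 1e-8), 2))  # → 1.36
```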

    Using machine-learning-driven approaches to boost hot-spot's knowledge

    Understanding protein–protein interactions (PPIs) is fundamental to describing and characterizing the formation of biomolecular assemblies, and to establishing the energetic principles underlying biological networks. One key aspect of these interfaces is the existence and prevalence of hot-spot (HS) residues that, upon mutation to alanine, negatively impact the formation of such protein–protein complexes. HS have been widely considered in research, both in case studies and in a few large-scale predictive approaches. This review aims to present the current knowledge on PPIs, providing a detailed understanding of the microspecifications of the residues involved in those interactions and the characteristics of those defined as HS through a thorough assessment of related field-specific methodologies. We explore recent accurate artificial intelligence-based techniques, which are progressively replacing well-established classical energy-based methodologies. This article is categorized under: Data Science > Databases and Expert Systems; Structure and Mechanism > Computational Biochemistry and Biophysics; Molecular and Statistical Mechanics > Molecular Interactions.

    Predicting structural and energetic effects of mutations at protein-protein interfaces

    Understanding the structural, dynamical and energetic basis of protein-protein interactions (PPIs) is key for a number of research disciplines. Predicting which sites in PPIs show potential for modulation with binding free energy (ΔG) calculations allows experimental work to be targeted and inhibitors to be rationally designed. However, PPIs remain a challenging target for computational free energy calculations due to their large and complex interfaces. A number of different methods for predicting ΔG from molecular dynamics simulations have been developed, yet each suffers from unique problems in its potential for widespread implementation across PPIs. This thesis initially evaluates the efficacy of the existing MM-PB/GBSA free energy calculation techniques and notes a niche for the improvement of the methods’ predictive power. This is followed by the development of a new computational method for predicting the effects of any PPI interface mutation, which we term Mutational Locally Enhanced Sampling (MULES). MULES generates atomistic molecular dynamics trajectories of native and mutant protein complexes simultaneously. These trajectories are then used to calculate relative binding free energies (ΔΔG) between the two complexes, investigating both structural and energetic effects of individual amino acids at an interface. In principle MULES allows the effect of any mutation to be calculated. Initially tested against a prototypical set of mutations with experimentally measured ΔΔG, MULES showed significantly improved accuracy in ΔΔG prediction and high precision and speed compared to existing methods. The approach was further validated on a large and diverse dataset of approximately 60 individual mutations, comparing results to experimental data and other computational predictions. Validation provided additional evidence for the improved accuracy, precision, speed and particularly versatility of the technique, but also identified areas for improvement. 
The successes and limitations of MULES discovered here will be of interest to the protein design, drug discovery and computational chemical biology communities.

    APIS: accurate prediction of hot spots in protein interfaces by combining protrusion index with solvent accessibility

    Background: It is well known that most of the binding free energy of protein interactions is contributed by a few key hot spot residues. These residues are crucial for understanding the function of proteins and studying their interactions. Experimental hot spot detection methods such as alanine scanning mutagenesis are not applicable on a large scale since they are time-consuming and expensive. Therefore, reliable and efficient computational methods for identifying hot spots are greatly desired and urgently required. Results: In this work, we introduce an efficient approach that uses a support vector machine (SVM) to predict hot spot residues in protein interfaces. We systematically investigate a wide variety of 62 features from a combination of protein sequence and structure information. Then, to remove redundant and irrelevant features and improve the prediction performance, feature selection is employed using the F-score method. Based on the selected features, nine individual-feature based predictors are developed to identify hot spots using SVMs. Furthermore, a new ensemble classifier, namely APIS (A combined model based on Protrusion Index and Solvent accessibility), is developed to further improve the prediction accuracy. The results on two benchmark datasets, ASEdb and BID, show that the proposed method yields significantly better prediction accuracy than those previously published in the literature. In addition, we demonstrate the predictive power of our proposed method by modelling two protein complexes: the calmodulin/myosin light chain kinase complex and the heat shock locus gene products U and V complex. The results indicate that our method can identify more hot spots in these two complexes than other state-of-the-art methods. Conclusion: We have developed an accurate prediction model for hot spot residues, given the structure of a protein complex. A major contribution of this study is to propose several new features based on the protrusion index of amino acid residues, which are shown to significantly improve the prediction performance for hot spots. Moreover, we identify a compact and useful feature subset that has important implications for identifying hot spot residues. Our results indicate that these features are more effective than conventional evolutionary conservation, pairwise residue potentials and other traditional features considered previously, and that combining our features with traditional ones may support the creation of a discriminative feature set for efficient prediction of hot spot residues. The data and source code are available at http://home.ustc.edu.cn/~jfxia/hotspot.html.
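
    The abstract names "the F-score method" for feature selection without defining it. The sketch below assumes the widely used two-class definition: squared distances of the class means from the overall mean, divided by the sum of the within-class sample variances. It is an illustration under that assumption, not the APIS code; the function names, feature names, and values are hypothetical.

```python
from statistics import mean, variance


def f_score(pos_values, neg_values):
    """Two-class F-score of one feature (assumed definition, see lead-in).

    pos_values: feature values for hot spots; neg_values: for non-hot spots.
    Each class needs at least two values for the sample variance.
    """
    overall = mean(list(pos_values) + list(neg_values))
    between = (mean(pos_values) - overall) ** 2 + (mean(neg_values) - overall) ** 2
    within = variance(pos_values) + variance(neg_values)
    if within == 0:
        # Constant within both classes: perfectly separating or uninformative.
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(features):
    """Sort feature names by descending F-score.

    features maps a feature name to a (pos_values, neg_values) pair.
    """
    scored = {name: f_score(pos, neg) for name, (pos, neg) in features.items()}
    return sorted(scored, key=scored.get, reverse=True)


# Hypothetical data: one discriminative feature and one uninformative one.
example = {
    "protrusion_index": ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]),
    "noise_feature": ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),
}
print(rank_features(example))
```

    A larger score suggests the feature separates the two classes better; a cutoff on the ranked list then yields the selected subset.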

    Reliable in silico ranking of engineered therapeutic TCR binding affinities with MMPB/GBSA

    Accurate and efficient in silico ranking of protein–protein binding affinities is useful for protein design with applications in biological therapeutics. One popular approach to rank binding affinities is to apply the molecular mechanics Poisson-Boltzmann/generalized Born surface area (MMPB/GBSA) method to molecular dynamics (MD) trajectories. Here, we identify protocols that enable the reliable evaluation of T-cell receptor (TCR) variants binding to their target, peptide-human leukocyte antigens (pHLAs). We suggest different protocols for variant sets with a few (≤ 4) or many mutations, with entropy corrections important for the latter. We demonstrate how potential outliers could be identified in advance and that just 5-10 replicas of short (4 ns) MD simulations may be sufficient for the reproducible and accurate ranking of TCR variants. The protocols developed here can be applied toward in silico screening during the optimization of therapeutic TCRs, potentially reducing both the cost and time taken for biologic development.
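
    Ranking variants from several short replica simulations comes down to averaging the per-replica binding energy estimates and reporting their spread. The sketch below shows that bookkeeping only and is not the authors' protocol; the function name, the variant labels, and the energy values are hypothetical, and it assumes that a more negative energy means stronger predicted binding.

```python
from math import sqrt
from statistics import mean, stdev


def rank_variants(replica_energies):
    """Rank variants by mean binding energy across replicas.

    replica_energies maps a variant name to a list of per-replica binding
    energy estimates (kcal/mol). Returns (variant, mean, standard error)
    tuples, strongest predicted binder (most negative mean) first.
    """
    summary = []
    for variant, energies in replica_energies.items():
        if len(energies) < 2:
            raise ValueError("need at least two replicas per variant")
        average = mean(energies)
        std_error = stdev(energies) / sqrt(len(energies))
        summary.append((variant, average, std_error))
    return sorted(summary, key=lambda row: row[1])


# Hypothetical estimates from three replicas per variant.
example = {
    "variant_B": [-40.0, -42.0, -44.0],
    "variant_A": [-50.0, -52.0, -54.0],
}
ranked = rank_variants(example)
for name, average, std_error in ranked:
    print(f"{name}: {average:.1f} +/- {std_error:.2f} kcal/mol")
```

    Overlapping standard errors between neighbouring variants would suggest that more replicas are needed before the ranking can be trusted.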