9 research outputs found

    The performance of ensemble-based free energy protocols in computing binding affinities to ROS1 kinase

    Optimization of binding affinities for compounds to their target protein is a primary objective in drug discovery. Herein we report on a collaborative study that evaluates a set of compounds binding to ROS1 kinase. We use ESMACS (enhanced sampling of molecular dynamics with approximation of continuum solvent) and TIES (thermodynamic integration with enhanced sampling) protocols to rank the binding free energies. The predicted binding free energies from ESMACS simulations show good correlations with experimental data for subsets of the compounds. Consistent binding free energy differences are generated for TIES and ESMACS. Although an unexplained overestimation exists, we obtain excellent statistical rankings across the set of compounds from the TIES protocol, with a Pearson correlation coefficient of 0.90 between calculated and experimental activities
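
The ranking quality quoted in the abstract above is a Pearson correlation coefficient between calculated and experimental values. As a minimal sketch of how such a coefficient is computed, assuming two equal-length lists of free energies (the function name `pearson_r` and the numbers below are hypothetical illustrations, not data from the study):

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (unnormalised; the 1/n factors cancel in the ratio).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical binding free energies in kcal/mol, for illustration only.
calculated = [-9.1, -8.4, -10.2, -7.5, -8.9]
experimental = [-8.7, -8.0, -9.9, -7.9, -8.5]
r = pearson_r(calculated, experimental)
```

A value close to 1 indicates that the calculated values rank the compounds in nearly the same order as experiment, even if a systematic offset (such as the overestimation mentioned above) remains.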

    In Silico Simulation of DUSP-YIV906 Protein-Ligand Interactions and DUSP3-ERK Protein-Peptide Interactions

    The dual-specificity phosphatases (DUSPs) are a heterogeneous group of enzymes that modulate several critical cellular signaling pathways by dephosphorylating the phosphotyrosine and phosphoserine/phosphothreonine residues within their substrate proteins. One of the best-characterized subgroups of DUSPs is the mitogen-activated protein kinase phosphatases (MKPs), which act as antagonists of associated signaling cascades, including the extracellular signal-regulated kinase (ERK) pathways. Accumulating evidence has highlighted the therapeutic value of DUSPs, as deletion or inhibition of some DUSPs can increase the phosphorylation level of ERKs and cause cancer cell death. In this study, multi-scale molecular modeling simulations were first performed to investigate the mechanism of action of YIV-906, an herbal formulation used in cancer treatment that targets DUSP-ERK1/2 pathways. In total, molecular dynamics (MD) simulations and binding free energy calculations were performed for 99 DUSP-ligand complexes. Our results demonstrate that the sulfate and carboxyl moieties of the advantageous ligands, either original herbal chemicals or human metabolites from YIV-906, can occupy the enzymes’ catalytic sites, mimicking the endogenous phosphate substrates of DUSPs. The second aim was to improve the accuracy of protein-peptide docking between DUSP3 and a peptide fragment of ERK1/2. To this end, a new receptor residue mapping (RR mapping) algorithm was developed to identify hotspot residues on the surface of DUSP3 and improve the peptide docking scoring. By performing all-atom MD simulations with the receptor soaked in a water box containing 0.5 moles of capped dipeptides of the 20 natural amino acids (AAs) plus 3 phosphorylated non-standard AAs, the RR maps with probabilities of AAs interacting with DUSP3’s surface residues were obtained. 
With the interaction probabilities incorporated, the ERK peptide binding models yielded by protein-peptide docking can be re-ranked to generate more accurate predictions. We have demonstrated that multi-scale molecular modeling techniques are able to elucidate molecular mechanisms in complex molecular systems. Finally, our modeling study provides useful insights into the rational design of highly potent anti-cancer drugs targeting DUSPs, and the new RR mapping algorithm is a promising tool that can be universally applied in the characterization of protein-protein interactions (PPIs)

    Develop and Test a Solvent Accessible Surface Area-Based Model in Conformational Entropy Calculations

    It is of great interest in modern drug design to accurately calculate the free energies of protein–ligand or nucleic acid–ligand binding. MM-PBSA (molecular mechanics Poisson–Boltzmann surface area) and MM-GBSA (molecular mechanics generalized Born surface area) have gained popularity in this field. For both methods, the conformational entropy, which is usually calculated through normal-mode analysis (NMA), is needed to calculate the absolute binding free energies. Unfortunately, NMA is computationally demanding and becomes a bottleneck of the MM-PB/GBSA-NMA methods. In this work, we have developed a fast approach to estimate the conformational entropy based upon solvent accessible surface area calculations. In our approach, the conformational entropy of a molecule, S, can be obtained by summing up the contributions of all atoms, whether they are buried or exposed. Each atom has two types of surface areas, solvent accessible surface area (SAS) and buried SAS (BSAS). The two types of surface areas are weighted to estimate the contribution of an atom to S. Atoms having the same atom type share the same weight, and a general parameter k is applied to balance the contributions of the two types of surface areas. This entropy model was parametrized using a large set of small molecules whose conformational entropies were calculated at the B3LYP/6-31G* level, taking the solvent effect into account. The weighted solvent accessible surface area (WSAS) model was extensively evaluated in three tests. For convenience, TS values, the product of temperature T and conformational entropy S, were calculated in those tests. T was always set to 298.15 K throughout the text. 
First, good correlations were achieved between WSAS TS and NMA TS for 44 protein or nucleic acid systems sampled with molecular dynamics simulations (10 snapshots were collected for post-entropy calculations): the mean squared correlation coefficient (R²) was 0.56. For the 20 complexes, the TS changes upon binding, TΔS, were also calculated, and the mean R² was 0.67 between NMA and WSAS. In the second test, TS values were calculated for 12 protein decoy sets (each set has 31 conformations) generated by the Rosetta software package. Again, good correlations were achieved for all decoy sets: the mean, maximum, and minimum of R² were 0.73, 0.89, and 0.55, respectively. Finally, binding free energies were calculated for 6 protein systems (the number of inhibitors ranges from 4 to 18) using four scoring functions. Compared to the measured binding free energies, the mean R² values of the six protein systems were 0.51, 0.47, 0.40, and 0.43 for MM-GBSA-WSAS, MM-GBSA-NMA, MM-PBSA-WSAS, and MM-PBSA-NMA, respectively. The mean rms errors of prediction were 1.19, 1.24, 1.41, and 1.29 kcal/mol for the four scoring functions, respectively. Therefore, the two scoring functions employing WSAS achieved a prediction performance comparable to that of the scoring functions using NMA. It should be emphasized that no minimization was performed prior to the WSAS calculation in the last test. Although WSAS is not as rigorous as physical models such as quasi-harmonic analysis and thermodynamic integration (TI), it is computationally very efficient, as only surface area calculation is involved and no structural minimization is required. Moreover, WSAS has achieved a performance comparable to normal-mode analysis. We expect that this model could find applications in fields like high throughput screening (HTS), molecular docking, and rational protein design. 
In those fields, efficiency is crucial since there are a large number of compounds, docking poses, or protein models to be evaluated. A list of acronyms and abbreviations used in this work is provided for quick reference
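
The per-atom summation described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the functional form S = Σ w_type · (SAS + k · BSAS), the definition of BSAS as the full probe-expanded sphere area minus SAS, the 1.4 Å probe radius, and every name and number below are hypothetical choices, not the published parametrization.

```python
import math

# Assumption: a 1.4 angstrom water probe, a common choice in SAS calculations.
PROBE_RADIUS = 1.4


def wsas_entropy(atoms, weights, k):
    """Sum weighted surface-area contributions over all atoms.

    atoms   -- iterable of (atom_type, radius, sas) tuples, areas in square angstroms
    weights -- dict mapping atom_type to its weight (hypothetical values)
    k       -- general parameter balancing buried against exposed area
    """
    total = 0.0
    for atom_type, radius, sas in atoms:
        # Assumed definition: buried area is whatever part of the
        # probe-expanded sphere is not solvent accessible.
        full_area = 4.0 * math.pi * (radius + PROBE_RADIUS) ** 2
        bsas = max(full_area - sas, 0.0)
        total += weights[atom_type] * (sas + k * bsas)
    return total


# Hypothetical three-atom fragment; weights and k are placeholders.
example_atoms = [("C", 1.7, 12.0), ("N", 1.55, 5.0), ("O", 1.52, 30.0)]
example_weights = {"C": 0.02, "N": 0.03, "O": 0.04}
s_estimate = wsas_entropy(example_atoms, example_weights, k=0.5)
```

Because only surface areas are needed, a calculation of this shape avoids the minimization and Hessian diagonalization that make normal-mode analysis expensive, which is the efficiency argument made above.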

    Binding free energy calculations and molecular dynamics simulations on complexes of viral proteases with their ligands

    A major aim of biomolecular modelling is the calculation of the binding affinities ΔG of ligands to proteins, especially enzymes. The spectrum of methods that have been developed for this task ranges from theoretically exact but expensive approaches to simpler, more qualitative ones. While the latter are often empirical scoring functions using a single structure as input, the more complex methods require as complete a conformational space of the protein-ligand complex as possible, which can be sampled using methods such as molecular dynamics (MD). The intention of this thesis was to test and, where possible, further develop methods for the calculation of ΔG, in particular variants of the molecular mechanics Poisson-Boltzmann surface area (MMPBSA) method. Furthermore, the effects of specific resistance mutations on the structure and dynamics of proteins were to be determined using different metrics on MD simulation data. The first step in quantitative modelling with MD is the description of the molecules by parameterizing a force field. Such a molecular parameterization was performed for the non-standard amino acid sulpho-tyrosine. Subsequently, variants of the less expensive MMPBSA method were tested with regard to their convergence and their ability to determine accurate ΔG values, or at least to establish the correct ranking of ΔG values for a set of enzyme-ligand complexes. The variants differ in their solvation models and in the procedures used to calculate the entropy. 
As a molecular reference system, mutants of the HIV protease complexed with inhibitors were used. For these systems, experimental data are available to which the calculated values can be compared. At the other end of the methodological spectrum is the more expensive thermodynamic integration (TI). With a proper force field parameterization, TI should be able to quantitatively determine ΔG effects on the order of a few kJ/mol. This was tested on the HIV protease mutation L76V, which is known to lead to a resensitization (increased affinity) for some drugs. Finally, MD simulations were used to understand the molecular effects of mutations of the NS3/4A protease, an enzyme of the human hepatitis C virus, on the binding of ligands (substrate, inhibitors)

    Machine Learning based Protein Sequence to (un)Structure Mapping and Interaction Prediction

    Proteins are the fundamental macromolecules within a cell that carry out most biological functions. The computational study of protein structure and function, using machine learning and data analytics, is essential to advancing life-science research, given the fast-growing volume of biological data and the complexity involved in analyzing it to discover meaningful insights. We do not limit the mapping of a protein’s primary sequence to its structure; we extend it to the disordered component, known as intrinsically disordered proteins or regions (IDPs/IDRs), and hence to the dynamics involved, which help explain complex interactions within a cell that are otherwise obscured. The objective of this dissertation is to develop effective machine learning based tools to predict disordered proteins, their properties and dynamics, and their interaction paradigm by systematically mining and analyzing large-scale biological data. In this dissertation, we propose a robust framework to predict disordered proteins given only sequence information, using an optimized SVM with an RBF kernel. Through appropriate reasoning, we highlight the structure-like behavior of IDPs in disease-associated complexes. Further, we develop a fast and effective predictor of the Accessible Surface Area (ASA) of protein residues, a useful structural property that defines a protein’s exposure to partners, using regularized regression with a 3rd-degree polynomial kernel function and a genetic algorithm. As a key outcome of this research, we then introduce a novel method to extract the position specific energy (PSEE) of protein residues by modeling the pairwise thermodynamic interactions and the hydrophobic effect. PSEE is found to be an effective feature in identifying the enthalpy gain of the folded state of a protein and, otherwise, the neutral state of unstructured proteins. 
Moreover, we study the transient peptide-protein interactions that involve the induced folding of short peptides through disorder-to-order conformational changes to bind to an appropriate partner. Using the stacked generalization ensemble technique, a suite of predictors is developed to identify, from protein sequence, the residue patterns of Peptide-Recognition Domains that can recognize and bind to peptide motifs and phospho-peptides with post-translational modifications (PTMs) of amino acids, which are responsible for critical human diseases. The biologically relevant case studies demonstrate the possibilities of discovering new knowledge using the developed tools

    Therapeutic strategy to end Tuberculosis (TB) world: structural and functional characterization of potential weak hotspots of Mycobacterium tuberculosis molecular targets from combinatorial in silico perspective.

    Doctoral Degree. University of KwaZulu-Natal, Durban. The world has witnessed several decades of the tuberculosis (TB) pandemic and numerous advanced scientific efforts to control the invasiveness of the newly evolving Mycobacterium tuberculosis (Mtb) strains that give rise to drug resistance. TB has killed hundreds of millions of people and left millions maimed and in need of rehabilitation; in the last decade there have been about 10.0 million infections and 1.5 million deaths annually. Drug-resistant TB has remained more challenging than drug-susceptible TB over the previous 20 years and is associated with chromosomal mutations in selected Mtb genes. Notable mutations identified by biomarkers are related to phenotypic drug resistance; these include an 81 bp region in the rpoB gene with > 95 % of mutations in rifampicin (RIF) clinical isolates, and the katG gene promoter of the mabA-inhA, shown to be associated with INH resistance. Different strategies, including the recent WHO End TB approach, have been employed to alleviate or stop TB. The recent identification of the critical roles of the Mtb demethylmenaquinone methyltransferase (menG) target in survival, pathogenesis, virulence, and drug resistance has created an avenue for the development of efficacious therapeutics that can eradicate TB. MenG is a member of the methyltransferase superfamily. It catalyzes one of the last steps of the menaquinone biosynthesis pathway, which is required for maintenance of the Mtb cell envelope. The other two targets investigated in this work are the N-acetylglucosamine-6-phosphate deacetylase enzyme (NagA), which represents a critical enzymatic step in the production of the essential amino sugars required by Mtb for cell wall biosynthesis, and the secreted antigen 85C enzyme (Ag85C). The latter catalyzes the synthesis of trehalose derivatives and the attachment of mycolic acids. These targets have gained considerable attention in drug discovery pipelines. 
However, there is little information about menG; its structural dynamics are unknown owing to the lack of a crystal structure and of information on the active site regions and amino acids of its mycobacterial homologs. Similarly, the dynamics of the NagA and Ag85C protein structures are still unknown. This motivated the modelling of the 3D structure of menG to understand the structural and functional features that could be investigated at the atomistic level. Homology models were also created for the five (5) mycobacterial homologs. Furthermore, the pressing need for new drugs has led to the application of in silico techniques, including molecular modelling and molecular dynamics simulations, which give chemists the opportunity to evaluate and assess numerous compounds that could lead to potential drugs against the mycobacterial disease. These computational techniques justify the incorporation of several computational tools into this study to provide insights into the conformational changes that illuminate the potential inhibitory mechanism, and into the identification and characterization of the binding site amino acids. Here, we analyze the weak hotspot dynamics specific to each of the Mtb targets, most notably the loops and active residues around or within the ligand-binding sites, to obtain findings useful for the design of more efficacious potential antitubercular drugs. Molecular dynamics simulations were performed to gain molecular insight into the conformational binding of the experimental drugs that were reported to be highly effective against each respective target. The structural dynamics and motions of menG upon binding of the inhibitor (DG70, a biphenyl amide compound) were estimated. Additional in silico thermodynamic analyses were employed to gain insight into the binding mode of each inhibitor, mainly for the proposed binding site of menG, to identify the residues involved in binding. 
Sequence analysis of the homologs of the Mycobacterium tuberculosis NagA and Ag85C targets, including those of smegmatis, marinum, leprae, and ulcerans, was performed to identify unique sequence similarities and differences, together with the structural and functional characterization upon ligand binding. An experimental protocol led to the discovery of a selective covalent inhibitor of Ag85C SER-124, the β-isomer monocyclic enolphosphorus Cycliphostin. Moreover, chapter 4 also unravels the functional impact of the non-synonymous single nucleotide polymorphisms of the NagA target. The expectation is that the information drawn from this study will provide a structural outline for pharmaceutical scientists and molecular biologists to aid in the identification and design of novel antimycobacterial drugs, especially for TB

    Rapid, precise and reproducible binding affinity prediction: applications in drug discovery

    As we move towards an era of personalised medicine, the identification of lead compounds in the development of targeted therapies for cancer requires years of research and considerable financial backing. We use molecular modelling and simulation to screen a library of active compounds and to understand the ligand-protein interaction at the molecular level in appropriate protein targets, in a bid to identify the most active lead drug candidates. In recent times, good progress has been made in accurately predicting binding affinities for drug candidates. Advances in high-performance computation (HPC) mean it is now possible to run a larger number of calculations in parallel, paving the way for multiple replica simulations from which binding affinities are obtained. This allows for tighter control of errors and, in turn, higher confidence in the binding affinity predictions. Here, we present ESMACS (Enhanced Sampling of Molecular dynamics with Approximation of Continuum Solvent) and TIES (Thermodynamic Integration with Enhanced Sampling), a new framework from which binding affinities are calculated. ESMACS performs 25 replica simulations of the same ligand-receptor system, with the only difference being the initial momentum of each atom. From this ensemble of trajectories, an extended MMPBSA (Molecular Mechanics Poisson-Boltzmann Surface Area) free energy method is employed. The TIES protocol consists of 5 replica simulations per lambda state followed by the integration of the potential derivatives of each lambda state, generating a relative binding affinity. This is all tied together using the BAC (Binding Affinity Calculator), which automates the ESMACS and TIES workflows. Given suitable access to HPC resources, ESMACS and TIES can compute binding affinities in a matter of hours on a supercomputer; the size of such machines means that we can reach the industrial scale of demand necessary to impact drug discovery programmes
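
The TIES step described in the abstract above (several replicas per lambda state, followed by integration of the potential derivatives) can be sketched as below. This is a minimal illustration, not the BAC implementation: the trapezoidal rule, the function name `ti_free_energy`, and all numbers are assumptions made for the example.

```python
from statistics import mean


def ti_free_energy(lambdas, dudl_replicas):
    """Integrate replica-averaged dU/dlambda over lambda (trapezoidal rule).

    lambdas       -- increasing lambda values, e.g. from 0.0 to 1.0
    dudl_replicas -- one list per lambda state holding the time-averaged
                     dU/dlambda from each replica at that state
    """
    if len(lambdas) != len(dudl_replicas) or len(lambdas) < 2:
        raise ValueError("need matching lambda states and replica data")
    # Ensemble average over replicas at each lambda state.
    averages = [mean(replicas) for replicas in dudl_replicas]
    delta_g = 0.0
    for i in range(len(lambdas) - 1):
        width = lambdas[i + 1] - lambdas[i]
        delta_g += 0.5 * width * (averages[i] + averages[i + 1])
    return delta_g


# Hypothetical data: 3 lambda states, 5 replicas each (kcal/mol).
example_lambdas = [0.0, 0.5, 1.0]
example_dudl = [
    [-1.2, -1.0, -1.1, -0.9, -1.3],
    [0.4, 0.6, 0.5, 0.5, 0.5],
    [2.1, 1.9, 2.0, 2.2, 1.8],
]
example_delta_g = ti_free_energy(example_lambdas, example_dudl)
```

Running independent replicas at each lambda state also makes it possible to attach an error estimate to each average, which is the "tighter control of errors" the abstract refers to; that step is omitted here for brevity.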