128 research outputs found

    Molecular Docking Improvement: Coefficient Adaptive Genetic Algorithms for Multiple Scoring Functions

    In this paper, a coefficient-adaptive scoring method for molecular docking is presented to improve docking accuracy using multiple available scoring functions. Starting from a force-field scoring function, the proposed method also considers hydrophobic and deformation terms. Instead of a simple combination with fixed weights, the coefficients of each factor are adapted during the search procedure. To improve docking accuracy and stability, a knowledge-based scoring function is used as another scoring factor. A genetic algorithm with multi-population evolution and an entropy-based search technique with a narrowing search space is used to solve the optimization model for molecular docking. To evaluate the method, we carried out a numerical experiment with 134 protein-ligand complexes from the publicly available GOLD test set. The results confirmed that the method improved docking accuracy over individual force-field scoring. In addition, analyses were given to show the disadvantages of the individual scoring models. In comparison with other popular docking software, the proposed method showed higher accuracy. For more than 77% of the complexes, the docked results were within 1.0 Å Root-Mean-Square Deviation (RMSD) of the X-ray structure. The average computing time obtained here is 563.9 s.
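    The accuracy criterion quoted in this abstract (docked pose within 1.0 Å RMSD of the X-ray structure) can be illustrated with a minimal sketch. It assumes both poses list the same atoms in the same order and share a coordinate frame, so no superposition is performed; the function names `rmsd` and `success_rate` are hypothetical and are not taken from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two poses.

    Assumption: both inputs are equally long sequences of (x, y, z)
    tuples in angstroms, with atoms in the same order and already in
    the same reference frame (no fitting step is applied).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("poses must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(coords_a, coords_b)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(coords_a))


def success_rate(rmsd_values, cutoff=1.0):
    """Fraction of complexes whose docked pose is within `cutoff` angstroms."""
    return sum(r <= cutoff for r in rmsd_values) / len(rmsd_values)


if __name__ == "__main__":
    docked = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    crystal = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(docked, crystal))
    print(success_rate([0.5, 0.9, 1.5, 2.0]))
```

    Applied to a test set, `success_rate` gives the kind of figure reported above; the illustrative coordinates and RMSD values in the demo are made up.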

    Advances and Challenges in Protein-Ligand Docking

    Molecular docking is a widely used computational tool for the study of molecular recognition, which aims to predict the binding mode and binding affinity of a complex formed by two or more constituent molecules with known structures. An important type of molecular docking is protein-ligand docking because of its therapeutic applications in modern structure-based drug design. Here, we review recent advances in protein flexibility, ligand sampling, and scoring functions, the three important aspects of protein-ligand docking. Challenges and possible future directions are discussed in the Conclusion.

    Applying Computational Scoring Functions to Assess Biomolecular Interactions in Food Science: Applications to the Estrogen Receptors

    During the last decade, computational methods, which were for the most part developed to study protein-ligand interactions and especially to discover, design and develop drugs by and for medicinal chemists, have been successfully applied in a variety of food science applications [1,2]. It is now clear, in fact, that drugs and nutritional molecules behave in the same way when binding to a macromolecular target or receptor, and that many of the approaches used so extensively in medicinal chemistry can be easily transferred to the field of food science. For instance, nuclear receptors are common targets for a number of drug molecules and could be, in the same way, affected by the interaction with food or food-like molecules. Thus, key computational medicinal chemistry methods like molecular dynamics can be used to decipher protein flexibility and to obtain stable models for docking and scoring in food-related studies, and virtual screening is increasingly being applied to identify molecules with the potential to act as endocrine disruptors, food mycotoxins, and new nutraceuticals [3,4,5]. All of these methods and simulations are based on protein-ligand interaction phenomena, and represent the basis for any subsequent modification of the targeted receptor's or enzyme's physiological activity. We describe here the energetics of binding of biological complexes, providing a survey of the most common and successful algorithms used in evaluating these energetics, and we report case studies in which computational techniques have been applied to food science issues. In particular, we explore a handful of studies involving the estrogen receptors, in which we have a long-term interest.

    Investigating Peptide/RNA binding in Anti-HIV research by molecular simulations: electrostatic recognition and accelerated sampling

    Studying protein/RNA binding is of great biological and pharmaceutical importance. In the past two decades, RNA has gained growing attention in biomedical and pharmaceutical research due to its key roles in gene replication and expression [1, 2]. From a pharmaceutical point of view, the advantages of targeting RNA over the conventional protein targets include slower drug-resistance development, more selective inhibition, and lower cytotoxicity. Targeting RNA is, however, more challenging than targeting proteins. Designing RNA-binding drugs is limited by the lack of medicinal chemistry studies on RNA and the poor understanding of ligand/RNA molecular recognition mechanisms..

    Estimation of binding free energies with Monte Carlo atomistic simulations and enhanced sampling

    Advances in computing power have motivated the hope that computational methods can accelerate the pace of drug discovery pipelines. For this, fast, reliable and user-friendly tools are required. One of the problems that has received the most attention is the prediction of binding affinities. Two main problems have been identified for such methods: insufficient sampling and inaccurate models. This thesis is focused on tackling the first problem. To this end, we present the development of efficient methods for the estimation of protein-ligand binding free energies. We have developed a protocol that combines enhanced sampling with more standard simulation methods to achieve higher efficiency. Enhanced sampling methods are a class of tools that apply some kind of external perturbation to the system under study in order to accelerate its sampling. First, we run an exploratory enhanced sampling simulation, starting from the bound conformation and partially biased towards unbound poses. Then we leverage the information gained from this short simulation to run longer unbiased simulations to collect statistics. Thanks to the modularity and automation that the protocol offers, we were able to test three different methods for the long simulations: PELE, molecular dynamics and AdaptivePELE. PELE and molecular dynamics showed similar results, although PELE used fewer computational resources. Both seemed to work well with small protein-fragment systems or proteins with not very flexible binding sites. Both failed to accurately reproduce the binding of a kinase, the Mitogen-activated protein kinase 1 (ERK2). On the other hand, AdaptivePELE did not show a great improvement over PELE, with positive results for the Urokinase-type plasminogen activator (URO) and a clear lack of sampling for the Progesterone receptor (PR). We demonstrated the importance of a well-designed suite of test systems for the development of new methods. Through the use of a diverse benchmark of protein systems we have established the cases in which the protocol is expected to give accurate results, and which areas require further development. This benchmark consisted of four proteins and over 30 ligands, much larger than the test systems typically used in the development of pathway-based free energy methods. In summary, the methodology developed in this work can contribute to the drug discovery process for a limited range of protein systems. For many others, we have observed that regular unbiased simulations are not efficient enough and that more sophisticated enhanced sampling methods are required.

    Theoretical and computational modeling of RNA-ligand interactions

    Ribonucleic acid (RNA) is a polymeric nucleic acid that plays a variety of critical roles in gene expression and regulation at the level of transcription and translation. Recently, there has been enormous interest in the development of therapeutic strategies that target RNA molecules. Instead of modifying the products of gene expression, i.e., proteins, RNA-targeted therapeutics aim to modulate the relevant key RNA elements in disease-related cellular pathways. Such approaches have two significant advantages. First, diseases whose related proteins are difficult or impossible to drug become druggable by targeting the corresponding messenger RNAs (mRNAs) that encode the amino acid sequences. Second, besides coding mRNAs, the vast majority of human genome sequences are transcribed to noncoding RNAs (ncRNAs), which serve as enzymatic, structural, and regulatory elements in the cellular pathways of most human diseases. Targeting noncoding RNAs would open up remarkable new opportunities for disease treatment. The first step in modeling the RNA-drug interaction is to understand the 3D structure of the given RNA target. With current theoretical models, accurate prediction of 3D structures for large RNAs from sequence remains computationally infeasible. One of the major challenges comes from the flexibility of the RNA molecule, especially in loop/junction regions, and the resulting rugged energy landscape. However, structure probing techniques, such as the “selective 2′-hydroxyl acylation analyzed by primer extension” (SHAPE) experiment, enable the quantitative detection of the relative flexibility, and hence structure information, of RNA structural elements. Therefore, one may incorporate SHAPE data into RNA 3D structure prediction.
    In the first project, we investigate the feasibility of using a machine-learning-based approach to predict the SHAPE reactivity from the 3D RNA structure and compare the machine-learning result to that of a physics-based model. In the second project, in order to provide a user-friendly tool for RNA biologists, we developed a fully automated web interface, “SHAPE predictoR” (SHAPER), for predicting the SHAPE profile from any given 3D RNA structure. In a cellular environment, various factors, such as metal ions and small molecules, interact with an RNA molecule to modulate its cellular activity. RNA is a highly charged polymer, with each backbone phosphate group carrying one unit of negative (electronic) charge. In order to fold into a compact functional tertiary structure, it requires metal ions to reduce the repulsive Coulombic forces by neutralizing the backbone charges. In particular, the Mg²⁺ ion is essential for the folding and stability of RNA tertiary structures. In the third project, we introduce a machine-learning-based model, the “Magnesium convolutional neural network” (MgNet) model, to predict Mg²⁺ binding sites for a given 3D RNA structure, and show the use of the model in investigating the important coordinating RNA atoms and identifying novel Mg²⁺ binding motifs. Besides Mg²⁺ ions, small molecules, such as drug molecules, can also bind to an RNA to modulate its activities. Motivated by the tremendous potential of RNA-targeted drug discovery, in the fourth project, we develop a novel approach to predicting RNA-small molecule binding. Specifically, we develop a statistical potential-based scoring/ranking method (SPRank) to identify the native binding mode of the small molecule from a pool of decoys and estimate the binding affinity for the given RNA-small molecule complex. The results tested on a widely used data set suggest that SPRank can achieve (moderately) better performance than the current state-of-the-art models.
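    The idea of ranking candidate binding modes with a statistical potential can be sketched generically. The example below is not the SPRank method, whose functional form is not given in this abstract; it assumes a plain inverse-Boltzmann potential over (atom-pair type, distance bin) counts with pseudocounts, and every name in it (`train_potential`, `score_pose`, `rank_poses`, the pair labels) is hypothetical.

```python
import math
from collections import Counter


def train_potential(observed_pairs, reference_pairs, kT=1.0, pseudo=1.0):
    """Inverse-Boltzmann energies E(k) = -kT * ln(f_obs(k) / f_ref(k)).

    Each pair is a hashable key such as ("N-O", 3): an atom-pair type and
    a distance-bin index. Pseudocounts keep unseen keys finite.
    """
    obs, ref = Counter(observed_pairs), Counter(reference_pairs)
    keys = set(obs) | set(ref)
    n_obs = sum(obs.values()) + pseudo * len(keys)
    n_ref = sum(ref.values()) + pseudo * len(keys)
    return {
        k: -kT * math.log(((obs[k] + pseudo) / n_obs) / ((ref[k] + pseudo) / n_ref))
        for k in keys
    }


def score_pose(potential, pose_pairs):
    """Sum pair energies for one pose; keys never seen in training contribute 0."""
    return sum(potential.get(k, 0.0) for k in pose_pairs)


def rank_poses(potential, poses):
    """Return pose names ordered from lowest (most favorable) to highest score."""
    return sorted(poses, key=lambda name: score_pose(potential, poses[name]))


if __name__ == "__main__":
    # Made-up counts: N-O contacts in bin 3 are enriched in "native" complexes.
    observed = [("N-O", 3)] * 8 + [("C-C", 4)] * 2
    reference = [("N-O", 3)] * 5 + [("C-C", 4)] * 5
    potential = train_potential(observed, reference)
    poses = {
        "native": [("N-O", 3), ("N-O", 3)],
        "decoy": [("C-C", 4), ("N-O", 3)],
    }
    print(rank_poses(potential, poses))
```

    Contacts that occur more often in known complexes than in the reference state get negative energies, so poses rich in such contacts rank first; the counts in the demo are invented for illustration only.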