6 research outputs found

PLAS-5k: Dataset of Protein-Ligand Affinities from Molecular Dynamics for Machine Learning Applications

Author: Bhati Agastya P
Garg Akshit
Jeurkar Shruti
Korlepara Divya B
Kumar Vishal
Mehta Sarvesh
Modee Rohit
Muvva Charuvaka
Nayar Divya
Pal Pradeep Kumar
Priyakumar U Deva
Roy Subhajit
Sharma Shubham
Sridharan Bhuvanesh
Vasavi CS
Publication venue: Nature Portfolio
Publication date: 07/09/2022

Computational methods and, more recently, modern machine learning methods have played a key role in structure-based drug design. Though several benchmarking datasets are available for machine learning applications in virtual screening, accurate prediction of binding affinity for a protein-ligand complex remains a major challenge. New datasets that allow for the development of models that predict binding affinities more accurately than state-of-the-art scoring functions are important. For the first time, we have developed a dataset, PLAS-5k, comprising 5000 protein-ligand complexes chosen from the PDB database. The dataset consists of binding affinities along with energy components such as electrostatic, van der Waals, polar and non-polar solvation energies calculated from molecular dynamics simulations using the MMPBSA (Molecular Mechanics Poisson-Boltzmann Surface Area) method. The calculated binding affinities outperformed docking scores and showed a good correlation with the available experimental values. The availability of energy components may enable optimization of desired components during machine learning-based drug design. Further, the OnionNet model has been retrained on the PLAS-5k dataset and is provided as a baseline for the prediction of binding affinities.

UCL Discovery

PubMed Central

Supramolecular Polymerization of N,N′,N″,N‴-tetra-(Tetradecyl)-1,3,6,8-pyrenetetracarboxamide: A Computational Study

Author: Divya B. Korlepara
Karteek K. Bejagam
Sundaram Balasubramanian

The role of molecular dipole orientations and intermolecular interactions in the supramolecular self-assembly of a pyrene derivative in solution has been investigated using quantum chemical and force-field-based computational approaches. Five possible dipole configurations of the molecule have been examined, among which the one in which adjacent dipole vectors are antiparallel to each other is determined to be the ground state, on electrostatic grounds. Self-assembly of this molecule under realistic conditions has been studied using MD simulations. Dipolar relaxation in its liquid crystalline (LC) phase has been investigated and contrasted against that in the well-established benzene-1,3,5-tricarboxamide (BTA) family. The dihedral barrier related to the amide dipole flip is larger in the pyrene system than in BTA, which explains the differences in their dipolar relaxation behaviors. The mechanism underlying polarization switching upon the application of an external electric field in the LC phase is investigated. Unlike in BTA, this switching is not associated with a reversal of the helical sense of the hydrogen-bonded chains, due to differences in molecular symmetry. The observations enable general conclusions to be drawn on the relationship between electric-field-induced chiral enhancement and symmetry.

The Francis Crick Institute

PLAS-20k: Extended Dataset of Protein-Ligand Affinities from MD Simulations for Machine Learning Applications

Author: Aathira G. Nair
Divya B Korlepara
Divya Nayar
Indhu Ramachandran
Kavita Thakran
Pradeep Kumar Pal
Prathit Chatterjee
Rakesh Srivastava
Reena Jaglan
Saalim H. Raza
Sanjana Pandey
Shivam Pandit
Shivangi Verma
Shruti Jeurkar
Shubham Sharma
U. Deva Priyakumar
Vasavi C.S.
Vishal Kumar
Publication date: 07/08/2023

Computing binding affinities is of great importance in the drug discovery pipeline, and their prediction using advanced machine learning methods remains a major challenge because existing datasets and models do not consider the dynamic features of protein-ligand interactions. To this end, we have developed the PLAS-20k dataset, an extension of the previously developed PLAS-5k, with 97,500 independent simulations on a total of 19,500 different protein-ligand complexes. Our results show good correlation with the available experimental values, performing better than docking scores. This holds true even for a subset of ligands that follow Lipinski’s rule, and for diverse clusters of complex structures, thereby highlighting the importance of the PLAS-20k dataset in developing new ML models. In addition, our dataset classifies strong and weak binders better than docking does. Further, the OnionNet model has been retrained on the PLAS-20k dataset and is provided as a baseline for the prediction of binding affinities. We believe that large-scale MD-based datasets along with trajectories will form new synergy, paving the way for accelerating drug discovery.

ChemRxiv

Author Correction: PLAS-20k: Extended Dataset of Protein-Ligand Affinities from MD Simulations for Machine Learning Applications

Author: Aathira G. Nair
C. S. Vasavi
Divya B. Korlepara
Divya Nayar
Indhu Ramachandran
Kavita Thakran
Pradeep Kumar Pal
Prathit Chatterjee
Rakesh Srivastava
Reena Jaglan
Saalim H. Raza
Sanjana Pandey
Shivam Pandit
Shivangi Verma
Shruti Jeurkar
Shubham Sharma
U. Deva Priyakumar
Vishal Kumar
Publication venue: Nature Portfolio
Publication date: 01/07/2024

Directory of Open Access Journals

PLAS-20k: Extended Dataset of Protein-Ligand Affinities from MD Simulations for Machine Learning Applications

Author: Aathira G. Nair
Divya B. Korlepara
Divya Nayar
Indhu Ramachandran
Kavita Thakran
Pradeep Kumar Pal
Prathit Chatterjee
Rakesh Srivastava
Reena Jaglan
Saalim H. Raza
Sanjana Pandey
Shivam Pandit
Shivangi Verma
Shruti Jeurkar
Shubham Sharma
U. Deva Priyakumar
Vasavi C. S.
Vishal Kumar
Publication venue: Nature Portfolio
Publication date: 01/02/2024

Computing binding affinities is of great importance in the drug discovery pipeline, and their prediction using advanced machine learning methods remains a major challenge because existing datasets and models do not consider the dynamic features of protein-ligand interactions. To this end, we have developed the PLAS-20k dataset, an extension of the previously developed PLAS-5k, with 97,500 independent simulations on a total of 19,500 different protein-ligand complexes. Our results show good correlation with the available experimental values, performing better than docking scores. This holds true even for a subset of ligands that follow Lipinski’s rule, and for diverse clusters of complex structures, thereby highlighting the importance of the PLAS-20k dataset in developing new ML models. In addition, our dataset classifies strong and weak binders better than docking does. Further, the OnionNet model has been retrained on the PLAS-20k dataset and is provided as a baseline for the prediction of binding affinities. We believe that large-scale MD-based datasets along with trajectories will form new synergy, paving the way for accelerating drug discovery.

Directory of Open Access Journals