4 research outputs found

    PLAS-5k: Dataset of Protein-Ligand Affinities from Molecular Dynamics for Machine Learning Applications

    Computational methods and, more recently, machine learning methods have played a key role in structure-based drug design. Though several benchmarking datasets are available for machine learning applications in virtual screening, accurate prediction of binding affinity for a protein-ligand complex remains a major challenge. New datasets that allow for the development of models that predict binding affinities better than state-of-the-art scoring functions are therefore important. For the first time, we have developed a dataset, PLAS-5k, comprising 5000 protein-ligand complexes chosen from the PDB database. The dataset consists of binding affinities along with energy components (electrostatic, van der Waals, polar and non-polar solvation energy) calculated from molecular dynamics simulations using the MMPBSA (Molecular Mechanics Poisson-Boltzmann Surface Area) method. The calculated binding affinities outperformed docking scores and showed a good correlation with the available experimental values. The availability of energy components may enable optimization of desired components during machine learning-based drug design. Further, the OnionNet model has been retrained on the PLAS-5k dataset and is provided as a baseline for the prediction of binding affinities.
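
    The four energy components named in the abstract can be combined into a single binding-energy estimate. The following is a minimal sketch, assuming the common MM-PBSA convention in which the estimate is the plain sum of the four terms with the entropy contribution omitted; the abstract does not state which convention PLAS-5k uses. The class name, field names, and numeric values are illustrative placeholders, not entries from the dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnergyComponents:
    """Hypothetical container for per-complex energy terms (kcal/mol)."""
    electrostatic: float
    van_der_waals: float
    polar_solvation: float
    nonpolar_solvation: float

    def binding_energy(self) -> float:
        # Assumed convention: sum of the four terms, entropy term omitted.
        return (
            self.electrostatic
            + self.van_der_waals
            + self.polar_solvation
            + self.nonpolar_solvation
        )


# Made-up placeholder values, NOT taken from PLAS-5k.
example = EnergyComponents(
    electrostatic=-30.0,
    van_der_waals=-45.0,
    polar_solvation=50.0,
    nonpolar_solvation=-5.0,
)
total = example.binding_energy()
print(f"Estimated binding energy: {total:.1f} kcal/mol")
```

    Keeping the terms separate, as the dataset does, lets a model target an individual component instead of only the total.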

    MolOpt: Autonomous Molecular Geometry Optimization using Multi-Agent Reinforcement Learning

    In this paper, we propose MolOpt, the first attempt of its kind to use Multi-Agent Reinforcement Learning (MARL) for autonomous molecular geometry optimization (MGO). MGO algorithms are typically hand-designed, but MolOpt uses MARL to learn an optimizer (policy) that can perform MGO without depending on other hand-designed optimizers. We cast MGO as a MARL problem in which each agent corresponds to a single atom in the molecule. MolOpt performs MGO by minimizing the forces on each atom in the molecule. Our experiments demonstrate that MolOpt generalizes to MGO of propane, pentane, heptane, hexane, and octane when trained on ethane, butane, and isobutane. In terms of performance, MolOpt outperforms the MDMin optimizer and performs similarly to the FIRE optimizer, but it does not surpass the BFGS optimizer. The results demonstrate that MolOpt has the potential to advance MGO by providing a novel approach based on reinforcement learning (RL), which may open up new research directions. Overall, this work serves as a proof of concept for the potential of MARL in MGO.
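
    To make "minimizing the forces on each atom" concrete, here is a minimal sketch of a hand-designed baseline, not MolOpt itself: plain steepest descent in which every atom steps along its own force until the largest force falls below a threshold. The harmonic two-atom potential, the step size, and the names `bond_forces`, `relax`, and `fmax` are assumptions chosen for illustration. A learned policy would replace the fixed update rule marked in the code.

```python
import numpy as np


def bond_forces(positions, k=1.0, r0=1.0):
    """Forces on two atoms joined by a harmonic bond, E = 0.5 * k * (r - r0)**2."""
    delta = positions[1] - positions[0]
    r = np.linalg.norm(delta)
    force_on_1 = -k * (r - r0) * delta / r  # pulls atom 1 back toward length r0
    return np.array([-force_on_1, force_on_1])


def relax(positions, step=0.1, fmax=1e-6, max_iter=10_000):
    """Steepest descent: move each atom along its force until max |F| < fmax."""
    positions = np.array(positions, dtype=float)
    n_steps = 0
    while n_steps < max_iter:
        f = bond_forces(positions)
        if np.linalg.norm(f, axis=1).max() < fmax:
            break
        # Fixed hand-designed rule; an RL agent per atom would choose this move.
        positions += step * f
        n_steps += 1
    return positions, n_steps


# Toy starting geometry: a bond stretched to 1.8 length units.
start = [[0.0, 0.0, 0.0], [1.8, 0.0, 0.0]]
final, n_steps = relax(start)
bond_length = np.linalg.norm(final[1] - final[0])
print(f"Converged in {n_steps} steps, bond length {bond_length:.6f}")
```

    The number of steps needed to reach the force threshold is the kind of quantity on which optimizers such as MDMin, FIRE, and BFGS are usually compared.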

    MeGen - generation of gallium metal clusters using reinforcement learning

    The generation of low-energy 3D structures of metal clusters depends on the efficiency of the search algorithm and the accuracy of the description of inter-atomic interactions. In this work, we formulate the search algorithm as a reinforcement learning (RL) problem. Specifically, we propose a novel actor-critic architecture that generates low-lying isomers of metal clusters at a fraction of the computational cost of conventional methods. Our RL-based search algorithm uses a previously developed DART model as a reward function, both to describe the inter-atomic interactions and to validate predicted structures. Using the DART model as a reward function incentivizes the RL model to generate low-energy structures and helps it generate valid structures. We demonstrate the advantages of our approach over conventional methods for scanning local minima on the potential energy surface. Our approach not only generates isomers of gallium clusters at minimal computational cost but also predicts isomer families that were not discovered through previous density-functional theory (DFT)-based approaches.
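
    The idea of using an energy model as a reward can be sketched as follows. This is a minimal illustration under stated assumptions: a Lennard-Jones pair potential in reduced units stands in for the DART model, the reward is taken to be the negative energy, and structures with overlapping atoms receive a fixed penalty. The function names, the `min_distance` cutoff, and the penalty value are hypothetical and do not come from the paper.

```python
import numpy as np


def pair_distances(coords):
    """All unique inter-atomic distances in a structure."""
    coords = np.asarray(coords, dtype=float)
    i, j = np.triu_indices(len(coords), k=1)
    return np.linalg.norm(coords[i] - coords[j], axis=1)


def stand_in_energy(coords, epsilon=1.0, sigma=1.0):
    """Lennard-Jones energy, used here only as a stand-in for a learned model."""
    sr6 = (sigma / pair_distances(coords)) ** 6
    return float(np.sum(4.0 * epsilon * (sr6**2 - sr6)))


def reward(coords, min_distance=0.7, invalid_penalty=-10.0):
    """Higher reward for lower energy; fixed penalty for overlapping atoms."""
    if np.any(pair_distances(coords) < min_distance):
        return invalid_penalty  # assumed validity check, not from the paper
    return -stand_in_energy(coords)


compact = [[0.0, 0.0, 0.0], [2.0 ** (1.0 / 6.0), 0.0, 0.0]]  # at the LJ minimum
stretched = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
overlapping = [[0.0, 0.0, 0.0], [0.5, 0.0, 0.0]]

for name, structure in [("compact", compact), ("stretched", stretched), ("overlapping", overlapping)]:
    print(name, round(reward(structure), 4))
```

    Because the reward rises as the energy falls, an agent maximizing it is pushed toward low-energy structures, while the penalty steers it away from invalid ones.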

    DART: Deep Learning Enabled Topological Interaction Model for Energy Prediction of Metal Clusters and its Application in Identifying Unique Low Energy Isomers

    Recently, Machine Learning (ML) has been shown to yield fast and accurate predictions of chemical properties, accelerating the discovery of novel molecules and materials. Most of this work concerns organic molecules, and much more remains to be done for inorganic molecules, especially clusters. In the present work, we introduce a simple Topological Atomic Descriptor called TAD, which encodes the chemical environment of each atom in the cluster. TAD is a simple and interpretable descriptor in which each value represents the atom count in one of three shells. We also introduce DART, the Deep Learning Enabled Topological Interaction model, which uses TAD as a feature vector to predict energies of metal clusters, in our case gallium clusters with sizes ranging from 31 to 70 atoms. The DART model is designed on the principle that energy is a function of atomic interactions, and it allows us to model these complex atomic interactions to predict the energy. We further introduce a new dataset called GNC_31-70, which comprises structures and DFT-optimized energies of gallium clusters with sizes ranging from 31 to 70 atoms. We show how DART can be used to accelerate the identification of ground-state structures without geometry optimization. Despite using only a topological descriptor, DART achieves an MAE of 3.59 kcal/mol (0.15 eV) on the test set. We also show that our model can distinguish core and surface atoms in the Ga-70 cluster, which the model has never encountered before. Finally, we demonstrate the transferability of the DART model by predicting energies for about 6k unseen configurations taken from Molecular Dynamics (MD) data for three cluster sizes (46, 57, and 60) within seconds. The DART model reduced the load on DFT optimizations while identifying unique low-energy structures from MD data.
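
    A descriptor that counts atoms in three shells around each atom can be sketched as below. This is an illustration of the general idea only: the abstract does not define the shells, so the shell radii, the use of plain Euclidean distance, and the function name `shell_counts` are all assumptions.

```python
import numpy as np


def shell_counts(coords, shell_edges=(3.0, 5.0, 7.0)):
    """Per-atom neighbor counts in three concentric distance shells.

    shell_edges are hypothetical outer radii; shell m covers distances in
    (edge[m-1], edge[m]], with the innermost shell starting at zero.
    """
    coords = np.asarray(coords, dtype=float)
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # an atom is not its own neighbor
    edges = (0.0, *shell_edges)
    counts = [
        ((dists > lo) & (dists <= hi)).sum(axis=1)
        for lo, hi in zip(edges[:-1], edges[1:])
    ]
    return np.stack(counts, axis=1)  # shape: (n_atoms, n_shells)


# Toy structure: four atoms on a line, spaced 2 length units apart.
toy_cluster = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [4.0, 0.0, 0.0], [6.0, 0.0, 0.0]]
descriptor = shell_counts(toy_cluster)
print(descriptor)
```

    In the toy chain the two end atoms and the two inner atoms get different rows, which hints at how shell counts can separate surface-like from core-like environments.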