
    Curiosity in exploring chemical spaces: Intrinsic rewards for deep molecular reinforcement learning

    Computer-aided design of molecules has the potential to disrupt the field of drug and material discovery. Machine learning, and deep learning in particular, have been developing at a rapid pace in this field. Reinforcement learning is a particularly promising approach since it allows for molecular design without prior knowledge. However, the search space is vast, and efficient exploration is desirable when using reinforcement learning agents. In this study, we propose an algorithm to aid efficient exploration. The algorithm is inspired by a concept known in the literature as curiosity. We show on three benchmarks that a curious agent finds better-performing molecules. This indicates an exciting new research direction for reinforcement learning agents that can explore the chemical space out of their own motivation, which could eventually lead to unexpected new molecules that no human has thought about so far.
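
    The abstract does not say how the curiosity signal is computed, so the following is only a minimal sketch of one common way to build an intrinsic reward: a count-based novelty bonus that shrinks each time the agent revisits the same molecule. The class `CountBasedCuriosity`, the function `total_reward`, the `scale` parameter, and the use of SMILES strings as state keys are all assumptions made for illustration, not the authors' algorithm.

```python
import math
from collections import Counter


class CountBasedCuriosity:
    """Hypothetical intrinsic-reward helper (not the paper's method).

    The bonus for a molecule is scale / sqrt(n), where n is the number of
    times that molecule has been seen, including the current visit.
    """

    def __init__(self, scale=1.0):
        self.scale = scale
        self.visits = Counter()

    def bonus(self, smiles):
        # Record the visit, then return a bonus that decays with the count.
        self.visits[smiles] += 1
        return self.scale / math.sqrt(self.visits[smiles])


def total_reward(extrinsic, smiles, curiosity):
    """Add the curiosity bonus to the task (extrinsic) reward."""
    return extrinsic + curiosity.bonus(smiles)
```

    With this kind of bonus, a molecule seen for the first time earns the full `scale`, while repeated proposals of the same molecule earn progressively less, which nudges the agent toward unexplored regions of chemical space.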

    Machine Learning Model for Repurposing Drugs to Target Viral Diseases

    With recent events such as the Covid-19 pandemic, it is increasingly important to develop strategies to combat viral diseases. Due to technological advancements, computer-aided drug design and machine learning (ML)-based hit identification strategies have gained popularity. Applying these techniques to identify novel scaffolds and/or repurpose existing therapeutics for viral diseases is a promising approach. To improve existing classification models for antiviral applications, this thesis aimed to improve the selection of non-binding data within these models. We created a classification model using molecular fingerprints to assess the performance of machine learning predictions when the model is trained on randomly selected versus rationally selected non-binding datasets. Our analyses revealed that machine learning predictions can be improved using a rational selection approach. We further used this approach to train three machine learning models based on XGBoost, Random Forest, and Support Vector Machine to predict potential inhibitors of the SARS-CoV-2 main protease (Mpro) enzyme. Probability-ranked hits from the combined model were further analyzed using classical structure-based methods. The binding modes and affinities of the hits were identified using AutoDock Vina and MM-GBSA calculations based on molecular dynamics simulations. The top hits identified from this multi-step screening approach revealed potential candidates that show better affinity and stability than existing non-covalent Mpro inhibitors. Thus, our approach and the model could be useful for screening large ligand libraries.
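
    The abstract mentions probability-ranked hits from a combined model but does not state how the three models are combined. As a rough sketch under that caveat, one simple option is to average the per-model predicted probabilities for each compound and sort in descending order. The function `rank_hits`, the averaging rule, and the input layout are assumptions for illustration only.

```python
from statistics import mean


def rank_hits(probabilities):
    """Rank compounds by their mean predicted probability of binding.

    `probabilities` maps a compound identifier to a list of per-model
    probabilities (for example one value each from three classifiers).
    Averaging is an assumed combination rule, not taken from the source.
    """
    combined = {cid: mean(values) for cid, values in probabilities.items()}
    # Highest combined probability first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)
```

    The top of the resulting list would then be the natural input to slower structure-based steps such as docking.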

    TeachOpenCADD: a teaching platform for computer-aided drug design using open source packages and data

    Owing to the increase in freely available software and data for cheminformatics and structural bioinformatics, research in computer-aided drug design (CADD) is increasingly built on modular, reproducible, and easy-to-share pipelines. While documentation for such tools is available, there are only a few freely accessible examples that teach the underlying CADD concepts, especially to users new to the field. Here, we present TeachOpenCADD, a teaching platform developed by students for students, using open source compound and protein data as well as basic and CADD-related Python packages. We provide interactive Jupyter notebooks for central CADD topics, integrating theoretical background and practical code. TeachOpenCADD is freely available on GitHub: https://github.com/volkamerlab/TeachOpenCAD

    Retrosynthetic reaction prediction using neural sequence-to-sequence models

    We describe a fully data-driven model that learns to perform a retrosynthetic reaction prediction task, which is treated as a sequence-to-sequence mapping problem. The end-to-end trained model has an encoder-decoder architecture consisting of two recurrent neural networks, an architecture that has previously shown great success in other sequence-to-sequence prediction tasks such as machine translation. The model is trained on 50,000 experimental reaction examples from the United States patent literature, which span 10 broad reaction types that are commonly used by medicinal chemists. We find that our model performs comparably with a rule-based expert system baseline model, and also overcomes certain limitations associated with rule-based expert systems and with any machine learning approach that contains a rule-based expert system component. Our model provides an important first step towards solving the challenging problem of computational retrosynthetic analysis.

    11th German Conference on Chemoinformatics (GCC 2015) : Fulda, Germany. 8-10 November 2015.
