
    The Synthesizability of Molecules Proposed by Generative Models

    The discovery of functional molecules is an expensive and time-consuming process, exemplified by the rising costs of small molecule therapeutic discovery. One class of techniques of growing interest for early-stage drug discovery is de novo molecular generation and optimization, catalyzed by the development of new deep learning approaches. These techniques can suggest novel molecular structures intended to maximize a multi-objective function, e.g., suitability as a therapeutic against a particular target, without relying on brute-force exploration of a chemical space. However, the utility of these approaches is stymied by ignorance of synthesizability. To highlight the severity of this issue, we use a data-driven computer-aided synthesis planning program to quantify how often molecules proposed by state-of-the-art generative models cannot be readily synthesized. Our analysis demonstrates that there are several tasks for which these models generate unrealistic molecular structures despite performing well on popular quantitative benchmarks. Synthetic complexity heuristics can successfully bias generation toward synthetically tractable chemical space, although doing so necessarily detracts from the primary objective. This analysis suggests that to improve the utility of these models in real discovery workflows, new algorithm development is warranted.

    Computer Aided Synthesis Prediction to Enable Augmented Chemical Discovery and Chemical Space Exploration

    The drug-like chemical space is estimated to contain 10^60 molecules, and the largest generated database (GDB) obtained by the Reymond group is 165 billion molecules with up to 17 heavy atoms. Furthermore, deep learning techniques to explore regions of chemical space are becoming more popular. However, the key to realizing the generated structures experimentally lies in chemical synthesis, the application of which was previously limited to manual planning or slow computer-assisted synthesis planning (CASP) models. Despite the 60-year history of CASP, few synthesis planning tools have been open-sourced to the community. In this thesis I co-led the development of and investigated one of the only fully open-source synthesis planning tools, called AiZynthFinder, trained on both public and proprietary datasets consisting of up to 17.5 million reactions. This enables synthesis-guided exploration of the chemical space in a high-throughput manner, to bridge the gap between compound generation and experimental realisation. I first investigate both public and proprietary reaction data, and their influence on route-finding capability. Furthermore, I develop metrics for assessment of retrosynthetic prediction, single-step retrosynthesis models, and automated template extraction workflows. This is supplemented by a comparison of the underlying datasets and their corresponding models. Given the prevalence of ring systems in the GDB and wider medicinal chemistry domain, I developed 'Ring Breaker', a data-driven approach to enable the prediction of ring-forming reactions. I demonstrate its utility on frequently found and unprecedented ring systems, in agreement with literature syntheses. Additionally, I highlight its potential for incorporation into CASP tools, and outline methodological improvements that improve route-finding capability. 
To tackle the challenge of model throughput, I report a machine learning (ML) based classifier called the retrosynthetic accessibility score (RAscore), to assess the likelihood of finding a synthetic route using AiZynthFinder. The RAscore computes at least 4,500 times faster than AiZynthFinder. This opens the possibility of pre-screening millions of virtual molecules from enumerated databases or generative models for synthesis-informed compound prioritization. Finally, I combine chemical library visualization with synthetic route prediction to facilitate experimental engagement with synthetic chemists. I enable the navigation of chemical property space by using interactive visualization to deliver associated synthetic data as endpoints. This aids in the prioritization of compounds. The ability to view synthetic route information alongside structural descriptors facilitates a feedback mechanism for the improvement of CASP tools and enables rapid hypothesis testing. I demonstrate the workflow as applied to the GDB databases to augment compound prioritization and synthetic route design.

    In silico Strategies to Support Fragment-to-Lead Optimization in Drug Discovery.

    Fragment-based drug (or lead) discovery (FBDD or FBLD) has developed in the last two decades to become a successful key technology in the pharmaceutical industry for early-stage drug discovery and development. The FBDD strategy consists of screening low molecular weight compounds against macromolecular targets (usually proteins) of clinical relevance. These small molecular fragments can bind at one or more sites on the target and act as starting points for the development of lead compounds. In developing the fragments, attractive features that can translate into compounds with favorable physical, pharmacokinetic, and toxicity (ADMET: absorption, distribution, metabolism, excretion, and toxicity) properties can be integrated. Structure-enabled fragment screening campaigns use a combination of screening by a range of biophysical techniques, such as differential scanning fluorimetry, surface plasmon resonance, and thermophoresis, followed by structural characterization of fragment binding using NMR or X-ray crystallography. Structural characterization is also used in subsequent analysis for growing fragments of selected screening hits. The latest iteration of the FBDD workflow employs a high-throughput methodology of massively parallel screening by X-ray crystallography of individually soaked fragments. In this review we will outline the FBDD strategies and explore a variety of in silico approaches to support the follow-up fragment-to-lead optimization by growing, linking, or merging. These fragment expansion strategies include hot spot analysis, druggability prediction, SAR (structure-activity relationships) by catalog methods, application of machine learning/deep learning models for virtual screening, and several de novo design methods for proposing synthesizable new compounds. Finally, we will highlight recent case studies in fragment-based drug discovery where in silico methods have successfully contributed to the development of lead compounds.

    Retrosynthetic reaction prediction using neural sequence-to-sequence models

    We describe a fully data-driven model that learns to perform a retrosynthetic reaction prediction task, which is treated as a sequence-to-sequence mapping problem. The end-to-end trained model has an encoder-decoder architecture consisting of two recurrent neural networks, an architecture that has previously shown great success in solving other sequence-to-sequence prediction tasks such as machine translation. The model is trained on 50,000 experimental reaction examples from the United States patent literature, which span 10 broad reaction types that are commonly used by medicinal chemists. We find that our model performs comparably with a rule-based expert system baseline model, and also overcomes certain limitations associated with rule-based expert systems and with any machine learning approach that contains a rule-based expert system component. Our model provides an important first step towards solving the challenging problem of computational retrosynthetic analysis.

    A Combination of Receptor-Based Pharmacophore Modeling & QM Techniques for Identification of Human Chymase Inhibitors

    Inhibition of chymase is likely to reveal therapeutic approaches for the treatment of cardiovascular diseases and fibrotic disorders. To find novel and potent chymase inhibitors and to provide a new idea for drug design, we used both ligand-based and structure-based methods to perform virtual screening (VS) of commercially available databases. Different pharmacophore models generated from various crystal structures of the enzyme may depict diverse inhibitor binding modes. Therefore, a multiple pharmacophore-based approach is applied in this study. X-ray crystallographic data of chymase in complex with different inhibitors were used to generate four structure-based pharmacophore models. One ligand-based pharmacophore model was also developed from experimentally known inhibitors. After successful validation, all pharmacophore models were employed in database screening to retrieve hits with novel chemical scaffolds. Drug-like hit compounds were subjected to molecular docking using GOLD and AutoDock. Finally, four structurally diverse compounds with high GOLD scores and binding affinity for several crystal structures of chymase were selected as final hits. Identification of the final hits by three different pharmacophore models underscores the need for a multiple pharmacophore-based approach in the VS process. Quantum mechanical calculations were also conducted to analyze the electrostatic characteristics of the compounds, which illustrate their significant role in driving the inhibitor to adopt a suitable bioactive conformation oriented in the active site of the enzyme. In general, this study serves as an example to illustrate how a multiple pharmacophore approach can be useful in identifying structurally diverse hits which may bind to all possible bioactive conformations available in the active site of the enzyme. The strategy used in the current study could also be appropriate for designing drugs for other enzymes.

    Identification of Potential Insect Growth Inhibitor against Aedes aegypti: A Bioinformatics Approach

    Aedes aegypti is the main vector that transmits viral diseases such as dengue, hemorrhagic dengue, urban yellow fever, zika, and chikungunya. Worldwide, many cases of dengue have been reported in recent years, showing significant growth. The best way to manage diseases transmitted by Aedes aegypti is to control the vector with insecticides, which have already been shown to be toxic to humans; moreover, insects have developed resistance. Thus, the development of new insecticides is considered an emergency. One way to achieve this goal is to apply computational methods based on ligands and target information. In this study, sixteen compounds with acceptable insecticidal activities, with 100% larvicidal activity at low concentrations (2.0 to 0.001 mg·L⁻¹), were selected from the literature. These compounds were used to build and validate pharmacophore models. Pharmacophore model 6 (AUC = 0.78; BEDROC = 0.6) was used to filter 4793 compounds from the subset of lead-like compounds from the ZINC database; 4142 compounds (dG < 0 kcal/mol) were then aligned to the active site of the juvenile hormone receptor of Aedes aegypti (PDB: 5V13), and 2240 compounds (LE < -0.40 kcal/mol) were prioritized for molecular docking against a chitin deacetylase model of Aedes aegypti constructed by homology modeling from the Bombyx mori species (PDB: 5ZNT), which aligned 1959 compounds (dG < 0 kcal/mol); 20 compounds (LE < -0.4 kcal/mol) were then submitted to in silico pharmacokinetic and toxicological prediction (Preadmet, SwissADMET, and eMolTox programs). Finally, the theoretical routes of compounds M01, M02, M03, M04, and M05 were proposed. Compounds M01-M05 were selected, showing significant differences in pharmacokinetic and toxicological parameters in relation to positive controls and interaction with catalytic residues among key protein sites reported in the literature. 
For this reason, the molecules investigated here are dual inhibitors of the enzymes chitin synthase and juvenile hormonal protein from insects and humans, characterizing them as potential insecticides against the Aedes aegypti mosquito. Laboratory of Cellular Immunology Applied to Health of the Oswaldo Cruz Foundation (FIOCRUZ); Department of Pharmaceutical and Organic Chemistry, Faculty of Pharmacy of the University of Granada (Spain); Researcher Assistance Program-PAPESQ/UNIFA

    In silico design of cyclic peptides as influenza virus, a subtype H1N1 neuraminidase inhibitor

    Influenza has become a global public health concern because it is responsible for significant morbidity and mortality due to annual epidemics and unpredictable pandemics. There are only limited options to control this respiratory disease. Vaccine treatment is ineffective for controlling this disease because of mutations in the influenza virus. The influenza virus is also resistant to some antiviral drugs, such as oseltamivir and zanamivir, which inhibit neuraminidase. Another solution for controlling this virus is to find new antiviral drug designs. Cyclic peptides can be used to create new antiviral drug designs, especially to inhibit neuraminidase activity, by using the 'structure-based design' method. Based on molecular docking, new antiviral drug designs have been found: DNY, NNY, DDY, DYY, RRR, RPR, RRP and LRL. These cyclic peptides showed better activity and affinity than the standard ligand in inhibiting neuraminidase activity. From the drug scan, the DNY, NNY and LRL ligands have low toxicity and were predicted to have at least a 59% probability of being synthesizable in wet-laboratory experiments. Key words: Influenza virus A, neuraminidase, cyclic peptide, structure-based design, molecular docking