1,640 research outputs found

    Using a Genetic Algorithm to Find Molecules with Good Docking Scores

    A graph-based genetic algorithm (GA) is used to identify molecules (ligands) with high absolute docking scores as estimated by the Glide software package, starting from randomly chosen molecules from the ZINC database, for four different targets: Bacillus subtilis chorismate mutase (CM), human β2-adrenergic G protein-coupled receptor (β2AR), the DDR1 kinase domain (DDR1), and β-cyclodextrin (BCD). By the combined use of functional group filters and a score modifier based on a heuristic synthetic accessibility (SA) score, our approach identifies between ca. 500 and 6,000 structurally diverse molecules with scores better than known binders by screening a total of 400,000 molecules starting from 8,000 randomly selected molecules from the ZINC database. Screening 250,000 molecules from the ZINC database identifies significantly more molecules with better docking scores than known binders, with the exception of CM, where the conventional screening approach only identifies 60 compounds compared to 511 with GA+Filter+SA. In the case of β2AR and DDR1, the GA+Filter+SA approach finds significantly more molecules with docking scores lower than −9.0 and −10.0. The GA+Filter+SA docking methodology is thus effective in generating a large and diverse set of synthetically accessible molecules with very good docking scores for a particular target. An early incarnation of the GA+Filter+SA approach was used to identify potential binders to the COVID-19 main protease, which were submitted to the early stages of the COVID Moonshot project, a crowd-sourced initiative to accelerate the development of a COVID antiviral.

    Anthropogenic reaction parameters - the missing link between chemical intuition and the available chemical space

    How do skilled synthetic chemists develop such good intuitive expertise? Why can we only access such a small amount of the available chemical space, both in terms of the reactions used and the chemical scaffolds we make? We argue here that these seemingly unrelated questions have a common root and are strongly interdependent. We performed a comprehensive analysis of organic reaction parameters dating back to 1771 and discovered that there are several anthropogenic factors that limit the reaction parameters and thus the scope of synthetic chemistry. Nevertheless, many of the anthropogenic limitations, such as the narrow parameter space and the opportunity for rapid and clear feedback on the progress of reactions, appear to be crucial for the acquisition of valid and reliable chemical intuition. In parallel, however, all of these same factors represent limitations for the exploration of available chemistry space, and we argue that these are thus at least partly responsible for limited access to new chemistries. We advocate, therefore, that the present anthropogenic boundaries can be expanded by a more conscious exploration of “off-road” chemistry that would also extend the intuitive knowledge of trained chemists.

    The Synthesizability of Molecules Proposed by Generative Models

    The discovery of functional molecules is an expensive and time-consuming process, exemplified by the rising costs of small molecule therapeutic discovery. One class of techniques of growing interest for early-stage drug discovery is de novo molecular generation and optimization, catalyzed by the development of new deep learning approaches. These techniques can suggest novel molecular structures intended to maximize a multi-objective function, e.g., suitability as a therapeutic against a particular target, without relying on brute-force exploration of a chemical space. However, the utility of these approaches is stymied by ignorance of synthesizability. To highlight the severity of this issue, we use a data-driven computer-aided synthesis planning program to quantify how often molecules proposed by state-of-the-art generative models cannot be readily synthesized. Our analysis demonstrates that there are several tasks for which these models generate unrealistic molecular structures despite performing well on popular quantitative benchmarks. Synthetic complexity heuristics can successfully bias generation toward synthetically-tractable chemical space, although doing so necessarily detracts from the primary objective. This analysis suggests that to improve the utility of these models in real discovery workflows, new algorithm development is warranted

    Rethinking drug design in the artificial intelligence era

    Artificial intelligence (AI) tools are increasingly being applied in drug discovery. While some protagonists point to vast opportunities potentially offered by such tools, others remain sceptical, waiting for a clear impact to be shown in drug discovery projects. The reality is probably somewhere in-between these extremes, yet it is clear that AI is providing new challenges not only for the scientists involved but also for the biopharma industry and its established processes for discovering and developing new medicines. This article presents the views of a diverse group of international experts on the 'grand challenges' in small-molecule drug discovery with AI and the approaches to address them

    Computer Aided Synthesis Prediction to Enable Augmented Chemical Discovery and Chemical Space Exploration

    The drug-like chemical space is estimated to contain 10^60 molecules, and the largest generated database (GDB) obtained by the Reymond group contains 165 billion molecules with up to 17 heavy atoms. Furthermore, deep learning techniques to explore regions of chemical space are becoming more popular. However, the key to realizing the generated structures experimentally lies in chemical synthesis, the planning of which was previously limited to manual planning or slow computer-assisted synthesis planning (CASP) models. Despite the 60-year history of CASP, few synthesis planning tools have been open-sourced to the community. In this thesis I co-led the development of and investigated one of the only fully open-source synthesis planning tools, called AiZynthFinder, trained on both public and proprietary datasets consisting of up to 17.5 million reactions. This enables synthesis-guided exploration of the chemical space in a high-throughput manner, to bridge the gap between compound generation and experimental realisation. I first investigate both public and proprietary reaction data, and their influence on route-finding capability. Furthermore, I develop metrics for assessment of retrosynthetic prediction, single-step retrosynthesis models, and automated template extraction workflows. This is supplemented by a comparison of the underlying datasets and their corresponding models. Given the prevalence of ring systems in the GDB and wider medicinal chemistry domain, I developed ‘Ring Breaker’, a data-driven approach to enable the prediction of ring-forming reactions. I demonstrate its utility on frequently found and unprecedented ring systems, in agreement with literature syntheses. Additionally, I highlight its potential for incorporation into CASP tools, and outline methodological improvements that improve route-finding capability.
To tackle the challenge of model throughput, I report a machine learning (ML) based classifier called the retrosynthetic accessibility score (RAscore), to assess the likelihood of finding a synthetic route using AiZynthFinder. The RAscore computes at least 4,500 times faster than AiZynthFinder. This opens the possibility of pre-screening millions of virtual molecules from enumerated databases or generative models for synthesis-informed compound prioritization. Finally, I combine chemical library visualization with synthetic route prediction to facilitate experimental engagement with synthetic chemists. I enable the navigation of chemical property space by using interactive visualization to deliver associated synthetic data as endpoints. This aids in the prioritization of compounds. The ability to view synthetic route information alongside structural descriptors facilitates a feedback mechanism for the improvement of CASP tools and enables rapid hypothesis testing. I demonstrate the workflow as applied to the GDB databases to augment compound prioritization and synthetic route design.
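
The pre-screening idea in this abstract (score every candidate with a fast classifier, then pass only the promising ones to the slow route planner) can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration: `prescreen`, `toy_score`, and the 0.5 threshold are hypothetical and are not part of RAscore or AiZynthFinder; `toy_score` is a stand-in that only looks at string length and has no chemical meaning.

```python
from typing import Callable, Iterable, List


def prescreen(
    candidates: Iterable[str],
    fast_score: Callable[[str], float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the source
) -> List[str]:
    """Keep candidates whose fast score meets the threshold, best first.

    Only the returned molecules would be handed to the expensive
    route-finding step, which is where the time saving comes from.
    """
    scored = [(fast_score(smiles), smiles) for smiles in candidates]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [smiles for _, smiles in kept]


def toy_score(smiles: str) -> float:
    """Hypothetical stand-in scorer: shorter SMILES strings score higher.

    This is NOT the RAscore model; it only gives the sketch something to run.
    """
    return 1.0 / (1.0 + len(smiles) / 10.0)


if __name__ == "__main__":
    library = ["CCO", "c1ccccc1", "CC(C)CC1=CC=C(C=C1)C(C)C(=O)O"]
    print(prescreen(library, toy_score))
```

In a real workflow the scorer would be a trained classifier and the threshold would be chosen from validation data; the filtering and ranking logic would stay the same.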

    11th German Conference on Chemoinformatics (GCC 2015) : Fulda, Germany. 8-10 November 2015.


    Going Small: Using Biophysical Screening to Implement Fragment Based Drug Discovery

    Screening against biochemical targets with compact chemical fragments has developed a reputation as a successful early-stage drug discovery approach, thanks to recent drug approvals. Having weak initial target affinities, fragments require the use of sensitive biophysical technologies (NMR, SPR, thermal shift, ITC, and X-ray crystallography) to accommodate the practical limits of going smaller. Application of optimized fragment biophysical screening approaches now routinely allows for the rapid identification of fragments with high binding efficiencies. The aim of this chapter is to provide an introduction to fragment library selection and to discuss the suitability of screening approaches adapted for lower-throughput biophysical techniques. A general description of metrics that are being used in the progression of fragment hits, the need for orthogonal assay testing, and guidance on potential pitfalls are included to assist scientists considering initiating their own fragment discovery program.