904 research outputs found

    A treatment of stereochemistry in computer aided organic synthesis

    This thesis describes the author’s contributions to a new stereochemical processing module constructed for the ARChem retrosynthesis program. The purpose of the module is to add the ability to perform enantioselective and diastereoselective retrosynthetic disconnections and generate appropriate precursor molecules. The module uses evidence-based rules generated from a large database of literature reactions. Chapter 1 provides an introduction and critical review of the published body of work for computer aided synthesis design. The role of computer perception of key structural features (rings, functional groups, etc.) and the construction and use of reaction transforms for generating precursors are discussed. Emphasis is also given to the application of strategies in retrosynthetic analysis. The availability of large reaction databases has enabled a new generation of retrosynthesis design programs to be developed that use automatically generated transforms assembled from published reactions. A brief description of the transform generation method employed by ARChem is given. Chapter 2 describes the algorithms devised by the author for handling the computer recognition and representation of the stereochemical features found in molecule and reaction scheme diagrams. The approach is generalised and uses flexible recognition patterns to transform information found in chemical diagrams into concise stereo descriptors for computer processing. An algorithm for efficiently comparing and classifying pairs of stereo descriptors is described. This algorithm is central for solving the stereochemical constraints in a variety of substructure matching problems addressed in Chapter 3. The concise representation of reactions and transform rules as hyperstructure graphs is described. Chapter 3 is concerned with the efficient and reliable detection of stereochemical symmetry in molecules, reactions and rules. 
A novel symmetry perception algorithm, based on a constraint satisfaction problem (CSP) solver, is described. The use of a CSP solver to implement an isomorph‐free matching algorithm for stereochemical substructure matching is detailed. The prime function of this algorithm is to seek out unique retron locations in target molecules and then to generate precursor molecules without duplications due to symmetry. Novel algorithms for classifying asymmetric, pseudo‐asymmetric and symmetric stereocentres; meso, centro, and C2 symmetric molecules; and the stereotopicity of trigonal (sp2) centres are described. Chapter 4 introduces and formalises the annotated structural language used to create both retrosynthetic rules and the patterns used for functional group recognition. A novel functional group recognition package is described along with its use to detect important electronic features such as electron‐withdrawing or donating groups and leaving groups. The functional groups and electronic features are used as constraints in retron rules to improve transform relevance. Chapter 5 details the approach taken to design detailed stereoselective and substrate controlled transforms from organised hierarchies of rules. The rules employ a rich set of constraint annotations that concisely describe the keying retrons. The application of the transforms for collating evidence-based scoring parameters from published reaction examples is described. A survey of available reaction databases and of techniques for mining stereoselective reactions is presented. A data mining tool was developed for finding the most reputable stereoselective reaction types for coding as transforms. For various reasons it was not possible during the research period to fully integrate this work with the ARChem program. Instead, Chapter 6 introduces a novel one‐step retrosynthesis module to test the developed transforms. 
The retrosynthesis algorithms use the organisation of the transform rule hierarchy to efficiently locate the best retron matches using all applicable stereoselective transforms. This module was tested using a small set of selected target molecules and the generated routes were ranked using a series of measured parameters including: stereocentre clearance and bond cleavage; example reputation; estimated stereoselectivity with reliability; and evidence of tolerated functional groups. In addition, a method for detecting regioselectivity issues is presented. This work presents a number of algorithms using common set and graph theory operations and notations. Appendix A lists the set theory symbols and meanings. Appendix B summarises and defines the common graph theory terminology used throughout this thesis.
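
The symmetry perception idea in the abstract above can be illustrated with a small sketch. This is not the thesis's algorithm: it is a plain backtracking search that treats "find every label- and bond-preserving relabelling of the atoms" as a constraint satisfaction problem, and it ignores stereochemistry entirely. The function names `automorphisms` and `symmetry_classes`, and the dictionary-based molecule encoding, are assumptions made for this illustration.

```python
def automorphisms(labels, adjacency):
    """Enumerate all atom permutations that preserve element labels and bonds.

    labels:    dict mapping atom index -> element symbol (hypothetical encoding)
    adjacency: dict mapping atom index -> set of bonded atom indices
    """
    nodes = sorted(labels)
    found = []

    def extend(mapping, used):
        if len(mapping) == len(nodes):
            found.append(dict(mapping))
            return
        node = nodes[len(mapping)]  # next unassigned variable
        for image in nodes:
            # Constraints: unused image, same element, same number of bonds.
            if image in used or labels[image] != labels[node]:
                continue
            if len(adjacency[image]) != len(adjacency[node]):
                continue
            # Constraint: bonds to already-mapped atoms must be preserved.
            if any((other in adjacency[node]) != (mapping[other] in adjacency[image])
                   for other in mapping):
                continue
            mapping[node] = image
            used.add(image)
            extend(mapping, used)
            del mapping[node]  # backtrack
            used.discard(image)

    extend({}, set())
    return found


def symmetry_classes(labels, adjacency):
    """Group atoms that some automorphism maps onto each other."""
    autos = automorphisms(labels, adjacency)
    return {node: frozenset(a[node] for a in autos) for node in labels}


# Carbon skeleton of propane: the two terminal carbons are equivalent.
propane_labels = {0: "C", 1: "C", 2: "C"}
propane_bonds = {0: {1}, 1: {0, 2}, 2: {1}}
classes = symmetry_classes(propane_labels, propane_bonds)
```

Atoms that share a class are constitutionally equivalent, which is the kind of information needed to avoid generating duplicate matches at symmetric positions. A real stereochemical treatment would add further constraints on stereo descriptors.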

    Geodesic grassfire for computing mixed-dimensional skeletons

    Skeleton descriptors are commonly used to represent, understand and process shapes. While existing methods produce skeletons at a fixed dimension, such as surface or curve skeletons for a 3D object, objects are often better described using skeleton geometry at a mixture of dimensions. In this paper we present a novel algorithm for computing mixed-dimensional skeletons. Our method is guided by a continuous analogue that extends the classical grassfire erosion. This analogue allows us to identify medial geometry at multiple dimensions, and to formulate a measure that captures how well an object part is described by medial geometry at a particular dimension. Guided by this analogue, we devise a discrete algorithm that computes a topology-preserving skeleton by iterative thinning. The algorithm is simple to implement, and produces robust skeletons that naturally capture shape components. Under Revie
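
For readers unfamiliar with the classical grassfire erosion that the abstract above extends, the following is a minimal sketch on a 2D binary grid. It is not the paper's geodesic or mixed-dimensional method and performs no thinning: it only records the step at which a fire lit on the shape boundary reaches each cell, assuming 4-connectivity. The function name `grassfire_burn_times` and the list-of-lists grid encoding are choices made for this illustration.

```python
from collections import deque


def grassfire_burn_times(grid):
    """Return the burn time of every foreground cell (0 for background).

    grid: list of equal-length rows, truthy = shape, falsy = background.
    Cells touching the background or the grid edge ignite at time 1.
    """
    rows, cols = len(grid), len(grid[0])
    times = [[0] * cols for _ in range(rows)]
    queue = deque()
    neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))

    # Ignite every foreground cell on the boundary of the shape.
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            for dr, dc in neighbours:
                nr, nc = r + dr, c + dc
                outside = not (0 <= nr < rows and 0 <= nc < cols)
                if outside or not grid[nr][nc]:
                    times[r][c] = 1
                    queue.append((r, c))
                    break

    # Breadth-first propagation: the fire advances one cell per step.
    while queue:
        r, c = queue.popleft()
        for dr, dc in neighbours:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] and times[nr][nc] == 0):
                times[nr][nc] = times[r][c] + 1
                queue.append((nr, nc))
    return times


square = [[1] * 5 for _ in range(5)]
burn = grassfire_burn_times(square)
```

Cells where fire fronts meet have locally maximal burn times, and these are the cells from which a medial skeleton is classically read off.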

    Lingo3DMol: Generation of a Pocket-based 3D Molecule using a Language Model

    Structure-based drug design powered by deep generative models has attracted increasing research interest in recent years. Language models have demonstrated a robust capacity for generating valid molecules in 2D structures, while methods based on geometric deep learning can directly produce molecules with accurate 3D coordinates. Inspired by both methods, this article proposes a pocket-based 3D molecule generation method that leverages the language model with the ability to generate 3D coordinates. High quality protein-ligand complex data are insufficient; hence, a perturbation and restoration pre-training task is designed that can utilize vast amounts of small-molecule data. A new molecular representation, a fragment-based SMILES with local and global coordinates, is also presented, enabling the language model to learn molecular topological structures and spatial position information effectively. Ultimately, the CrossDocked and DUD-E datasets are employed for evaluation and additional metrics are introduced. This method achieves state-of-the-art performance in nearly all metrics, notably in terms of binding patterns, drug-like properties, rational conformations, and inference speed. Our model is available as an online service to academic users via sw3dmg.stonewise.c

    Molecular Considerations In The Design Of Novel Alpha/Beta Hydrolase Inhibitors

    Alpha/beta hydrolases (ABHs) are a superfamily of hydrolytic enzymes that process a wide variety of substrates. A subfamily of ABHs called carboxylesterases (CEs) are important enzymes that catalyze biological detoxification, hydrolysis of certain pesticides, and metabolism of many esterified drugs. The chemotherapy drug irinotecan used for treatment of colorectal cancer is metabolized to SN-38, the active drug metabolite, by two CE isozymes, CES1 (localized in the liver) and CES2 (localized in the small intestine). CES2's ability to activate irinotecan at a faster rate than CES1 creates a localization of activated SN-38 in the gut epithelium, resulting in the dose-limiting side effect of delayed diarrhea. Development of inhibitors for the CE subfamily of ABHs could assist in ameliorating the toxic side effects associated with some esterified prodrugs such as irinotecan, and enhance the distribution of prodrugs in vivo. Hence, our research targets CES2 for inhibitor design with the goal of ameliorating the intestinal cytotoxicity associated with irinotecan chemotherapy. In this work we (i) utilized QSAR technology to design and optimize novel sulfonamide CES2 inhibitors; (ii) combined QSAR with in silico design to generate new CE inhibitor scaffolds that maintained the potency of previous CE inhibitor generations, yet had improved water solubility; and (iii) investigated the contribution of loop 7 in CEs to sensitizing the enzyme to inhibition by sulfonamides through docking analysis. Our QSAR model, developed using 57 sulfonamide analogs, identified several features of this class of CE inhibitor that confer their potency. Using a QSAR model, constructed using 4 classes of CE inhibitors (benzils, benzoins, isatins, and sulfonamides), as a pocket site to perform in silico design, we generated several new scaffolds predicted to have good solubility and potency. 
This work suggests that the inner loop 7 on CE plays a role in inhibitor selectivity, and interactions with this loop should be considered in the development of selective CE inhibitors. The contributions from this work will be applicable to the design of novel ABH inhibitors, help to increase the likelihood of these drugs entering clinical use, and ameliorate the dose-limiting side effect associated with irinotecan.

    Semi-immersive space mission design and visualization: case study of the "terrestrial planet finder" mission

    The paper addresses visualization issues of the Terrestrial Planet Finder Mission (C.A. Beichman et al., 1999). The goal of this mission is to search for chemical signatures of life in distant solar systems using five satellites flying in formation to simulate a large telescope. To design and visually verify such a delicate mission, one has to analyze and interact with many different 3D spacecraft trajectories, which is often difficult in 2D. We employ a novel trajectory design approach using invariant manifold theory, which is best understood and utilized in an immersive setting. The visualization also addresses multi-scale issues related to the vast differences in distance, velocity, and time at different phases of the mission. Additionally, the parameterization and coordinate frames used for numerical simulations may not be suitable for direct visualization. Relative motion presents a more serious problem where the patterns of the trajectories can only be viewed in particular rotating frames. Some of these problems are greatly relieved by using interactive, animated stereo 3D visualization in a semi-immersive environment such as a Responsive Workbench. Others were solved using standard techniques, such as a stratified approach with multiple windows to address the multi-scale issues, re-parameterizations of trajectories and associated 2D manifolds, and relative motion of the camera to "evoke" the desired patterns.

    SARs for the Antiparasitic Plant metabolite Pulchrol

    Pulchrol, a natural compound isolated from the roots of the plant species Bourreria pulchra, has been shown to possess potential antiparasitic activity toward trypanosomatids, particularly against Trypanosoma cruzi, which causes Chagas disease, and moderately against Leishmania species, responsible for leishmaniasis. In this investigation, several pulchrol analogues were prepared and assayed against T. cruzi epimastigotes and L. braziliensis and L. amazonensis promastigotes to develop structure-activity relationships (SARs). Analogues with transformations in the three rings of the pulchrol scaffold were prepared. Initially, compounds with transformations at the benzylic position in the A-ring were assayed to evaluate the role of the benzylic alcohol in pulchrol. The results showed that a hydrogen bond acceptor group is important for the antitrypanosomatid activity and that ester groups with bulky alkyl substituents increase the potency toward all parasites. Analogues with transformations in the B- and C-rings focused on the variation of lipophilicity. In the B-ring, the methyl substituents placed at position 6 in pulchrol were replaced with two hydrogen atoms, just one methyl substituent, or two longer alkyl substituents. The biological activity results showed that longer chains with fewer than four carbon atoms are beneficial for the activity. A methoxy substituent is placed at position 2 in pulchrol’s C-ring; in this study, analogues with the methoxy substituent placed in different positions or replaced with alkyl substituents were prepared, and the results showed that compounds with hydrophobic groups in the C-ring increased the potency. Several analogues with more than one modification in different rings were also prepared. The combination of carbonyl groups in the A-ring with bulky alkyl groups in the C-ring was the most beneficial for the activity. 
In contrast, esters substituted with a hydrophobic group in the A-ring and bulky alkyl groups in the C-ring hampered the activity. A hydrogen bond acceptor at the benzylic position in the A-ring, as well as an additional hydroxyl group at position 1 in the C-ring (as in cannabinol), appeared to be important for the activity. The combination of different functionalities also seemed to have an effect on the orientation of the molecule inside the target protein. Our results showed that differences between the active sites for the different parasites may exist; however, preliminary pharmacophore hypotheses based on our biological results showed that the main pharmacophoric features are two hydrogen bond acceptor groups (one at the benzylic position and one on the B-ring’s oxygen) and three hydrophobic features (two in the B-ring at position 6, and one in the C-ring at position 2 or 3). A qualitative evaluation of ADMET descriptors calculated in silico showed that most of the molecules have potential as orally administered substances; however, further studies focused on the development of more potent compounds and on the optimization of the ADME characteristics are recommended.

    Discovery and development of novel inhibitors for the kinase Pim-1 and G-Protein Coupled Receptor Smoothened

    Investigation of the cause of disease is no easy business. This is particularly so when one reflects upon the lessons taught us in antiquity. Prior to the beginning of the last century, diagnosis and treatment of diseases such as cancers was so bereft of hope that there was little physicians could offer in the way of comfort, let alone treatment. One of the major insights from investigations into cancers this century has been that those involved in research leading to treatments are not dealing with a singular malady but multiple families of diseases with different mechanisms and modes of action. Therefore, despite the end game being similar in cancers, that of uncontrolled growth and replication leading to cellular dysfunction, different diseases require different approaches in targeting them. This leads us to a particular broad treatment approach, that of drug design. A drug is, in the classical sense, a small molecule that, upon introduction into the body, interacts with biochemical targets to induce a wider biological effect, ideally with both an intended target and intended effect. The conceptual basis underpinning this 'lock-and-key' paradigm was elucidated over a century ago and the primary occupation of those involved in biochemical research has been to determine as much information as possible about both of these protein locks and drug keys. And, as implied by the paradigm, molecular shape is all-important in determining and controlling action against the most important locks with the most potent and specific keys. The two most important target classes in drug discovery for some time have been protein kinases and G Protein-Coupled Receptors (GPCRs). Both classes of proteins are large families that perform very different tasks within the body. Kinases activate and inactivate many cellular processes by catalysing the transfer of a phosphate group from Adenosine Tri-Phosphate (ATP) to other targets. 
GPCRs perform the job of interacting with chemical signals and communicating them into a biological response. Dysfunction in both types of proteins in certain cells can lead to a loss of biological control and, ultimately, a cancer. Kinases and GPCRs have entirely different chemical structures, so structural knowledge becomes crucial in any approach targeting cells where dysfunction has occurred. Thus, for this thesis, a member from each class was investigated using a combination of structural approaches: from the kinase class, the kinase Proviral Integration site for MuLV (Pim-1), and from the GPCR class, the cell membrane-bound Smoothened receptor (SMO). The kinase Pim-1 was the target of various approaches in Chapter 3. Although it has been a heavily studied target since the mid-2000s, there is a paucity of inhibitors targeting residues more remote from the structural characteristics that define kinases. Further limiting extension possibilities is that Pim-1 is constitutively active, so no inhibitors targeting an inactive state are possible. An initial project (\pone) used the known binding properties of small molecules, or 'fragments', to elucidate structural and dynamic information useful for targeting Pim-1. This was followed by three projects, all with the goal of inhibitor discovery, all with different foci. In \ptwo, fragment binding modes from \pone\ provided the basis for the extension and development of drug-like inhibitors with a focus on synthetic feasibility. In contrast, inhibitors were found in \pthree\ via a large-scale public dataset of purchasable molecules that possess drug-like properties. Finally, \pfour\ took the truncated form of a particularly attractive fragment from \pone\ that was crystallised with Pim-1, verified its binding mode and then generated extensions with, again, a focus on synthetic feasibility. The GPCR SMO has fewer molecular studies and much about its structural behaviour remains unknown. 
As the most 'druggable' protein in the Hedgehog pathway, structural studies have primarily focussed on stabilising its inactive state to prevent signal transduction. Allied to this is that there are generally few inhibitors for SMO and the drugs for cancers related to its dysfunction are vulnerable to mutations that significantly reduce their effectiveness or abrogate it entirely. The elucidation of structural information is therefore of high priority. An initial study attempting to identify an unknown molecule from prior experiments led to insights regarding binding characteristics of specific moieties. This was particularly important to understand not just where favourable moieties bind but also sections of the SMO binding pocket with unfavourable binding. In both subsequent virtual screens performed in Chapter 4, the primary aim was to find new drug-like inhibitors of SMO using large public datasets of commercially-available molecules. The initial screen retrieved relatively few inhibitors, so the binding pocket was modified to find a structural state more amenable to small molecule binding. These modifications led to a significant number of new, chemically novel inhibitors for SMO, some structural information useful for future inhibitors and the elucidation of structure-activity relationships useful for inhibitor design. This underpins the idea that structural information is of critical importance in the discovery and design of molecular inhibitors.

    DFFR: A New Method for High-Throughput Recalibration of Automatic Force-Fields for Drugs

    We present drug force-field recalibration (DFFR), a new method for refining the automatic force-fields used to represent small drugs in docking and molecular dynamics simulations. The method is based on fine-tuning of torsional terms to obtain ensembles that reproduce observables derived from reference data. DFFR is fast and flexible and can be easily automated for a high-throughput regime, making it useful in drug-design projects. We tested the performance of the method in a few model systems and also in a variety of druglike molecules using reference data derived from: (i) density functional theory coupled to a self-consistent reaction field (DFT/SCRF) calculations on highly populated conformers and (ii) enhanced sampling quantum mechanical/molecular mechanics (QM/MM) where the drug is reproduced at the QM level, while the solvent is represented by classical force-fields. Extension of the method to include other sources of reference data is discussed.