16 research outputs found

    A treatment of stereochemistry in computer aided organic synthesis

    This thesis describes the author’s contributions to a new stereochemical processing module constructed for the ARChem retrosynthesis program. The purpose of the module is to add the ability to perform enantioselective and diastereoselective retrosynthetic disconnections and generate appropriate precursor molecules. The module uses evidence-based rules generated from a large database of literature reactions. Chapter 1 provides an introduction and critical review of the published body of work for computer-aided synthesis design. The role of computer perception of key structural features (rings, functional groups, etc.) and the construction and use of reaction transforms for generating precursors is discussed. Emphasis is also given to the application of strategies in retrosynthetic analysis. The availability of large reaction databases has enabled a new generation of retrosynthesis design programs to be developed that use automatically generated transforms assembled from published reactions. A brief description of the transform generation method employed by ARChem is given. Chapter 2 describes the algorithms devised by the author for handling the computer recognition and representation of the stereochemical features found in molecule and reaction scheme diagrams. The approach is generalised and uses flexible recognition patterns to transform information found in chemical diagrams into concise stereo descriptors for computer processing. An algorithm for efficiently comparing and classifying pairs of stereo descriptors is described. This algorithm is central for solving the stereochemical constraints in a variety of substructure matching problems addressed in chapter 3. The concise representation of reactions and transform rules as hyperstructure graphs is described. Chapter 3 is concerned with the efficient and reliable detection of stereochemical symmetry in molecules, reactions and rules. 
A novel symmetry perception algorithm, based on a constraint satisfaction problem (CSP) solver, is described. The use of a CSP solver to implement an isomorph-free matching algorithm for stereochemical substructure matching is detailed. The prime function of this algorithm is to seek out unique retron locations in target molecules and then to generate precursor molecules without duplications due to symmetry. Novel algorithms for classifying asymmetric, pseudo-asymmetric and symmetric stereocentres; meso, centro, and C2 symmetric molecules; and the stereotopicity of trigonal (sp2) centres are described. Chapter 4 introduces and formalises the annotated structural language used to create both retrosynthetic rules and the patterns used for functional group recognition. A novel functional group recognition package is described along with its use to detect important electronic features such as electron-withdrawing or donating groups and leaving groups. The functional groups and electronic features are used as constraints in retron rules to improve transform relevance. Chapter 5 details the approach taken to design detailed stereoselective and substrate controlled transforms from organised hierarchies of rules. The rules employ a rich set of constraint annotations that concisely describe the keying retrons. The application of the transforms for collating evidence-based scoring parameters from published reaction examples is described. A survey of available reaction databases is given and techniques for mining stereoselective reactions are demonstrated. A data mining tool was developed for finding the most reputable stereoselective reaction types for coding as transforms. For various reasons it was not possible during the research period to fully integrate this work with the ARChem program. Instead, Chapter 6 introduces a novel one-step retrosynthesis module to test the developed transforms. 
The retrosynthesis algorithms use the organisation of the transform rule hierarchy to efficiently locate the best retron matches using all applicable stereoselective transforms. This module was tested using a small set of selected target molecules and the generated routes were ranked using a series of measured parameters including: stereocentre clearance and bond cleavage; example reputation; estimated stereoselectivity with reliability; and evidence of tolerated functional groups. In addition a method for detecting regioselectivity issues is presented. This work presents a number of algorithms using common set and graph theory operations and notations. Appendix A lists the set theory symbols and meanings. Appendix B summarises and defines the common graph theory terminology used throughout this thesis
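
As a rough illustration of what comparing a pair of stereo descriptors can involve, the sketch below uses the standard permutation-parity test for a tetrahedral centre. It is not the algorithm from the thesis, which the abstract does not detail. It assumes that each descriptor is simply an ordered tuple of the same four neighbour labels, written under one fixed viewing convention; the function names are hypothetical.

```python
def permutation_parity(reference, other):
    """Return 0 if `other` is an even permutation of `reference`, 1 if odd."""
    if (len(reference) != len(other)
            or len(set(reference)) != len(reference)
            or set(reference) != set(other)):
        raise ValueError("descriptors must list the same distinct neighbours")
    position = {label: i for i, label in enumerate(reference)}
    perm = [position[label] for label in other]
    seen = [False] * len(perm)
    parity = 0
    for start in range(len(perm)):
        # Walk each cycle once; a cycle of length k contributes k - 1 swaps.
        length = 0
        j = start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        if length:
            parity ^= (length - 1) & 1
    return parity


def compare_tetrahedral(desc_a, desc_b):
    """Classify two neighbour orderings of one centre (assumed convention)."""
    return "same" if permutation_parity(desc_a, desc_b) == 0 else "opposite"
```

Under these assumptions, rotating three neighbours (`("a", "b", "c", "d")` against `("a", "c", "d", "b")`) leaves the configuration unchanged, while swapping two neighbours inverts it.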

    Abstracts of the XL QUITEL Congress


    Enhancing Reaction-based de novo Design using Machine Learning

    De novo design is a branch of chemoinformatics that is concerned with the rational design of molecular structures with desired properties, which specifically aims at achieving suitable pharmacological and safety profiles when applied to drug design. Scoring, construction, and search methods are the main components that are exploited by de novo design programs to explore the chemical space to encourage the cost-effective design of new chemical entities. In particular, construction methods are concerned with providing strategies for compound generation to address issues such as drug-likeness and synthetic accessibility. Reaction-based de novo design consists of combining building blocks according to transformation rules that are extracted from collections of known reactions, intending to restrict the enumerated chemical space into a manageable number of synthetically accessible structures. The reaction vector is an example of a representation that encodes topological changes occurring in reactions, which has been integrated within a structure generation algorithm to increase the chances of generating molecules that are synthesisable. The general aim of this study was to enhance reaction-based de novo design by developing machine learning approaches that exploit publicly available data on reactions. A series of algorithms for reaction standardisation, fingerprinting, and reaction vector database validation were introduced and applied to generate new data on which the entirety of this work relies. First, these collections were applied to the validation of a new ligand-based design tool. The tool was then used in a case study to design compounds which were eventually synthesised using very similar procedures to those suggested by the structure generator. A reaction classification model and a novel hierarchical labelling system were then developed to introduce the possibility of applying transformations by class. 
The model was augmented with an algorithm for confidence estimation, and was used to classify two datasets from industry and the literature. Results from the classification suggest that the model can be used effectively to gain insights on the nature of reaction collections. Classified reactions were further processed to build a reaction class recommendation model capable of suggesting appropriate reaction classes to apply to molecules according to their fingerprints. The model was validated, then integrated within the reaction vector-based design framework, which was assessed on its performance against the baseline algorithm. Results from the de novo design experiments indicate that the use of the recommendation model leads to a higher synthetic accessibility and a more efficient management of computational resources
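
The idea of a reaction vector as an encoding of structural change can be sketched in a few lines. This is a toy illustration only: it assumes that molecules are reduced to bags of feature labels, and the bond-type strings used here are hypothetical stand-ins for whatever descriptors the actual method uses. The function names are also assumptions.

```python
from collections import Counter


def reaction_vector(reactant_features, product_features):
    """Feature-count difference: positive entries are gained, negative are lost."""
    vector = Counter(product_features)
    vector.subtract(Counter(reactant_features))
    return {feature: n for feature, n in vector.items() if n != 0}


def apply_vector(molecule_features, vector):
    """Apply a vector to a starting material; return None if it cannot apply."""
    counts = Counter(molecule_features)
    for feature, change in vector.items():
        # Every feature the reaction removes must be present in the input.
        if change < 0 and counts[feature] < -change:
            return None
    for feature, change in vector.items():
        counts[feature] += change
    return {feature: n for feature, n in counts.items() if n > 0}


# Hypothetical amide formation described by bond-type labels.
reactants = ["C=O", "C-OH", "N-H", "N-H", "C-N"]
products = ["C=O", "C-N", "C-N", "N-H"]
amide_vector = reaction_vector(reactants, products)
```

Applying `amide_vector` to a new feature bag succeeds only when the features it removes are present, which is the sense in which such a vector restricts enumeration to transformations seen in known reactions.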

    Studies toward the synthesis of salvinorin A

    Salvinorin A [(2S,4aR,6aR,7R,9S,10aS,10bR)-9-(acetyloxy)-2-(3-furanyl)-dodecahydro-6a,10b-dimethyl-4,10-dioxo-2H-naphtho[2,1-c]pyran-7-carboxylic acid methyl ester] is a trans-neoclerodane diterpene from the leaves of the hallucinogenic Mexican sage Salvia divinorum and has been identified as the principal psychoactive component in this plant of traditional spiritual importance. Salvinorin A is the most potent naturally occurring hallucinogen found so far and is reported to act selectively as a κ-opioid receptor agonist. Synthetic modification of the natural product has contributed to a number of proposed pharmacophores to identify the key structural features necessary for biological activity. A direct strategy for the asymmetric synthesis of the natural product is desirable since it allows access to a more diverse range of analogues. An ambitious retrosynthetic study of salvinorin A indicated the C(3)-heterosubstituted furan as an appropriate starting material for a Diels-Alder approach towards the ketone ring of the natural product. An expedient and high yielding methodology for the preparation of 3-furylamines is described, allowing the flexible introduction of alkyl substituents in the C(5) position. Optically pure ephedrine isomers have been explored as chiral amine auxiliaries and have been successfully attached as 3-furylamine substituents using the general methodology described. The 3-furylamines are electron rich dienes that are highly reactive towards Diels-Alder cycloaddition reactions with methyl acrylate. Diastereoisomers of the 7-oxanorbornane species methyl 1-methyl-5-oxo-7-oxa-bicyclo[2.2.1]heptane-2-carboxylate were prepared as new compounds from the hydrolysis of Diels-Alder cycloadducts and are functionalised bicyclic intermediates to access the ketone of the natural product. 
Diels-Alder reactions between the non-racemic (2S)-ephedrine-derived furans and methyl acrylate gave spiro-oxazolidine adducts that underwent hydrolysis to give the desired ketone. X-ray crystallography data for the derivatised cycloadduct established diastereoselectivity in favor of the (1S,4S)-enantiomer, as desired for the asymmetric natural product synthesis. A procedure for the ether cleavage of methyl 1-methyl-5-oxo-7-oxa-bicyclo[2.2.1]heptane-2-carboxylate was required to access the convergent precursor methyl 5-acetoxy-2-methyl-4-oxocyclohex-2-enecarboxylate. Successful C-O cleavage was achieved using Lewis-acid catalysis with BBr3 followed by mixing with the hindered base 2,4,6-collidine to yield methyl 5-hydroxy-2-methyl-4-oxocyclohex-2-enecarboxylate, albeit only at high dilution. Acetylation proceeded in the same reaction vessel to give methyl 1-methyl-5-oxo-7-oxa-bicyclo[2.2.1]heptane-2-carboxylate in excellent yield. The devised synthetic pathway is shown to successfully construct the ketone ring of salvinorin A, and stereoselectivity for the (1S,4S)-enantiomer can be achieved using the ephedrine-derived furans, as desired for the asymmetric natural product synthesis. The δ-lactone ring 6-(furan-3-yl)-5,6-dihydro-4-methyl-3-vinylpyran-2-one was derived from rudimentary precursors as a convergent reagent to introduce the lactone ring of salvinorin A. A short synthesis for the racemic compound is described starting from the aldol reaction between 3-furaldehyde and acetone to give the 3-furfurol, 4-(furan-3-yl)-4-hydroxybutan-2-one, in quantitative yield. The 3-furfurol was reacted to form the α-bromovinyl ester, 1-(furan-3-yl)-3-oxobutyl 2-bromobut-3-enoate, using a deconjugation/esterification protocol with 2-bromobut-3-enoyl chloride. 
Intramolecular ring closure to the δ-lactone was achieved using a Reformatsky reaction and dehydration under acidic conditions yielded the racemic convergent precursor 6-(furan-3-yl)-5,6-dihydro-4-methyl-3-vinylpyran-2-one in high yield. A possible strategy for joining the ketone and lactone fragments for the total synthesis of salvinorin A is proposed

    Surface Water Photochemistry


    The computer storage, retrieval and searching of generic structures in chemical patents : the machine-readable representation of generic structures.

    The nature of the generic chemical structures found in patents is described, with a discussion of the types of statement commonly found in them. The available representations for such structures are reviewed, with particular note being given to the suitability of the representation for searching files of such structures. Requirements for the unambiguous representation of generic structures in an "ideal" storage and retrieval system are discussed. The basic principles of the theory of formal languages are reviewed, with particular consideration being given to parsing methods for context-free languages. The grammar and parsing of computer programming languages, as an example of artificial formal languages, is discussed. Applications of formal language theory to chemistry and information work are briefly reviewed. GENSAL, a formal language for the unambiguous description of generic structures from patents, is presented. It is designed to be intelligible to a chemist or patent agent, yet sufficiently formalised to be amenable to computer analysis. Detailed description is given of the facilities it provides for generic structure representation, and there is discussion of its limitations and the principles behind its design. A connection-table-based internal representation for generic structures, called an ECTR (Extended Connection Table Representation), is presented. It is designed to represent generic structures unambiguously, and to be generated automatically from structures encoded in GENSAL. It is compared to other proposed representations, and its implementation using data types of the programming language Pascal is described. An interpreter program which generates an ECTR from structures encoded in a subset of the GENSAL language is presented. The principles of its operation are described. 
Possible applications of GENSAL outside the area of patent documentation are discussed, and suggestions made for further work on the development of a generic structure storage and retrieval system based on GENSAL and ECTRs

    Non-covalent interactions in organotin(IV) derivatives of 5,7-ditertbutyl- and 5,7-diphenyl-1,2,4-triazolo[1,5-a]pyrimidine as recognition motifs in crystalline self-assembly and their in vitro antistaphylococcal activity

    Non-covalent interactions are known to play a key role in biological compounds due to their stabilization of the tertiary and quaternary structure of proteins [1]. Ligands similar to purine rings, such as triazolopyrimidines, are very versatile in their interactions with metals and can act as model systems for natural bio-inorganic compounds [2]. A considerable series of organotin(IV) derivatives (twelve novel compounds are reported) of 5,7-ditertbutyl-1,2,4-triazolo[1,5-a]pyrimidine (dbtp) and 5,7-diphenyl-1,2,4-triazolo[1,5-a]pyrimidine (dptp) were synthesized and investigated by FT-IR and 119Sn Mössbauer spectroscopy in the solid state and by 1H and 13C NMR spectroscopy in solution [3]. The X-ray crystal and molecular structures of Et2SnCl2(dbtp)2 and Ph2SnCl2(EtOH)2(dptp)2 were described; in the latter, the pyrimidine molecules are not directly bound to the metal center but are strictly H-bonded, through N(3), to the -OH group of the ethanol moieties. The network of hydrogen bonding and aromatic interactions involving pyrimidine and phenyl rings in both complexes drives their self-assembly. Non-covalent interactions involving aromatic rings are key processes in both chemical and biological recognition, contributing to overall complex stability and forming recognition motifs. It is noteworthy that in Ph2SnCl2(EtOH)2(dptp)2 π–π stacking interactions between pairs of antiparallel triazolopyrimidine rings mimic the base-pair interactions that occur physiologically in DNA (Fig. 1). Mössbauer spectra suggest for Et2SnCl2(dbtp)2 a distorted octahedral structure, with C-Sn-C bond angles lower than 180°. The estimated angle for Et2SnCl2(dbtp)2 is virtually identical to that determined by X-ray diffraction. Ph2SnCl2(EtOH)2(dptp)2 is characterized by an essentially linear C-Sn-C fragment according to the X-ray all-trans structure. 
The compounds were screened for their in vitro antibacterial activity on a group of reference staphylococcal strains susceptible or resistant to methicillin and against two reference Gram-negative pathogens [4]. We tested the biological activity of all the specimens against a group of staphylococcal reference strains (S. aureus ATCC 25923, S. aureus ATCC 29213, methicillin resistant S. aureus 43866 and S. epidermidis RP62A) along with Gram-negative pathogens (P. aeruginosa ATCC 9027 and E. coli ATCC 25922). Ph2SnCl2(EtOH)2(dptp)2 showed good antibacterial activity, with a MIC value of 5 μg/mL against S. aureus ATCC 29213, and was also active against methicillin resistant S. epidermidis RP62A