A treatment of stereochemistry in computer aided organic synthesis
This thesis describes the author's contributions to a new stereochemical processing module constructed for the ARChem retrosynthesis program. The purpose of the module is to add the ability to perform enantioselective and diastereoselective retrosynthetic disconnections and generate appropriate precursor molecules. The module uses evidence-based rules generated from a large database of literature reactions.
Chapter 1 provides an introduction and critical review of the published body of work for computer aided synthesis design. The role of computer perception of key structural features (rings, functional groups, etc.) and the construction and use of reaction transforms for generating precursors are discussed. Emphasis is also given to the application of strategies in retrosynthetic analysis. The availability of large reaction databases has enabled a new generation of retrosynthesis design programs to be developed that use automatically generated transforms assembled from published reactions. A brief description of the transform generation method employed by ARChem is given.
Chapter 2 describes the algorithms devised by the author for handling the computer recognition and representation of the stereochemical features found in molecule and reaction scheme diagrams. The approach is generalised and uses flexible recognition patterns to transform information found in chemical diagrams into concise stereo descriptors for computer processing. An algorithm for efficiently comparing and classifying pairs of stereo descriptors is described. This algorithm is central to solving the stereochemical constraints in a variety of substructure matching problems addressed in Chapter 3. The concise representation of reactions and transform rules as hyperstructure graphs is described.
Chapter 3 is concerned with the efficient and reliable detection of stereochemical symmetry in molecules, reactions and rules. A novel symmetry perception algorithm, based on a constraint satisfaction problem (CSP) solver, is described. The use of a CSP solver to implement an isomorph-free matching algorithm for stereochemical substructure matching is detailed. The prime function of this algorithm is to seek out unique retron locations in target molecules and then to generate precursor molecules without duplications due to symmetry. Novel algorithms for classifying asymmetric, pseudo-asymmetric and symmetric stereocentres; meso, centro, and C2 symmetric molecules; and the stereotopicity of trigonal (sp2) centres are described.
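The idea of perceiving symmetry by constraint solving can be illustrated with a minimal sketch. It is not the thesis algorithm: it ignores stereochemistry and bond orders entirely, and the function names `automorphisms` and `symmetry_classes` are hypothetical. Each atom is treated as a variable whose value is the atom it maps to, with element and degree as domain constraints and adjacency preservation as the pairwise constraint, solved by plain backtracking.

```python
def automorphisms(atoms, bonds):
    """Enumerate element-preserving automorphisms of a small molecular graph.

    atoms: list of element symbols, indexed by atom number
    bonds: iterable of (i, j) index pairs (bond order is ignored here)
    """
    n = len(atoms)
    adj = [set() for _ in range(n)]
    for i, j in bonds:
        adj[i].add(j)
        adj[j].add(i)

    results = []

    def extend(mapping, used):
        i = len(mapping)  # next atom (variable) to assign
        if i == n:
            results.append(tuple(mapping))
            return
        for cand in range(n):
            if cand in used:
                continue
            # Domain constraints: same element and same number of neighbours.
            if atoms[cand] != atoms[i] or len(adj[cand]) != len(adj[i]):
                continue
            # Pairwise constraints: adjacency to every earlier atom is preserved.
            if all((j in adj[i]) == (mapping[j] in adj[cand]) for j in range(i)):
                mapping.append(cand)
                used.add(cand)
                extend(mapping, used)
                mapping.pop()
                used.remove(cand)

    extend([], set())
    return results


def symmetry_classes(atoms, bonds):
    """Group atoms into orbits: atoms interchanged by some automorphism."""
    autos = automorphisms(atoms, bonds)
    orbits = {frozenset(a[i] for a in autos) for i in range(len(atoms))}
    return sorted(orbits, key=min)


# Heavy-atom skeleton of propane: the two terminal carbons are equivalent.
propane = (["C", "C", "C"], [(0, 1), (1, 2)])
print(symmetry_classes(*propane))
```

Knowing the orbits is what allows a matcher to report one retron location per symmetry class rather than one per equivalent atom; a practical implementation would add stereo descriptors and stronger pruning.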
Chapter 4 introduces and formalises the annotated structural language used to create both retrosynthetic rules and the patterns used for functional group recognition. A novel functional group recognition package is described along with its use to detect important electronic features such as electron-withdrawing or electron-donating groups and leaving groups. The functional groups and electronic features are used as constraints in retron rules to improve transform relevance.
Chapter 5 details the approach taken to design detailed stereoselective and substrate-controlled transforms from organised hierarchies of rules. The rules employ a rich set of constraint annotations that concisely describe the keying retrons. The application of the transforms for collating evidence-based scoring parameters from published reaction examples is described. A survey of available reaction databases is given and techniques for mining stereoselective reactions are demonstrated. A data mining tool was developed for finding the most reputable stereoselective reaction types for coding as transforms.
For various reasons it was not possible during the research period to fully integrate this work with the ARChem program. Instead, Chapter 6 introduces a novel one-step retrosynthesis module to test the developed transforms. The retrosynthesis algorithms use the organisation of the transform rule hierarchy to efficiently locate the best retron matches using all applicable stereoselective transforms. This module was tested using a small set of selected target molecules and the generated routes were ranked using a series of measured parameters including: stereocentre clearance and bond cleavage; example reputation; estimated stereoselectivity with reliability; and evidence of tolerated functional groups. In addition, a method for detecting regioselectivity issues is presented.
This work presents a number of algorithms using common set and graph theory operations and notations. Appendix A lists the set theory symbols and meanings. Appendix B summarises and defines the common graph theory terminology used throughout this thesis.
Abstracts of the XL QUITEL Congress
Enhancing Reaction-based de novo Design using Machine Learning
De novo design is a branch of chemoinformatics that is concerned with the rational design of molecular structures with desired properties, which specifically aims at achieving suitable pharmacological and safety profiles when applied to drug design. Scoring, construction, and search methods are the main components that are exploited by de novo design programs to explore the chemical space to encourage the cost-effective design of new chemical entities. In particular, construction methods are concerned with providing strategies for compound generation to address issues such as drug-likeness and synthetic accessibility.
Reaction-based de novo design consists of combining building blocks according to transformation rules that are extracted from collections of known reactions, intending to restrict the enumerated chemical space into a manageable number of synthetically accessible structures. The reaction vector is an example of a representation that encodes topological changes occurring in reactions, which has been integrated within a structure generation algorithm to increase the chances of generating molecules that are synthesisable.
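A toy sketch may help make the reaction vector idea concrete. It assumes the vector is the difference in descriptor counts between products and reactants, and it uses only bonded heavy-atom pairs as descriptors, which is a simplification chosen for illustration and not the descriptor set used in this work. The helper names `descriptor_counts` and `reaction_vector` and the molecule encoding are hypothetical.

```python
from collections import Counter

BOND_SYMBOLS = "-=#"  # single, double, triple


def descriptor_counts(atoms, bonds):
    """Count bonded atom-pair descriptors such as 'C-N' or 'C=O'.

    atoms: list of element symbols
    bonds: list of (i, j, order) triples with order in {1, 2, 3}
    """
    counts = Counter()
    for i, j, order in bonds:
        a, b = sorted((atoms[i], atoms[j]))
        counts[f"{a}{BOND_SYMBOLS[order - 1]}{b}"] += 1
    return counts


def reaction_vector(reactants, products):
    """Descriptor counts of the products minus those of the reactants."""
    before, after = Counter(), Counter()
    for atoms, bonds in reactants:
        before.update(descriptor_counts(atoms, bonds))
    for atoms, bonds in products:
        after.update(descriptor_counts(atoms, bonds))
    keys = set(before) | set(after)
    return {k: after[k] - before[k] for k in keys if after[k] != before[k]}


# Heavy-atom graphs for: acetic acid + methylamine -> N-methylacetamide + water
acetic_acid = (["C", "C", "O", "O"], [(0, 1, 1), (1, 2, 2), (1, 3, 1)])
methylamine = (["C", "N"], [(0, 1, 1)])
amide = (["C", "C", "O", "N", "C"], [(0, 1, 1), (1, 2, 2), (1, 3, 1), (3, 4, 1)])
water = (["O"], [])

print(reaction_vector([acetic_acid, methylamine], [amide, water]))
```

Here one C-O bond is lost and one C-N bond is gained, so the vector records only what the reaction changes. Descriptors this coarse cannot distinguish many transformations (an esterification, for example, yields an empty vector), which is why a practical encoding needs richer descriptors.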
The general aim of this study was to enhance reaction-based de novo design by developing machine learning approaches that exploit publicly available data on reactions. A series of algorithms for reaction standardisation, fingerprinting, and reaction vector database validation were introduced and applied to generate new data on which the entirety of this work relies. First, these collections were applied to the validation of a new ligand-based design tool. The tool was then used in a case study to design compounds which were eventually synthesised using very similar procedures to those suggested by the structure generator.
A reaction classification model and a novel hierarchical labelling system were then developed to introduce the possibility of applying transformations by class. The model was augmented with an algorithm for confidence estimation, and was used to classify two datasets from industry and the literature. Results from the classification suggest that the model can be used effectively to gain insights on the nature of reaction collections.
Classified reactions were further processed to build a reaction class recommendation model capable of suggesting appropriate reaction classes to apply to molecules according to their fingerprints. The model was validated, then integrated within the reaction vector-based design framework, which was assessed on its performance against the baseline algorithm. Results from the de novo design experiments indicate that the use of the recommendation model leads to a higher synthetic accessibility and a more efficient management of computational resources.
Studies toward the synthesis of salvinorin A
Salvinorin A [(2S,4aR,6aR,7R,9S,10aS,10bR)-9-(acetyloxy)-2-(3-furanyl)-dodecahydro-6a,10b-dimethyl-4,10-dioxo-2H-naphtho[2,1-c]pyran-7-carboxylic acid methyl ester] is a trans-neoclerodane diterpene from the leaves of the hallucinogenic Mexican sage Salvia divinorum and has been identified as the principal psychoactive component in this plant of traditional spiritual importance. Salvinorin A is the most potent naturally occurring hallucinogen found so far and is reported to act selectively as a κ-opioid receptor agonist. Synthetic modification of the natural product has contributed to a number of proposed pharmacophores to identify the key structural features necessary for biological activity. A direct strategy for the asymmetric synthesis of the natural product is desirable since it allows access to a more diverse range of analogues. An ambitious retrosynthetic study of salvinorin A indicated the C(3)-heterosubstituted furan as an appropriate starting material for a Diels-Alder approach towards the ketone ring of the natural product. An expedient and high-yielding methodology for the preparation of 3-furylamines is described, allowing the flexible introduction of alkyl substituents in the C(5) position. Optically pure ephedrine isomers have been explored as chiral amine auxiliaries and have been successfully attached as 3-furylamine substituents using the general methodology described. The 3-furylamines are electron-rich dienes that are highly reactive towards Diels-Alder cycloaddition reactions with methyl acrylate. Diastereoisomers of the 7-oxanorbornane species methyl 1-methyl-5-oxo-7-oxa-bicyclo[2.2.1]heptane-2-carboxylate were prepared as new compounds from the hydrolysis of Diels-Alder cycloadducts and are functionalised bicyclic intermediates to access the ketone of the natural product.
Diels-Alder reactions between the non-racemic (2S)-ephedrine-derived furans and methyl acrylate gave spiro-oxazolidine adducts that underwent hydrolysis to give the desired ketone. X-ray crystallography data for the derivatised cycloadduct established diastereoselectivity in favor of the (1S,4S)-enantiomer, as desired for the asymmetric natural product synthesis. A procedure for the ether cleavage of methyl 1-methyl-5-oxo-7-oxa-bicyclo[2.2.1]heptane-2-carboxylate was required to access the convergent precursor methyl 5-acetoxy-2-methyl-4-oxocyclohex-2-enecarboxylate. Successful C-O cleavage was achieved using Lewis-acid catalysis with BBr3 followed by mixing with the hindered base 2,4,6-collidine to yield methyl 5-hydroxy-2-methyl-4-oxocyclohex-2-enecarboxylate, albeit only at high dilution. Acetylation proceeded in the same reaction vessel to give methyl 5-acetoxy-2-methyl-4-oxocyclohex-2-enecarboxylate in excellent yield. The devised synthetic pathway is shown to successfully construct the ketone ring of salvinorin A, and stereoselectivity for the (1S,4S)-enantiomer can be achieved using the ephedrine-derived furans as desired for the asymmetric natural product synthesis. The δ-lactone ring 6-(furan-3-yl)-5,6-dihydro-4-methyl-3-vinylpyran-2-one was derived from rudimentary precursors as a convergent reagent to introduce the lactone ring of salvinorin A. A short synthesis for the racemic compound is described starting from the aldol reaction between 3-furaldehyde and acetone to give the 3-furfurol, 4-(furan-3-yl)-4-hydroxybutan-2-one in quantitative yield. The 3-furfurol was reacted to form the α-bromovinyl ester, 1-(furan-3-yl)-3-oxobutyl 2-bromobut-3-enoate using a deconjugation/esterification protocol with 2-bromobut-3-enoyl chloride.
Intramolecular ring closure to the δ-lactone was achieved using a Reformatsky reaction, and dehydration under acidic conditions yielded the racemic convergent precursor 6-(furan-3-yl)-5,6-dihydro-4-methyl-3-vinylpyran-2-one in high yield. A possible strategy for joining the ketone and lactone fragments for the total synthesis of salvinorin A is proposed.
Chemical Information Bulletin
Created as a supplement for "the regular journals of the American Chemical Society," this publication contains annotated bibliographies of chemical documentation literature as well as information about meetings, conferences, awards, scholarships, and other news from the American Chemical Society (ACS) Division of Chemical Information (CINF).
The computer storage, retrieval and searching of generic structures in chemical patents : the machine-readable representation of generic structures.
The nature of the generic chemical structures found in patents is
described, with a discussion of the types of statement commonly
found in them. The available representations for such structures
are reviewed, with particular note being given to the suitability
of the representation for searching files of such structures.
Requirements for the unambiguous representation of generic
structures in an "ideal" storage and retrieval system are
discussed.
The basic principles of the theory of formal languages are
reviewed, with particular consideration being given to parsing
methods for context-free languages. The grammar and parsing of
computer programming languages, as an example of artificial
formal languages, are discussed. Applications of formal language
theory to chemistry and information work are briefly reviewed.
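As an illustration of parsing a context-free language, the following is a minimal recursive-descent sketch. The grammar is a hypothetical toy invented for this example and is not GENSAL syntax: a statement assigns a name to a list of alternatives separated by `/`, and alternatives may be nested in parentheses. All function and class names are assumptions.

```python
import re

# Toy grammar (hypothetical, not GENSAL):
#   statement := NAME "=" alts
#   alts      := alt ("/" alt)*
#   alt       := NAME | "(" alts ")"
TOKEN = re.compile(r"\s*([A-Za-z][A-Za-z0-9]*|[=/()])")
PUNCTUATION = {"=", "/", "(", ")"}


def tokenize(text):
    """Split the input into names and punctuation, rejecting anything else."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """One method per grammar rule; each consumes the tokens it recognises."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, symbol):
        if self.peek() != symbol:
            raise SyntaxError(f"expected {symbol!r}, got {self.peek()!r}")
        self.pos += 1

    def name(self):
        tok = self.peek()
        if tok is None or tok in PUNCTUATION:
            raise SyntaxError(f"expected a name, got {tok!r}")
        self.pos += 1
        return tok

    def statement(self):
        target = self.name()
        self.expect("=")
        alternatives = self.alts()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected trailing token {self.peek()!r}")
        return target, alternatives

    def alts(self):
        items = [self.alt()]
        while self.peek() == "/":
            self.pos += 1
            items.append(self.alt())
        return items

    def alt(self):
        if self.peek() == "(":
            self.pos += 1
            inner = self.alts()  # recursion handles arbitrary nesting
            self.expect(")")
            return inner
        return self.name()


def parse(text):
    return Parser(tokenize(text)).statement()


print(parse("R1 = methyl / (ethyl / propyl) / phenyl"))
```

The recursive call in `alt` is what makes the language context-free and not regular: balanced parentheses of arbitrary depth cannot be matched by a finite-state scanner alone.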
GENSAL, a formal language for the unambiguous description of
generic structures from patents, is presented. It is designed to
be intelligible to a chemist or patent agent, yet sufficiently
formalised to be amenable to computer analysis. Detailed
description is given of the facilities it provides for generic
structure representation, and there is discussion of its
limitations and the principles behind its design.
A connection-table-based internal representation for generic
structures, called an ECTR (Extended Connection Table
Representation), is presented. It is designed to represent generic
structures unambiguously, and to be generated automatically from
structures encoded in GENSAL. It is compared to other proposed
representations, and its implementation using data types of the
programming language Pascal is described.
An interpreter program which generates an ECTR from structures
encoded in a subset of the GENSAL language is presented. The
principles of its operation are described.
Possible applications of GENSAL outside the area of patent
documentation are discussed, and suggestions are made for further
work on the development of a generic structure storage and
retrieval system based on GENSAL and ECTRs.
Non-covalent interactions in organotin(IV) derivatives of 5,7-ditertbutyl- and 5,7-diphenyl-1,2,4-triazolo[1,5-a]pyrimidine as recognition motifs in crystalline self-assembly and their in vitro antistaphylococcal activity
Non-covalent interactions are known to play a key role in biological compounds due to their
stabilization of the tertiary and quaternary structure of proteins [1]. Ligands similar to purine rings,
such as triazolopyrimidines, are very versatile in their interactions with metals and can act as
model systems for natural bio-inorganic compounds [2]. A considerable series (twelve novel
compounds are reported) of organotin(IV) derivatives of 5,7-ditertbutyl-1,2,4-triazolo[1,5-a]pyrimidine (dbtp) and 5,7-diphenyl-
1,2,4-triazolo[1,5-a]pyrimidine (dptp) was synthesized and investigated by FT-IR and 119Sn
Mössbauer spectroscopy in the solid state and by 1H and 13C NMR spectroscopy in solution [3]. The X-ray
crystal and molecular structures of Et2SnCl2(dbtp)2 and Ph2SnCl2(EtOH)2(dptp)2 were described; in
the latter, the pyrimidine molecules are not directly bound to the metal center but are strictly H-bonded,
through N(3), to the -OH group of the ethanol moieties. The network of hydrogen bonding and
aromatic interactions involving pyrimidine and phenyl
rings in both complexes drives their self-assembly. Non-covalent
interactions involving aromatic rings are key
processes in both chemical and biological recognition,
contributing to overall complex stability and forming
recognition motifs. It is noteworthy that in
Ph2SnCl2(EtOH)2(dptp)2 π–π stacking interactions between
pairs of antiparallel triazolopyrimidine rings mimic base-pair
interactions physiologically occurring in DNA (Fig. 1).
Mössbauer spectra suggest for Et2SnCl2(dbtp)2 a
distorted octahedral structure, with C-Sn-C bond angles
lower than 180°. The estimated angle for Et2SnCl2(dbtp)2
is virtually identical to that determined by X-ray diffraction. Ph2SnCl2(EtOH)2(dptp)2 is
characterized by an essentially linear C-Sn-C fragment according to the X-ray all-trans structure.
The compounds were screened for their in vitro antibacterial activity on a group of reference
staphylococcal strains susceptible or resistant to methicillin and against two reference Gram-negative
pathogens [4]. We tested the biological activity of all the specimens against a group of
staphylococcal reference strains (S. aureus ATCC 25923, S. aureus ATCC 29213, methicillin
resistant S. aureus 43866 and S. epidermidis RP62A) along with Gram-negative pathogens (P.
aeruginosa ATCC 9027 and E. coli ATCC 25922). Ph2SnCl2(EtOH)2(dptp)2 showed good
antibacterial activity with a MIC value of 5 μg mL-1 against S. aureus ATCC 29213 and was also
active against methicillin resistant S. epidermidis RP62A.