
    Stereo-Aware Extension of HOSE Codes

    The file attached to this record is the author's final peer-reviewed version. The publisher's final version can be found by following the DOI link. Descriptions of molecular environments have many applications in chemoinformatics, including chemical shift prediction. Hierarchically ordered spherical environment (HOSE) codes are the most popular such descriptions. We developed a method to extend these with stereochemistry information. It makes it possible to distinguish atoms that would be considered identical in traditional HOSE codes. The use of our method is demonstrated by chemical shift predictions for molecules in the nmrshiftdb2 database. We give a full specification and an implementation.
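
As an illustration of the idea behind spherical environments, the sketch below groups the atoms of a molecular graph by bond distance from a chosen root atom. It is a simplified illustration only, not the HOSE code specification and not the authors' implementation: the function name `spheres`, the adjacency-dictionary input, and the sorted-symbol output are assumptions made for this example.

```python
from collections import deque


def spheres(adjacency, symbols, root, max_radius):
    """Return one string of sorted element symbols per sphere around `root`.

    Hypothetical helper: `adjacency` maps atom index -> list of neighbour
    indices, `symbols` maps atom index -> element symbol. Only connectivity
    is used; bond orders, ring closures and stereochemistry are ignored.
    """
    distance = {root: 0}
    layers = [[] for _ in range(max_radius + 1)]
    layers[0].append(symbols[root])
    queue = deque([root])
    while queue:
        atom = queue.popleft()
        if distance[atom] == max_radius:
            continue  # do not expand beyond the outermost sphere
        for neighbour in adjacency[atom]:
            if neighbour not in distance:
                distance[neighbour] = distance[atom] + 1
                layers[distance[neighbour]].append(symbols[neighbour])
                queue.append(neighbour)
    return ["".join(sorted(layer)) for layer in layers]


# Heavy atoms of ethanol, numbered C0-C1-O2 (hydrogens omitted).
adjacency = {0: [1], 1: [0, 2], 2: [1]}
symbols = {0: "C", 1: "C", 2: "O"}
print(spheres(adjacency, symbols, 0, 2))  # ['C', 'C', 'O']
```

Because this sketch looks only at connectivity, two atoms that differ solely in their spatial arrangement receive the same result, which is the limitation a stereo-aware extension is meant to address.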

    Automated Identification and Classification of Stereochemistry: Chirality and Double Bond Stereoisomerism

    Stereoisomers have the same molecular formula and the same atom connectivity, and their existence can be related to the presence of different three-dimensional arrangements. Stereoisomerism is of great importance in many different fields, since the molecular properties and biological effects of stereoisomers are often significantly different. Drugs, for example, are often composed of a single stereoisomer of a compound; while one stereoisomer may have therapeutic effects on the body, another may be toxic. A challenging task is the automatic detection of stereoisomers from line input specifications such as SMILES or InChI, since it requires information about group theory (to distinguish stereoisomers using mathematical information about their symmetry) and about the topology and geometry of the molecule. Several software packages include modules to handle stereochemistry, especially those that name a chemical structure and/or view, edit and generate chemical structure diagrams. However, there is a lack of software capable of automatically analyzing a molecule represented as a graph and generating a classification of the type of isomerism present at a given atom or bond. Considering the importance of stereoisomerism when comparing chemical structures, this report describes a computer program that analyzes and processes the steric information contained in a chemical structure represented as a molecular graph and provides as output a binary classification of the isomer type based on the recommended conventions. Due to the complexity of the underlying issue, specification of stereochemical information is currently limited to explicit stereochemistry and to the two most common types of stereochemistry caused by asymmetry around carbon atoms: chiral atom and double bond. A Webtool to automatically identify and classify stereochemistry is available at http://nams.lasige.di.fc.ul.pt/tools.ph
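
To make the double-bond case concrete, the sketch below classifies a double bond as E or Z from 2D coordinates. It is not the program described in the abstract. It assumes that the higher-priority substituent on each end of the bond has already been selected (for example by the Cahn-Ingold-Prelog rules), and the names `side` and `classify_double_bond` are hypothetical.

```python
def side(a, b, p):
    """Sign of the 2D cross product (b - a) x (p - a).

    Returns +1 if p lies to the left of the line a -> b, -1 if it lies to
    the right, and 0 if p is on the line.
    """
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return (cross > 0) - (cross < 0)


def classify_double_bond(a, b, sub_a, sub_b):
    """Classify the bond a=b as 'Z', 'E' or 'undefined'.

    Assumption: sub_a and sub_b are the coordinates of the higher-priority
    substituents on atoms a and b; priority assignment is not done here.
    """
    s_a = side(a, b, sub_a)
    s_b = side(a, b, sub_b)
    if s_a == 0 or s_b == 0:
        return "undefined"  # a substituent on the bond axis carries no side
    return "Z" if s_a == s_b else "E"


# Both substituents above the bond axis: same side.
print(classify_double_bond((0, 0), (1, 0), (-0.5, 0.8), (1.5, 0.8)))   # Z
# Substituents on opposite sides of the bond axis.
print(classify_double_bond((0, 0), (1, 0), (-0.5, 0.8), (1.5, -0.8)))  # E
```

The two-way outcome for a defined bond mirrors the binary classification mentioned in the abstract; handling implicit stereochemistry or missing coordinates would need additional logic.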

    Fundamental Molecules of Life are Pigments which Arose and Evolved to Dissipate the Solar Spectrum

    The driving force behind the origin and evolution of life has been the thermodynamic imperative of increasing the entropy production of the biosphere through increasing the global solar photon dissipation rate. In the upper atmosphere of today, oxygen and ozone derived from life processes are performing the short wavelength UVC and UVB dissipation. On Earth's surface, water and organic pigments in water facilitate the near UV and visible photon dissipation. The first organic pigments probably formed, absorbed, and dissipated at those photochemically active wavelengths in the UVC that could have reached Earth's surface during the Archean. Proliferation of these pigments can be understood as an autocatalytic photochemical process obeying non-equilibrium thermodynamic directives related to increasing solar photon dissipation rate. Under these directives, organic pigments would have evolved over time to increase the global photon dissipation rate by: 1) increasing the ratio of their effective photon cross sections to their physical size, 2) decreasing their electronic excited state lifetimes, 3) quenching radiative de-excitation channels (e.g. fluorescence), 4) covering ever more completely the solar spectrum, and 5) dispersing into an ever greater surface area of Earth. From knowledge of the evolution of the spectrum of G-type stars, and considering the most probable history of the transparency of Earth's atmosphere, we construct the most probable surface solar spectrum as a function of time and compare this with the history of molecular absorption maxima obtained from the available data in the literature. This comparison supports the thermodynamic dissipation theory for the origin of life, constrains models for Earth's early atmosphere, and sheds some new light on the origin of photosynthesis. Comment: 43 pages, 3 figures

    A treatment of stereochemistry in computer aided organic synthesis

    This thesis describes the author's contributions to a new stereochemical processing module constructed for the ARChem retrosynthesis program. The purpose of the module is to add the ability to perform enantioselective and diastereoselective retrosynthetic disconnections and generate appropriate precursor molecules. The module uses evidence-based rules generated from a large database of literature reactions. Chapter 1 provides an introduction and critical review of the published body of work for computer aided synthesis design. The role of computer perception of key structural features (rings, functional groups, etc.) and the construction and use of reaction transforms for generating precursors are discussed. Emphasis is also given to the application of strategies in retrosynthetic analysis. The availability of large reaction databases has enabled a new generation of retrosynthesis design programs to be developed that use automatically generated transforms assembled from published reactions. A brief description of the transform generation method employed by ARChem is given. Chapter 2 describes the algorithms devised by the author for handling the computer recognition and representation of the stereochemical features found in molecule and reaction scheme diagrams. The approach is generalised and uses flexible recognition patterns to transform information found in chemical diagrams into concise stereo descriptors for computer processing. An algorithm for efficiently comparing and classifying pairs of stereo descriptors is described. This algorithm is central for solving the stereochemical constraints in a variety of substructure matching problems addressed in chapter 3. The concise representation of reactions and transform rules as hyperstructure graphs is described. Chapter 3 is concerned with the efficient and reliable detection of stereochemical symmetry in molecules, reactions and rules. A novel symmetry perception algorithm, based on a constraint satisfaction problem (CSP) solver, is described. The use of a CSP solver to implement an isomorph-free matching algorithm for stereochemical substructure matching is detailed. The prime function of this algorithm is to seek out unique retron locations in target molecules and then to generate precursor molecules without duplications due to symmetry. Novel algorithms for classifying asymmetric, pseudo-asymmetric and symmetric stereocentres; meso, centro, and C2 symmetric molecules; and the stereotopicity of trigonal (sp2) centres are described. Chapter 4 introduces and formalises the annotated structural language used to create both retrosynthetic rules and the patterns used for functional group recognition. A novel functional group recognition package is described along with its use to detect important electronic features such as electron-withdrawing or donating groups and leaving groups. The functional groups and electronic features are used as constraints in retron rules to improve transform relevance. Chapter 5 details the approach taken to design detailed stereoselective and substrate controlled transforms from organised hierarchies of rules. The rules employ a rich set of constraint annotations that concisely describe the keying retrons. The application of the transforms for collating evidence-based scoring parameters from published reaction examples is described. A survey of available reaction databases is presented, and techniques for mining stereoselective reactions are demonstrated. A data mining tool was developed for finding the most reputable stereoselective reaction types for coding as transforms. For various reasons it was not possible during the research period to fully integrate this work with the ARChem program. Instead, Chapter 6 introduces a novel one-step retrosynthesis module to test the developed transforms. The retrosynthesis algorithms use the organisation of the transform rule hierarchy to efficiently locate the best retron matches using all applicable stereoselective transforms. This module was tested using a small set of selected target molecules and the generated routes were ranked using a series of measured parameters including: stereocentre clearance and bond cleavage; example reputation; estimated stereoselectivity with reliability; and evidence of tolerated functional groups. In addition, a method for detecting regioselectivity issues is presented. This work presents a number of algorithms using common set and graph theory operations and notations. Appendix A lists the set theory symbols and meanings. Appendix B summarises and defines the common graph theory terminology used throughout this thesis.

    A biophysical basis for the emergence of the genetic code in protocells

    The origin of the genetic code is an abiding mystery in biology. Hints of a 'code within the codons' suggest biophysical interactions, but these patterns have resisted interpretation. Here, we present a new framework, grounded in the autotrophic growth of protocells from CO2 and H2. Recent work suggests that the universal core of metabolism recapitulates a thermodynamically favoured protometabolism right up to nucleotide synthesis. Considering the genetic code in relation to an extended protometabolism allows us to predict most codon assignments. We show that the first letter of the codon corresponds to the distance from CO2 fixation, with amino acids encoded by the purines (G followed by A) being closest to CO2 fixation. These associations suggest a purine-rich early metabolism with a restricted pool of amino acids. The second position of the anticodon corresponds to the hydrophobicity of the amino acid encoded. We combine multiple measures of hydrophobicity to show that this correlation holds strongly for early amino acids but is weaker for later species. Finally, we demonstrate that redundancy at the third position is not randomly distributed around the code: non-redundant amino acids can be assigned based on size, specifically length. We attribute this to additional stereochemical interactions at the anticodon. These rules imply an iterative expansion of the genetic code over time, with codon assignments depending on both distance from CO2 and biophysical interactions between nucleotide sequences and amino acids. In this way the earliest RNA polymers could produce non-random peptide sequences with selectable functions in autotrophic protocells.

    Modular Chemical Descriptor Language (MCDL): Stereochemical modules

    Background: In our previous papers we introduced the Modular Chemical Descriptor Language (MCDL) for providing a linear representation of chemical information. A subsequent development was the MCDL Java Chemical Structure Editor which is capable of drawing chemical structures from linear representations and generating MCDL descriptors from structures. Results: In this paper we present MCDL modules and accompanying software that incorporate unique representation of molecular stereochemistry based on Cahn-Ingold-Prelog and Fischer ideas in constructing stereoisomer descriptors. The paper also contains additional discussions regarding canonical representation of stereochemical isomers, and brief algorithm descriptions of the open source LINDES, Java applet, and Open Babel MCDL processing module software packages. Conclusions: Testing of the upgraded MCDL Java Chemical Structure Editor on compounds taken from several large and diverse chemical databases demonstrated satisfactory performance for storage and processing of stereochemical information in MCDL format.