
    Kinetic model construction using chemoinformatics

    Kinetic models of chemical processes not only provide an alternative to costly experiments; they also have the potential to accelerate the pace of innovation in developing new chemical processes or in improving existing ones. Kinetic models are most powerful when they reflect the underlying chemistry by incorporating elementary pathways between individual molecules. The downside of this high level of detail is that the complexity and size of the models steadily increase, such that the models eventually become too difficult to construct manually. Instead, computers are programmed to automate the construction of these models, making use of graph theory to translate chemical entities such as molecules and reactions into computer-understandable representations. This work studies the use of automated methods to construct kinetic models. More specifically, the need to account for the three-dimensional arrangement of atoms in the molecules and reactions of kinetic models is investigated and illustrated by two case studies. First, the thermal rearrangement of two monoterpenoids, cis- and trans-2-pinanol, is studied. A kinetic model that accounts for the differences in reactivity and selectivity of both pinanol diastereomers is proposed. Second, a kinetic model for the pyrolysis of the fuel "JP-10" is constructed and highlights the use of state-of-the-art techniques for the automated estimation of the thermochemistry of polycyclic molecules. A new code is developed for the automated construction of kinetic models and takes advantage of advances made in the field of chemoinformatics to tackle fundamental issues of previous approaches. Novel algorithms are developed for three important aspects of the automated construction of kinetic models: the estimation of the symmetry of molecules and reactions, the incorporation of stereochemistry in kinetic models, and the estimation of thermochemical and kinetic data using scalable structure-property methods. Finally, the application of the code is illustrated by the automated construction of a kinetic model for alkylsulfide pyrolysis.
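The abstract's two graph-theoretic themes, representing a molecule as a computer-understandable graph and estimating molecular symmetry, can be illustrated with a minimal sketch. This is hypothetical code, not the thesis software: a molecule is stored as an element-labeled graph, and its graph symmetry number is estimated by brute-force counting of label- and bond-preserving automorphisms.

```python
from itertools import permutations

def automorphism_count(elements, bonds):
    """Count graph automorphisms that preserve element labels and bonds.
    Brute force over all atom permutations; fine for tiny molecules."""
    n = len(elements)
    adj = {i: set() for i in range(n)}
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)
    count = 0
    for perm in permutations(range(n)):
        # the mapping must send each atom to an atom of the same element
        if any(elements[i] != elements[perm[i]] for i in range(n)):
            continue
        # and must map neighbors onto neighbors
        if all({perm[j] for j in adj[i]} == adj[perm[i]] for i in range(n)):
            count += 1
    return count

# Methane: one C bonded to four equivalent H atoms -> 4! = 24 automorphisms
methane = (["C", "H", "H", "H", "H"], [(0, 1), (0, 2), (0, 3), (0, 4)])
print(automorphism_count(*methane))  # 24
```

Real kinetic-model generators use far more efficient canonical-labeling algorithms, but the counting problem they solve is the one shown here.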

    Preferential attachment during the evolution of a potential energy landscape

    It has previously been shown that the network of connected minima on a potential energy landscape is scale-free, and that this reflects a power-law distribution for the areas of the basins of attraction surrounding the minima. Here, we set out to understand more about the physical origins of these puzzling properties by examining how the potential energy landscape of a 13-atom cluster evolves with the range of the potential. In particular, on decreasing the range of the potential the number of stationary points increases; thus the landscape becomes rougher and the network grows larger. We are therefore able to follow the evolution of the potential energy landscape from one with just a single minimum to a complex landscape with many minima and a scale-free pattern of connections. We find that during this growth process, new edges in the network of connected minima preferentially attach to more highly connected minima, leading to the scale-free character. Furthermore, minima that appear when the range of the potential is shorter and the network is larger have smaller basins of attraction. As there are many of these smaller basins because the network grows exponentially, the observed growth process thus also gives rise to a power-law distribution for the hyperareas of the basins.
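The preferential-attachment growth described above can be mimicked with a minimal simulation. This is an illustrative sketch, not the paper's landscape analysis: new nodes attach their edges to existing nodes with probability proportional to current degree, which is the mechanism that produces hub-dominated, scale-free degree distributions.

```python
import random

def preferential_attachment(n_nodes, m=2, seed=0):
    """Grow a network where each new node attaches m edges to existing
    nodes with probability proportional to their current degree."""
    rng = random.Random(seed)
    # start from a small fully connected seed network
    edges = [(0, 1), (0, 2), (1, 2)]
    # repeated-node list: each node appears once per edge endpoint,
    # so a uniform pick from it is a degree-weighted pick
    targets = [0, 1, 0, 2, 1, 2]
    degree = {0: 2, 1: 2, 2: 2}
    for new in range(3, n_nodes):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(targets))
        for old in chosen:
            edges.append((new, old))
            degree[old] = degree.get(old, 0) + 1
            degree[new] = degree.get(new, 0) + 1
            targets.extend([new, old])
    return degree

deg = preferential_attachment(2000)
avg = sum(deg.values()) / len(deg)
print(max(deg.values()), round(avg, 1))  # hubs far exceed the mean degree
```

In the paper the "new nodes" are minima appearing as the potential range shrinks; the simulation only shows why degree-biased attachment yields a scale-free network.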

    Two essays in computational optimization: computing the Clar number in fullerene graphs and distributing the errors in iterative interior point methods

    Fullerenes are cage-like hollow carbon molecules of pseudospherical symmetry whose graphs consist only of pentagonal and hexagonal faces. They have been objects of interest for chemists and mathematicians due to their widespread applications in various fields, including electronic and optical engineering, medical science and biotechnology. A fullerene molecule Γ_n of n atoms has a multiplicity of isomers that increases as N_iso ∼ O(n^9). For instance, Γ_180 has 79,538,751 isomers. The Fries and Clar numbers are stability predictors of a fullerene molecule. These numbers can be computed by solving a (possibly NP-hard) combinatorial optimization problem. We propose several ILP formulations of such a problem, each yielding a solution algorithm that provides the exact value of the Fries and Clar numbers. We compare the performance of the algorithms derived from the proposed ILP formulations. One of these algorithms is used to find the Clar isomers, i.e., those for which the Clar number is maximum among all isomers of a given size. We repeated this computational experiment for all sizes up to 204 atoms. In the course of the study, a total of 2,649,413,774 isomers were analyzed. The second essay concerns developing an iterative primal-dual infeasible path-following (PDIPF) interior point (IP) algorithm for the separable convex quadratic minimum cost flow network problem. In each iteration of the PDIPF algorithm, the main computational effort is solving the underlying Newton search direction system. We concentrated on finding the solution of the corresponding linear system iteratively and inexactly. We assumed that all the involved inequalities can be solved inexactly, and to this purpose we focused on different approaches for distributing the error generated by iterative linear solvers such that the convergence of the PDIPF algorithm is guaranteed. As a result, we established theoretical bases that open the path to further interesting practical investigation.
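The Clar number admits a compact toy implementation, shown here on naphthalene rather than a fullerene. This is a brute-force sketch, not the ILP algorithms of the essay: find the largest set of vertex-disjoint hexagonal faces (aromatic sextets) whose removal leaves a subgraph with a perfect matching.

```python
from itertools import combinations

def has_perfect_matching(vertices, edges):
    """Brute-force perfect matching check (fine for tiny graphs)."""
    vertices = sorted(vertices)
    if not vertices:
        return True
    if len(vertices) % 2:
        return False
    v = vertices[0]
    for u in vertices[1:]:
        if (v, u) in edges or (u, v) in edges:
            rest = [w for w in vertices if w not in (v, u)]
            if has_perfect_matching(rest, edges):
                return True
    return False

def clar_number(n_vertices, edge_list, hexagons):
    """Largest set of vertex-disjoint hexagons whose removal leaves
    a graph with a perfect matching on the remaining vertices."""
    edge_set = set(edge_list)
    for k in range(len(hexagons), 0, -1):  # try big sextet sets first
        for combo in combinations(hexagons, k):
            verts = [v for h in combo for v in h]
            if len(set(verts)) != 6 * k:
                continue  # hexagons share atoms: not vertex-disjoint
            remaining = [v for v in range(n_vertices) if v not in verts]
            sub = {(a, b) for a, b in edge_set
                   if a in remaining and b in remaining}
            if has_perfect_matching(remaining, sub):
                return k
    return 0

# Naphthalene: two fused hexagons sharing the 4-9 bond; Clar number is 1
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 9), (9, 0),
         (4, 5), (5, 6), (6, 7), (7, 8), (8, 9)]
hexes = [(0, 1, 2, 3, 4, 9), (4, 5, 6, 7, 8, 9)]
print(clar_number(10, edges, hexes))  # 1
```

The ILP formulations in the essay solve exactly this optimization, but scale to fullerene isomers where subset enumeration is hopeless.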

    Impact of noise on inverse design: The case of NMR spectra matching

    Despite its fundamental importance and widespread use for assessing reaction success in organic chemistry, deducing chemical structures from nuclear magnetic resonance (NMR) measurements has remained largely manual and time consuming. To keep up with the accelerated pace of automated synthesis in self-driving laboratory settings, robust computational algorithms are needed to rapidly perform structure elucidation. We analyse the effectiveness of solving the NMR spectra matching task encountered in this inverse structure elucidation problem by systematically constraining the chemical search space, and correspondingly reducing the ambiguity of the matching task. Numerical evidence collected for the twenty most common stoichiometries in the QM9-NMR database indicates systematic trends of more permissible machine learning prediction errors in constrained search spaces. Results suggest that compounds with multiple heteroatoms are harder to characterize than others. Extending QM9 by ~10 times more constitutional isomers, with 3D structures generated by Surge, ETKDG and CREST, we used ML models of chemical shifts trained on the QM9-NMR data to test the spectra matching algorithms. Combining both ¹³C and ¹H shifts in the matching process permits machine learning prediction errors twice as large as for matching based on ¹³C shifts alone. Performance curves demonstrate that reducing ambiguity and search space can decrease machine learning training data needs by orders of magnitude.
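The spectra matching task at the core of this abstract can be sketched in a few lines. All shift values below are hypothetical, for illustration only: candidate structures are ranked by the deviation between their predicted shifts and the measured spectrum, so the permissible ML prediction error is set by how closely the candidates' spectra are spaced.

```python
def match_spectrum(query_shifts, candidates):
    """Return the candidate whose (sorted) predicted shifts have the
    smallest mean-squared deviation from the query spectrum."""
    def mse(a, b):
        a, b = sorted(a), sorted(b)
        return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return min(candidates, key=lambda name: mse(candidates[name], query_shifts))

# Hypothetical 13C shift lists (ppm) for three candidate isomers
candidates = {
    "isomer_A": [14.1, 22.7, 31.9, 68.2],
    "isomer_B": [18.9, 25.3, 67.0, 70.4],
    "isomer_C": [12.0, 20.1, 29.5, 62.3],
}
query = [14.3, 22.5, 32.1, 68.0]  # noisy "measured" spectrum
print(match_spectrum(query, candidates))  # isomer_A
```

Constraining the search space, as the paper does by fixing the stoichiometry, shrinks the candidate dictionary and thereby tolerates larger prediction noise before the ranking flips.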

    Software platform virtualization in chemistry research and university teaching

    Background: Modern chemistry laboratories operate with a wide range of software applications under different operating systems, such as Windows, Linux or Mac OS X. Instead of installing software on different computers, it is possible to install those applications on a single computer using virtual machine software. Software platform virtualization allows a single host operating system to execute multiple guest operating systems on the same computer. We apply and discuss the use of virtual machines in chemistry research and teaching laboratories.

    Results: Virtual machines are commonly used for cheminformatics software development and testing. Benchmarking multiple chemistry software packages, we have confirmed that the computational speed penalty for using virtual machines is low, around 5% to 10%. Software virtualization in a teaching environment allows faster deployment and easy use of commercial and open source software in hands-on computer teaching labs.

    Conclusion: Software virtualization in chemistry, mass spectrometry and cheminformatics is needed for software testing and for developing software for different operating systems. To obtain maximum performance, the virtualization software should be multi-core enabled and allow the use of multiprocessor configurations in the virtual machine environment. Server consolidation, by running multiple tasks and operating systems on a single physical machine, can lead to lower maintenance and hardware costs, especially in small research labs. Virtual machines can prevent software virus infections and security breaches when used as a sandbox system for internet access and software testing. Complex software setups can be created with virtual machines and later easily deployed to multiple computers for hands-on teaching classes. We also discuss the popularity of bioinformatics compared to cheminformatics, as well as the lack of cheminformatics education at universities worldwide.