Complex macrocycle exploration: parallel, heuristic, and constraint-based conformer generation using ForceGen.
ForceGen is a template-free, non-stochastic approach for 2D to 3D structure generation and conformational elaboration for small molecules, including both non-macrocycles and macrocycles. For conformational search of non-macrocycles, ForceGen is both faster and more accurate than the best of all tested methods on a very large, independently curated benchmark of 2859 PDB ligands. In this study, the primary results are on macrocycles, including results for 431 unique examples from four separate benchmarks. These include complex peptide and peptide-like cases that can form networks of internal hydrogen bonds. By making use of new physical movements ("flips" of near-linear sub-cycles and explicit formation of hydrogen bonds), ForceGen exhibited statistically significantly better performance for overall RMS deviation from experimental coordinates than all other approaches. The algorithmic approach offers natural parallelization across multiple computing-cores. On a modest multi-core workstation, for all but the most complex macrocycles, median wall-clock times were generally under a minute in fast search mode and under 2 min using thorough search. On the most complex cases (roughly cyclic decapeptides and larger) explicit exploration of likely hydrogen bonding networks yielded marked improvements, but with calculation times increasing to several minutes and in some cases to roughly an hour for fast search. In complex cases, utilization of NMR data to constrain conformational search produces accurate conformational ensembles representative of solution state macrocycle behavior. On macrocycles of typical complexity (up to 21 rotatable macrocyclic and exocyclic bonds), design-focused macrocycle optimization can be practically supported by computational chemistry at interactive time-scales, with conformational ensemble accuracy equaling what is seen with non-macrocyclic ligands. 
For more complex macrocycles, inclusion of sparse biophysical data is a helpful adjunct to computation.
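The headline metric above, RMS deviation from experimental coordinates, presupposes an optimal rigid-body superposition of each generated conformer onto the experimental pose. A minimal sketch of that superposition step (the standard Kabsch algorithm; the coordinates below are made-up illustrative points, not ForceGen output):

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal
    rigid superposition (Kabsch algorithm)."""
    P = P - P.mean(axis=0)                   # remove translation
    Q = Q - Q.mean(axis=0)
    H = P.T @ Q                              # 3x3 covariance matrix
    U, S, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T                       # optimal rotation
    P_rot = P @ R.T
    return float(np.sqrt(((P_rot - Q) ** 2).sum() / len(P)))

# Rotating a structure by 90 degrees about z should give RMSD ~ 0.
coords = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0],
                   [1.5, 1.5, 0.0], [0.0, 1.5, 1.0]])
Rz = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
print(round(kabsch_rmsd(coords @ Rz.T, coords), 6))  # → 0.0
```

Ensemble accuracy numbers like those quoted above are typically reported as the minimum of such an RMSD over all conformers in the generated ensemble.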
Kinetic model construction using chemoinformatics
Kinetic models of chemical processes not only provide an alternative to costly experiments; they also have the potential to accelerate the pace of innovation in developing new chemical processes or in improving existing ones. Kinetic models are most powerful when they reflect the underlying chemistry by incorporating elementary pathways between individual molecules. The downside of this high level of detail is that the complexity and size of the models also steadily increase, such that the models eventually become too difficult to be manually constructed. Instead, computers are programmed to automate the construction of these models, and make use of graph theory to translate chemical entities such as molecules and reactions into computer-understandable representations.
This work studies the use of automated methods to construct kinetic models. More particularly, the need to account for the three-dimensional arrangement of atoms in molecules and reactions of kinetic models is investigated and illustrated by two case studies. First of all, the thermal rearrangement of two monoterpenoids, cis- and trans-2-pinanol, is studied. A kinetic model that accounts for the differences in reactivity and selectivity of both pinanol diastereomers is proposed. Secondly, a kinetic model for the pyrolysis of the fuel “JP-10” is constructed and highlights the use of state-of-the-art techniques for the automated estimation of thermochemistry of polycyclic molecules.
A new code is developed for the automated construction of kinetic models and takes advantage of the advances made in the field of chemoinformatics to tackle fundamental issues of previous approaches. Novel algorithms are developed for three important aspects of automated construction of kinetic models: the estimation of the symmetry of molecules and reactions, the incorporation of stereochemistry in kinetic models, and the estimation of thermochemical and kinetic data using scalable structure-property methods. Finally, the application of the code is illustrated by the automated construction of a kinetic model for alkylsulfide pyrolysis.
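To illustrate the graph-theoretical representation referred to above: a molecule can be stored as an element-labeled graph, and its symmetry number relates to the count of graph automorphisms. A deliberately brute-force toy (O(n!), usable only for very small molecules; production codes use canonical-labeling algorithms rather than this enumeration):

```python
from itertools import permutations

def count_automorphisms(labels, edges):
    """Count label- and adjacency-preserving permutations of a
    small molecular graph (brute force over all n! mappings)."""
    n = len(labels)
    edge_set = {frozenset(e) for e in edges}
    count = 0
    for perm in permutations(range(n)):
        # An automorphism must map atoms to atoms of the same element...
        if any(labels[i] != labels[perm[i]] for i in range(n)):
            continue
        # ...and must map the bond set onto itself.
        mapped = {frozenset((perm[a], perm[b])) for a, b in edges}
        if mapped == edge_set:
            count += 1
    return count

# Methane: carbon 0 bonded to hydrogens 1-4; the four hydrogens are
# interchangeable, giving 4! = 24 graph automorphisms.
methane_labels = ["C", "H", "H", "H", "H"]
methane_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
print(count_automorphisms(methane_labels, methane_edges))  # → 24
```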
Preferential attachment during the evolution of a potential energy landscape
It has previously been shown that the network of connected minima on a potential energy landscape is scale-free, and that this reflects a power-law distribution for the areas of the basins of attraction surrounding the minima. Here, we set out to understand more about the physical origins of these puzzling properties by examining how the potential energy landscape of a 13-atom cluster evolves with the range of the potential. In particular, on decreasing the range of the potential the number of stationary points increases, and thus the landscape becomes rougher and the network gets larger. Thus, we are able to follow the evolution of the potential energy landscape from one with just a single minimum to a complex landscape with many minima and a scale-free pattern of connections. We find that during this growth process, new edges in the network of connected minima preferentially attach to more highly-connected minima, thus leading to the scale-free character. Furthermore, minima that appear when the range of the potential is shorter and the network is larger have smaller basins of attraction. As there are many of these smaller basins because the network grows exponentially, the observed growth process thus also gives rise to a power-law distribution for the hyperareas of the basins.
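The growth rule described, new edges attaching preferentially to highly-connected nodes, is the classic preferential-attachment mechanism known to produce scale-free networks. A toy sketch of such growth (a Barabási–Albert-style model with assumed parameters, not the landscape-exploration code itself):

```python
import random

def preferential_attachment_graph(n, m, seed=0):
    """Grow a graph to n nodes; each new node attaches m edges,
    choosing targets with probability proportional to degree."""
    rng = random.Random(seed)
    # Start from a small complete core of m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each node id appears in `targets` once per incident edge, so
    # uniform sampling from it is degree-proportional sampling.
    targets = [v for e in edges for v in e]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(targets))  # set() avoids multi-edges
        for t in chosen:
            edges.append((new, t))
            targets.extend((new, t))
    return edges

edges = preferential_attachment_graph(200, 2)
degree = {}
for a, b in edges:
    degree[a] = degree.get(a, 0) + 1
    degree[b] = degree.get(b, 0) + 1
# Hubs emerge: the maximum degree typically far exceeds the mean (~4).
print(len(edges), max(degree.values()))
```

With n = 200 and m = 2 the graph has 3 + 197 × 2 = 397 edges; repeating the growth shows the heavy-tailed degree distribution characteristic of preferential attachment.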
Two essays in computational optimization: computing the Clar number in fullerene graphs and distributing the errors in iterative interior point methods
Fullerenes are cage-like hollow carbon molecule graphs of pseudospherical symmetry consisting only of pentagonal and hexagonal faces. They have been objects of interest for chemists and mathematicians due to their widespread applications in various fields, including electronic and optical engineering, medical science and biotechnology. A fullerene molecule Γ_n of n atoms has a multiplicity of isomers which increases as N_iso ∼ O(n^9); for instance, Γ_180 has 79,538,751 isomers. The Fries and Clar numbers are stability predictors of a fullerene molecule. These numbers can be computed by solving a (possibly NP-hard) combinatorial optimization problem. We propose several ILP formulations of this problem, each yielding a solution algorithm that provides the exact value of the Fries and Clar numbers, and we compare the performance of the algorithms derived from the proposed ILP formulations. One of these algorithms is used to find the Clar isomers, i.e., those for which the Clar number is maximum among all isomers of a given size. We repeated this computational experiment for all sizes up to 204 atoms; in the course of the study, a total of 2,649,413,774 isomers were analyzed.
The second essay concerns the development of an iterative primal-dual infeasible path-following (PDIPF) interior point (IP) algorithm for the separable convex quadratic minimum cost network flow problem. In each iteration of the PDIPF algorithm, the main computational effort is solving the underlying Newton search direction system. We concentrated on solving the corresponding linear system iteratively and inexactly. We assumed that all the involved inequalities can be satisfied inexactly, and to this purpose we focused on different approaches for distributing the error generated by iterative linear solvers such that convergence of the PDIPF algorithm is guaranteed. As a result, we established theoretical bases that open the path to further interesting practical investigation.
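For intuition about the combinatorial problem behind the first essay: the Clar number is the maximum number of vertex-disjoint hexagons (aromatic sextets) that can be selected so that the remaining carbon atoms still admit a perfect matching. A tiny brute-force illustration on a benzenoid rather than a fullerene (not the proposed ILP algorithms, which scale far better):

```python
from itertools import combinations

def has_perfect_matching(nodes, edge_set):
    """Backtracking perfect-matching test for small graphs."""
    nodes = sorted(nodes)
    if not nodes:
        return True
    if len(nodes) % 2:
        return False
    v, rest = nodes[0], nodes[1:]
    for u in rest:
        if frozenset((v, u)) in edge_set:
            if has_perfect_matching([w for w in rest if w != u], edge_set):
                return True
    return False

def clar_number(n_atoms, edges, hexagons):
    """Largest k disjoint sextets leaving a perfectly matchable rest."""
    edge_set = {frozenset(e) for e in edges}
    for k in range(len(hexagons), 0, -1):
        for combo in combinations(hexagons, k):
            atoms = [a for hx in combo for a in hx]
            if len(set(atoms)) < len(atoms):
                continue  # sextets must not share atoms
            rest = [a for a in range(n_atoms) if a not in set(atoms)]
            if has_perfect_matching(rest, edge_set):
                return k
    return 0

# Naphthalene: two fused hexagons sharing the bond (3, 4).
naph_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0),
              (4, 6), (6, 7), (7, 8), (8, 9), (9, 3)]
naph_hexagons = [[0, 1, 2, 3, 4, 5], [3, 4, 6, 7, 8, 9]]
print(clar_number(10, naph_edges, naph_hexagons))  # → 1
```

The two hexagons of naphthalene share atoms, so at most one can be a sextet; removing it leaves the four remaining atoms perfectly matchable, giving a Clar number of 1. An ILP formulation encodes the same disjointness and matching constraints with binary variables, which is what makes exact computation feasible on fullerene-sized graphs.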
Impact of noise on inverse design: The case of NMR spectra matching
Despite its fundamental importance and widespread use for assessing reaction success in organic chemistry, deducing chemical structures from nuclear magnetic resonance (NMR) measurements has remained largely manual and time consuming. To keep up with the accelerated pace of automated synthesis in self-driving laboratory settings, robust computational algorithms are needed to rapidly perform structure elucidations. We analyse the effectiveness of solving the NMR spectra matching task encountered in this inverse structure elucidation problem by systematically constraining the chemical search space, and correspondingly reducing the ambiguity of the matching task. Numerical evidence collected for the twenty most common stoichiometries in the QM9-NMR database indicates systematic trends of more permissible machine learning prediction errors in constrained search spaces. Results suggest that compounds with multiple heteroatoms are harder to characterize than others. Extending QM9 by 10 times more constitutional isomers with 3D structures generated by Surge, ETKDG and CREST, we used ML models of chemical shifts trained on the QM9-NMR data to test the spectra matching algorithms. Combining both types of chemical shifts in the matching process suggests that machine learning prediction errors twice as large are permissible compared with matching based on one type of shift alone. Performance curves demonstrate that reducing ambiguity and search space can decrease machine learning training data needs by orders of magnitude.
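The matching task itself reduces to ranking candidate structures by the distance between their predicted shift lists and the query spectrum. A minimal sketch (sorted-shift RMSE as the assumed distance criterion, with made-up shift values and isomer names, not QM9-NMR data):

```python
def spectrum_distance(shifts_a, shifts_b):
    """RMSE between two equal-length shift lists after sorting,
    a simple assignment-free spectrum comparison."""
    a, b = sorted(shifts_a), sorted(shifts_b)
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

def best_match(query, candidates):
    """Return the candidate whose predicted spectrum is closest
    to the query spectrum."""
    return min(candidates,
               key=lambda name: spectrum_distance(query, candidates[name]))

# Hypothetical predicted chemical shift lists (ppm) for three isomers.
candidates = {
    "isomer_A": [12.1, 24.6, 128.3, 130.0],
    "isomer_B": [14.0, 22.9, 125.7, 131.2],
    "isomer_C": [18.5, 30.2, 120.1, 140.8],
}
query = [12.3, 24.4, 128.5, 129.8]  # noisy "measured" spectrum
print(best_match(query, candidates))  # → isomer_A
```

In this picture, "permissible prediction error" is the largest noise level at which the true structure still wins this ranking; shrinking the candidate space (or combining shift types) spaces the spectra further apart and tolerates larger errors.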
Software platform virtualization in chemistry research and university teaching
Background: Modern chemistry laboratories operate with a wide range of software applications under different operating systems, such as Windows, Linux or Mac OS X. Instead of installing software on different computers, it is possible to install those applications on a single computer using virtual machine software. Software platform virtualization allows a single host operating system to execute multiple guest operating systems on the same computer. We apply and discuss the use of virtual machines in chemistry research and teaching laboratories.
Results: Virtual machines are commonly used for cheminformatics software development and testing. Benchmarking multiple chemistry software packages, we have confirmed that the computational speed penalty for using virtual machines is low, around 5% to 10%. Software virtualization in a teaching environment allows faster deployment and easy use of commercial and open source software in hands-on computer teaching labs.
Conclusion: Software virtualization in chemistry, mass spectrometry and cheminformatics is needed for software testing and for development of software for different operating systems. In order to obtain maximum performance, the virtualization software should be multi-core enabled and allow the use of multiprocessor configurations in the virtual machine environment. Server consolidation, by running multiple tasks and operating systems on a single physical machine, can lead to lower maintenance and hardware costs, especially in small research labs. The use of virtual machines can prevent software virus infections and security breaches when used as a sandbox system for internet access and software testing. Complex software setups can be created with virtual machines and are easily deployed later to multiple computers for hands-on teaching classes. We discuss the popularity of bioinformatics compared to cheminformatics, as well as the missing cheminformatics education at universities worldwide.