Search CORE

52,717 research outputs found

First-principles molecular structure search with a genetic algorithm

Author: Baldauf Carsten
Blum Volker
Supady Adriana
Publication venue: 'American Chemical Society (ACS)'
Publication date: 13/10/2015
Field of study

The identification of low-energy conformers for a given molecule is a fundamental problem in computational chemistry and cheminformatics. We assess here a conformer search that employs a genetic algorithm for sampling the low-energy segment of the conformation space of molecules. The algorithm is designed to work with first-principles methods, facilitated by the incorporation of local optimization and blacklisting conformers to prevent repeated evaluations of very similar solutions. The aim of the search is not only to find the global minimum, but to predict all conformers within an energy window above the global minimum. The performance of the search strategy is: (i) evaluated for a reference data set extracted from a database with amino acid dipeptide conformers obtained by an extensive combined force field and first-principles search and (ii) compared to the performance of a systematic search and a random conformer generator for the example of a drug-like ligand with 43 atoms, 8 rotatable bonds and 1 cis/trans bond

arXiv.org e-Print Archive

MPG.PuRe

FigShare

Adaptive Genetic Algorithm for Crystal Structure Prediction

Author: Ho K. M.
Ji M.
Nguyen M. C.
Umemoto K.
Wang C. Z.
Wentzcovitch R. M.
Wu S. Q.
Zhao X.
吴顺情
Publication venue: 'IOP Publishing'
Publication date: 01/11/2013
Field of study

We present a genetic algorithm (GA) for structural search that combines the speed of structure exploration by classical potentials with the accuracy of density functional theory (DFT) calculations in an adaptive and iterative way. This strategy increases the efficiency of the DFT-based GA by several orders of magnitude. This gain allows considerable increase in size and complexity of systems that can be studied by first principles. The method's performance is illustrated by successful structure identifications of complex binary and ternary inter-metallic compounds with 36 and 54 atoms per cell, respectively. The discovery of a multi-TPa Mg-silicate phase with unit cell containing up to 56 atoms is also reported. Such phase is likely to be an essential component of terrestrial exoplanetary mantles.Comment: 14 pages, 4 figure

arXiv.org e-Print Archive

Xiamen University Institutional Repository

Fast, accurate, and transferable many-body interatomic potentials by symbolic regression

Author: Balasubramanian Adarsh
Hernandez Alberto
Mason Simon
Mueller Tim
Yuan Fenglin
Publication venue
Publication date: 17/08/2019
Field of study

The length and time scales of atomistic simulations are limited by the computational cost of the methods used to predict material properties. In recent years there has been great progress in the use of machine learning algorithms to develop fast and accurate interatomic potential models, but it remains a challenge to develop models that generalize well and are fast enough to be used at extreme time and length scales. To address this challenge, we have developed a machine learning algorithm based on symbolic regression in the form of genetic programming that is capable of discovering accurate, computationally efficient manybody potential models. The key to our approach is to explore a hypothesis space of models based on fundamental physical principles and select models within this hypothesis space based on their accuracy, speed, and simplicity. The focus on simplicity reduces the risk of overfitting the training data and increases the chances of discovering a model that generalizes well. Our algorithm was validated by rediscovering an exact Lennard-Jones potential and a Sutton Chen embedded atom method potential from training data generated using these models. By using training data generated from density functional theory calculations, we found potential models for elemental copper that are simple, as fast as embedded atom models, and capable of accurately predicting properties outside of their training set. Our approach requires relatively small sets of training data, making it possible to generate training data using highly accurate methods at a reasonable computational cost. We present our approach, the forms of the discovered models, and assessments of their transferability, accuracy and speed

arXiv.org e-Print Archive

Combining Bayesian Approaches and Evolutionary Techniques for the Inference of Breast Cancer Networks

Author: Beretta Stefano
Castelli Mauro
Goncalves Ivo
Merelli Ivan
Ramazzotti Daniele
Publication venue: 'Scitepress'
Publication date: 01/01/2016
Field of study

Gene and protein networks are very important to model complex large-scale systems in molecular biology. Inferring or reverseengineering such networks can be defined as the process of identifying gene/protein interactions from experimental data through computational analysis. However, this task is typically complicated by the enormously large scale of the unknowns in a rather small sample size. Furthermore, when the goal is to study causal relationships within the network, tools capable of overcoming the limitations of correlation networks are required. In this work, we make use of Bayesian Graphical Models to attach this problem and, specifically, we perform a comparative study of different state-of-the-art heuristics, analyzing their performance in inferring the structure of the Bayesian Network from breast cancer data

arXiv.org e-Print Archive

Crossref

Repositório da Universidade Nova de Lisboa

Chemoinformatics Research at the University of Sheffield: A History and Citation Analysis

Author: Bishop N.
Gillet V.J.
Holliday J.D.
Willett P.
Publication venue: 'SAGE Publications'
Publication date: 01/07/2003
Field of study

This paper reviews the work of the Chemoinformatics Research Group in the Department of Information Studies at the University of Sheffield, focusing particularly on the work carried out in the period 1985-2002. Four major research areas are discussed, these involving the development of methods for: substructure searching in databases of three-dimensional structures, including both rigid and flexible molecules; the representation and searching of the Markush structures that occur in chemical patents; similarity searching in databases of both two-dimensional and three-dimensional structures; and compound selection and the design of combinatorial libraries. An analysis of citations to 321 publications from the Group shows that it attracted a total of 3725 residual citations during the period 1980-2002. These citations appeared in 411 different journals, and involved 910 different citing organizations from 54 different countries, thus demonstrating the widespread impact of the Group's work

Crossref

White Rose Research Online

Towards Understanding the Origin of Genetic Languages

Author: Patel Apoorva D.
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 28/10/2008
Field of study

Molecular biology is a nanotechnology that works--it has worked for billions of years and in an amazing variety of circumstances. At its core is a system for acquiring, processing and communicating information that is universal, from viruses and bacteria to human beings. Advances in genetics and experience in designing computers have taken us to a stage where we can understand the optimisation principles at the root of this system, from the availability of basic building blocks to the execution of tasks. The languages of DNA and proteins are argued to be the optimal solutions to the information processing tasks they carry out. The analysis also suggests simpler predecessors to these languages, and provides fascinating clues about their origin. Obviously, a comprehensive unraveling of the puzzle of life would have a lot to say about what we may design or convert ourselves into.Comment: (v1) 33 pages, contributed chapter to "Quantum Aspects of Life", edited by D. Abbott, P. Davies and A. Pati, (v2) published version with some editin

arXiv.org e-Print Archive

Crossref