    Interplay between Secondary and Tertiary Structure Formation in Protein Folding Cooperativity

    Protein folding cooperativity is defined by the nature of the finite-size thermodynamic transition exhibited upon folding: two-state transitions show a free energy barrier between the folded and unfolded ensembles, while downhill folding is barrierless. A microcanonical analysis, where the energy is the natural variable, has been shown to be better suited to unambiguously characterizing the nature of the transition than its canonical counterpart. Replica exchange molecular dynamics simulations of a high-resolution coarse-grained model allow for the accurate evaluation of the density of states, in order to extract precise thermodynamic information and measure its impact on structural features. The method is applied to three helical peptides: a short helix shows sharp features of a two-state folder, while a longer helix and a three-helix bundle exhibit downhill and two-state transitions, respectively. Extending the results of lattice simulations and theoretical models, we find that it is the interplay between secondary structure and the loss of non-native tertiary contacts which determines the nature of the transition. Comment: 3 pages, 3 figures
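The microcanonical criterion mentioned in this abstract can be illustrated with a minimal sketch. It assumes the entropy is S(E) = ln g(E) with k_B = 1, estimates the inverse temperature β(E) = dS/dE by central differences, and flags a two-state-like transition when β(E) is non-monotonic (a convex region in S(E)). The function names and both toy densities of states are hypothetical and are not taken from the paper.

```python
import math

def inverse_temperature(energies, log_g):
    """Central-difference estimate of beta(E) = dS/dE, with S = ln g and k_B = 1."""
    return [
        (log_g[i + 1] - log_g[i - 1]) / (energies[i + 1] - energies[i - 1])
        for i in range(1, len(energies) - 1)
    ]

def has_backbending(beta, tol=1e-9):
    """True if beta(E) increases anywhere, i.e. S(E) has a convex region."""
    return any(b_next - b_prev > tol for b_prev, b_next in zip(beta, beta[1:]))

# Toy data (assumed, for illustration only): energies on a uniform grid from 1 to 10.
energies = [1.0 + 0.1 * i for i in range(91)]

# Strictly concave entropy: beta(E) falls monotonically (downhill-like behaviour).
log_g_downhill = [20.0 * math.log(e) for e in energies]

# Same entropy plus a Gaussian bump, which creates a convex region in S(E)
# (two-state-like behaviour).
log_g_two_state = [
    20.0 * math.log(e) + 3.0 * math.exp(-0.5 * (e - 5.0) ** 2) for e in energies
]

beta_downhill = inverse_temperature(energies, log_g_downhill)
beta_two_state = inverse_temperature(energies, log_g_two_state)
```

Calling `has_backbending` on each curve separates the two toy cases. With real simulation data, the density of states would carry statistical noise, so the tolerance and some smoothing would need care.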

    Efficient potential of mean force calculation from multiscale simulations: solute insertion in a lipid membrane

    The determination of potentials of mean force for solute insertion in a membrane by means of all-atom molecular dynamics simulations is often hampered by sampling issues. A multiscale approach to conformational sampling was recently proposed by Bereau and Kremer (2016). It aims at accelerating the sampling of the atomistic conformational space by means of a systematic backmapping of coarse-grained snapshots. In this work, we first analyze the efficiency of this method by comparing its predictions for propanol insertion into a 1,2-Dimyristoyl-sn-glycero-3-phosphocholine membrane (DMPC) against reference atomistic simulations. The method is found to provide accurate results with a gain of one order of magnitude in computational time. We then investigate the role of the coarse-grained representation in affecting the reliability of the method in the case of a 1,2-Dioleoyl-sn-glycero-3-phosphocholine membrane (DOPC). We find that the accuracy of the results is tightly connected to the presence of a good configurational overlap between the coarse-grained and atomistic models---a general requirement when developing multiscale simulation methods. Comment: 6 pages, 5 figures

    Transferable atomic multipole machine learning models for small organic molecules

    Accurate representation of the molecular electrostatic potential, which is often expanded in distributed multipole moments, is crucial for an efficient evaluation of intermolecular interactions. Here we introduce a machine learning model for multipole coefficients of atom types H, C, O, N, S, F, and Cl in any molecular conformation. The model is trained on quantum chemical results for atoms in varying chemical environments drawn from thousands of organic molecules. Multipoles in systems with neutral, cationic, and anionic molecular charge states are treated with individual models. The models' predictive accuracy and applicability are illustrated by evaluating intermolecular interaction energies of nearly 1,000 dimers and the cohesive energy of the benzene crystal. Comment: 11 pages, 6 figures

    Controlled exploration of chemical space by machine learning of coarse-grained representations

    The size of chemical compound space is too large to be probed exhaustively. This leads high-throughput protocols to drastically subsample and results in sparse and non-uniform datasets. Rather than arbitrarily selecting compounds, we systematically explore chemical space according to the target property of interest. We first perform importance sampling by introducing a Markov chain Monte Carlo scheme across compounds. We then train an ML model on the sampled data to expand the region of chemical space probed. Our boosting procedure enhances the number of compounds by a factor of 2 to 10, enabled by the ML model's coarse-grained representation, which both simplifies the structure-property relationship and reduces the size of chemical space. The ML model correctly recovers linear relationships between transfer free energies. These linear relationships correspond to features that are global to the dataset, marking the region of chemical space up to which predictions are reliable---a more robust alternative to the predictive variance. Bridging coarse-grained simulations with ML gives rise to an unprecedented database of drug-membrane insertion free energies for 1.3 million compounds. Comment: 9 pages, 5 figures
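The idea of importance sampling with a Markov chain across compounds can be sketched with a generic Metropolis walk. This is only an illustration of the general technique, not the scheme used in the paper: the bead types, the per-bead property contributions, the mutation move, and all function names are hypothetical. A compound is represented as a tuple of coarse-grained bead types, and the chain favours compounds with a low value of a toy target property.

```python
import math
import random

# Hypothetical coarse-grained bead types and per-bead property contributions
# (arbitrary units); neither is taken from the paper.
BEAD_TYPES = ["A", "B", "C", "D"]
TOY_CONTRIBUTION = {"A": -1.0, "B": -0.5, "C": 0.5, "D": 1.0}

def toy_property(compound):
    """Toy target property: sum of per-bead contributions."""
    return sum(TOY_CONTRIBUTION[bead] for bead in compound)

def propose(compound, rng):
    """Symmetric move: redraw the bead type at one randomly chosen position."""
    i = rng.randrange(len(compound))
    mutated = list(compound)
    mutated[i] = rng.choice(BEAD_TYPES)
    return tuple(mutated)

def metropolis_accept(delta, beta, rng):
    """Metropolis criterion for a target distribution proportional to exp(-beta * property)."""
    return delta <= 0 or rng.random() < math.exp(-beta * delta)

def sample_compounds(start, n_steps, beta=1.0, seed=0):
    """Run a Metropolis chain over compounds; returns every visited state."""
    rng = random.Random(seed)
    current = tuple(start)
    chain = [current]
    for _ in range(n_steps):
        candidate = propose(current, rng)
        delta = toy_property(candidate) - toy_property(current)
        if metropolis_accept(delta, beta, rng):
            current = candidate
        chain.append(current)
    return chain
```

For example, `sample_compounds(("C", "C", "C"), 200)` returns a chain of 201 three-bead compounds. In a real application, the toy property would be replaced by a simulated or predicted quantity such as a transfer free energy, and the sampled compounds would then serve as training data for an ML model.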

    In silico screening of drug-membrane thermodynamics reveals linear relations between bulk partitioning and the potential of mean force

    The partitioning of small molecules in cell membranes---a key parameter for pharmaceutical applications---typically relies on experimentally available bulk partitioning coefficients. Computer simulations provide a structural resolution of the insertion thermodynamics via the potential of mean force, but require significant sampling at the atomistic level. Here, we introduce high-throughput coarse-grained molecular dynamics simulations to screen thermodynamic properties. This application of physics-based models in a large-scale study of small molecules establishes linear relationships between partitioning coefficients and key features of the potential of mean force. This allows us to predict the structure of the insertion from bulk experimental measurements for more than 400,000 compounds. The potential of mean force hereby becomes an easily accessible quantity---already recognized for its high predictability of certain properties, e.g., passive permeation. Further, we demonstrate how coarse graining helps reduce the size of chemical space, enabling a hierarchical approach to screening small molecules. Comment: 8 pages, 6 figures. Typos fixed, minor correction

    Hydration free energies from kernel-based machine learning: Compound-database bias

    We consider the prediction of a basic thermodynamic property---hydration free energies---across a large subset of the chemical space of small organic molecules. Our in silico study is based on computer simulations at the atomistic level with implicit solvent. We report on a kernel-based machine learning approach that is inspired by recent work in learning electronic properties, but differs in key aspects: The representation is averaged over several conformers to account for the statistical ensemble. We also include an atomic-decomposition ansatz, which we show offers significant added transferability compared to molecular learning. Finally, we explore the existence of severe biases from databases of experimental compounds. By performing a combination of dimensionality reduction and cross-learning models, we show that the rate of learning depends significantly on the breadth and variety of the training dataset. Our study highlights the dangers of fitting machine-learning models to databases of narrow chemical range. Comment: 10 pages, 7 figures

    Reweighting non-equilibrium steady-state dynamics along collective variables

    Computer simulations generate microscopic trajectories of complex systems at a single thermodynamic state point. We recently introduced a Maximum Caliber (MaxCal) approach for dynamical reweighting. Our approach mapped these trajectories to a Markovian description on the configurational coordinates, and reweighted path probabilities as a function of external forces. Trajectory probabilities can be dynamically reweighted both from and to equilibrium or non-equilibrium steady states. As the system's dimensionality increases, an exhaustive description of the microtrajectories becomes prohibitive, even with a Markovian assumption. Instead, we reduce the dimensionality of the configurational space to collective variables (CVs). Going from configurational to CV space, we define local entropy productions derived from configurationally averaged mean forces. The entropy production is shown to be a suitable constraint on MaxCal for non-equilibrium steady states expressed as a function of CVs. We test the reweighting procedure on two systems: a particle subject to a two-dimensional potential and a coarse-grained peptide. Our CV-based MaxCal approach expands dynamical reweighting to larger systems, for both static and dynamical properties, and across a large range of driving forces. Comment: 12 pages, 7 figures