Interplay between Secondary and Tertiary Structure Formation in Protein Folding Cooperativity
Protein folding cooperativity is defined by the nature of the finite-size
thermodynamic transition exhibited upon folding: two-state transitions show a
free energy barrier between the folded and unfolded ensembles, while downhill
folding is barrierless. A microcanonical analysis, where the energy is the
natural variable, has been shown to be better suited to unambiguously characterize the
nature of the transition compared to its canonical counterpart. Replica
exchange molecular dynamics simulations of a high resolution coarse-grained
model allow for the accurate evaluation of the density of states, in order to
extract precise thermodynamic information, and measure its impact on structural
features. The method is applied to three helical peptides: a short helix shows
sharp features of a two-state folder, while a longer helix and a three-helix
bundle exhibit downhill and two-state transitions, respectively. Extending the
results of lattice simulations and theoretical models, we find that it is the
interplay between secondary structure and the loss of non-native tertiary
contacts which determines the nature of the transition.
Comment: 3 pages, 3 figures
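A minimal sketch of the microcanonical criterion mentioned in this abstract, under stated assumptions: the entropy is taken as S(E) = ln Ω(E) with k_B = 1, the inverse temperature β(E) = dS/dE is estimated by finite differences, and a region where β(E) increases with E (a convex stretch of S(E)) is read as the signature of a free energy barrier. The function names and the toy entropy curves are illustrative and are not taken from the paper.

```python
import math

def inverse_temperature(energies, log_dos):
    """Central finite-difference estimate of beta(E) = dS/dE, with S = ln(Omega), k_B = 1."""
    beta = []
    for i in range(1, len(energies) - 1):
        d_entropy = log_dos[i + 1] - log_dos[i - 1]
        d_energy = energies[i + 1] - energies[i - 1]
        beta.append(d_entropy / d_energy)
    return beta

def has_backbending(beta, tol=1e-9):
    """True if beta(E) increases anywhere, i.e. S(E) contains a convex region."""
    return any(b_next - b_prev > tol for b_prev, b_next in zip(beta, beta[1:]))

# Toy entropy curves (synthetic, for illustration only).
energies = [1.0 + 0.1 * i for i in range(60)]
concave_entropy = [10.0 * math.log(e) for e in energies]      # barrierless-like
intruder_entropy = [e + 0.3 * math.cos(e) for e in energies]  # has a convex stretch

print(has_backbending(inverse_temperature(energies, concave_entropy)))
print(has_backbending(inverse_temperature(energies, intruder_entropy)))
```

In practice the density of states would come from a reweighting of the replica exchange data, and numerical noise in ln Ω(E) would need smoothing before differentiation.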
Efficient potential of mean force calculation from multiscale simulations: solute insertion in a lipid membrane
The determination of potentials of mean force for solute insertion in a
membrane by means of all-atom molecular dynamics simulations is often hampered
by sampling issues. A multiscale approach to conformational sampling was
recently proposed by Bereau and Kremer (2016). It aims at accelerating the
sampling of the atomistic conformational space by means of a systematic
backmapping of coarse-grained snapshots. In this work, we first analyze the
efficiency of this method by comparing its predictions for propanol insertion
into a 1,2-Dimyristoyl-sn-glycero-3-phosphocholine membrane (DMPC) against
reference atomistic simulations. The method is found to provide accurate
results with a gain of one order of magnitude in computational time. We then
investigate the role of the coarse-grained representation in affecting the
reliability of the method in the case of a
1,2-Dioleoyl-sn-glycero-3-phosphocholine membrane (DOPC). We find that the
accuracy of the results is tightly connected to the presence of a good
configurational overlap between the coarse-grained and atomistic models, a
general requirement when developing multiscale simulation methods.
Comment: 6 pages, 5 figures
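For orientation, a potential of mean force along a coordinate z is related to the sampled probability by F(z) = -k_B T ln p(z). The sketch below assumes unbiased samples of z (for example, the solute depth along the membrane normal) and simple histogramming; it is not the backmapping scheme of the paper, and the function name, bin width, and units are illustrative choices.

```python
import math
from collections import Counter

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def pmf_from_samples(z_samples, bin_width, temperature):
    """Histogram unbiased samples of z and return {bin_center: F(z)} in kcal/mol.

    F(z) = -kT ln p(z), shifted so that the lowest value is zero.
    Empty bins are simply absent from the result.
    """
    counts = Counter(math.floor(z / bin_width) for z in z_samples)
    total = sum(counts.values())
    kT = KB * temperature
    free = {b: -kT * math.log(c / total) for b, c in counts.items()}
    f_min = min(free.values())
    return {(b + 0.5) * bin_width: f - f_min for b, f in sorted(free.items())}
```

Poorly sampled bins give noisy or missing values, which is the sampling problem that biased or multiscale schemes are meant to address.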
Transferable atomic multipole machine learning models for small organic molecules
Accurate representation of the molecular electrostatic potential, which is
often expanded in distributed multipole moments, is crucial for an efficient
evaluation of intermolecular interactions. Here we introduce a machine learning
model for multipole coefficients of atom types H, C, O, N, S, F, and Cl in any
molecular conformation. The model is trained on quantum chemical results for
atoms in varying chemical environments drawn from thousands of organic
molecules. Multipoles in systems with neutral, cationic, and anionic molecular
charge states are treated with individual models. The models' predictive
accuracy and applicability are illustrated by evaluating intermolecular
interaction energies of nearly 1,000 dimers and the cohesive energy of the
benzene crystal.
Comment: 11 pages, 6 figures
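To illustrate how distributed multipoles enter the electrostatic potential, the sketch below evaluates the contribution of a single site carrying a charge and a dipole, truncating the expansion at the dipole term. It assumes atomic units (so the 1/(4πε0) prefactor is 1) and Cartesian components; the function name is hypothetical, and the machine learning prediction of the coefficients themselves is not shown.

```python
import math

def site_potential(site, charge, dipole, point):
    """Electrostatic potential at `point` from one multipole site (atomic units).

    V = q / r + (mu . r_vec) / r**3, i.e. monopole plus dipole terms only.
    """
    r_vec = [p - s for p, s in zip(point, site)]
    r = math.sqrt(sum(c * c for c in r_vec))
    monopole = charge / r
    dipole_term = sum(m * c for m, c in zip(dipole, r_vec)) / r ** 3
    return monopole + dipole_term

def total_potential(sites, point):
    """Sum the contributions of several (position, charge, dipole) sites."""
    return sum(site_potential(pos, q, mu, point) for pos, q, mu in sites)
```

Higher-order terms such as quadrupoles would be added in the same way, one term per multipole rank.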
Controlled exploration of chemical space by machine learning of coarse-grained representations
The size of chemical compound space is too large to be probed exhaustively.
This leads high-throughput protocols to drastically subsample and results in
sparse and non-uniform datasets. Rather than arbitrarily selecting compounds,
we systematically explore chemical space according to the target property of
interest. We first perform importance sampling by introducing a Markov chain
Monte Carlo scheme across compounds. We then train an ML model on the sampled
data to expand the region of chemical space probed. Our boosting procedure
enhances the number of compounds by a factor of 2 to 10, enabled by the ML model's
coarse-grained representation, which both simplifies the structure-property
relationship and reduces the size of chemical space. The ML model correctly
recovers linear relationships between transfer free energies. These linear
relationships correspond to features that are global to the dataset, marking
the region of chemical space up to which predictions are reliable, a more
robust alternative to the predictive variance. Bridging coarse-grained
simulations with ML gives rise to an unprecedented database of drug-membrane
insertion free energies for 1.3 million compounds.
Comment: 9 pages, 5 figures
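The idea of a Markov chain Monte Carlo walk across compounds can be sketched as follows. This is an assumption-laden toy, not the scheme of the paper: compounds are abstract labels, `neighbors` and `score` are hypothetical user-supplied callables, and a plain Metropolis acceptance rule is used, which is only valid when proposals are symmetric (for instance, when every compound has the same number of neighbors).

```python
import math
import random

def metropolis_compound_walk(start, neighbors, score, beta, n_steps, seed=0):
    """Metropolis walk over a discrete compound space.

    neighbors(c) returns the compounds reachable from c in one move;
    score(c) is the target property to be minimized;
    beta controls how strongly low scores are favored.
    Assumes symmetric proposals (equal neighbor counts for all compounds).
    """
    rng = random.Random(seed)
    current = start
    chain = [current]
    for _ in range(n_steps):
        proposal = rng.choice(neighbors(current))
        delta = score(proposal) - score(current)
        # Always accept downhill moves; accept uphill moves with Boltzmann weight.
        if delta <= 0 or rng.random() < math.exp(-beta * delta):
            current = proposal
        chain.append(current)
    return chain

# Toy compound space: ten labels on a ring, with a synthetic score minimized at 5.
ring_neighbors = lambda i: [(i - 1) % 10, (i + 1) % 10]
ring_score = lambda i: (i - 5) ** 2
chain = metropolis_compound_walk(0, ring_neighbors, ring_score, beta=1.0, n_steps=20000)
```

The set of visited compounds would then serve as training data for a model that extends predictions to compounds the chain never reached.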
In silico screening of drug-membrane thermodynamics reveals linear relations between bulk partitioning and the potential of mean force
The partitioning of small molecules in cell membranes, a key parameter for
pharmaceutical applications, is typically estimated from experimentally available bulk
partitioning coefficients. Computer simulations provide a structural resolution
of the insertion thermodynamics via the potential of mean force, but require
significant sampling at the atomistic level. Here, we introduce high-throughput
coarse-grained molecular dynamics simulations to screen thermodynamic
properties. This application of physics-based models in a large-scale study of
small molecules establishes linear relationships between partitioning
coefficients and key features of the potential of mean force. This allows us to
predict the structure of the insertion from bulk experimental measurements for
more than 400,000 compounds. The potential of mean force hereby becomes an
easily accessible quantity, one already recognized as highly predictive of
certain properties, e.g., passive permeation. Further, we demonstrate how
coarse graining helps reduce the size of chemical space, enabling a
hierarchical approach to screening small molecules.
Comment: 8 pages, 6 figures. Typos fixed, minor correction
Hydration free energies from kernel-based machine learning: Compound-database bias
We consider the prediction of a basic thermodynamic property, hydration free
energies, across a large subset of the chemical space of small organic
molecules. Our in silico study is based on computer simulations at the
atomistic level with implicit solvent. We report on a kernel-based machine
learning approach that is inspired by recent work in learning electronic
properties, but differs in key aspects: The representation is averaged over
several conformers to account for the statistical ensemble. We also include an
atomic-decomposition ansatz, which we show offers significant added
transferability compared to molecular learning. Finally, we explore the
existence of severe biases from databases of experimental compounds. By
performing a combination of dimensionality reduction and cross-learning models,
we show that the rate of learning depends significantly on the breadth and
variety of the training dataset. Our study highlights the dangers of fitting
machine-learning models to databases of narrow chemical range.
Comment: 10 pages, 7 figures
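A minimal sketch of kernel ridge regression on conformer-averaged representations, as one possible reading of the approach described in this abstract. The Gaussian kernel, the plain feature-vector representation, the hyperparameters `sigma` and `lam`, and all function names are assumptions made for illustration; the atomic-decomposition ansatz is not shown.

```python
import math

def average_representation(conformers):
    """Average per-conformer feature vectors into one ensemble-averaged vector."""
    n = len(conformers)
    return [sum(column) / n for column in zip(*conformers)]

def gaussian_kernel(x, y, sigma):
    """Gaussian kernel between two feature vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_krr(X, y, sigma, lam):
    """Return regression weights alpha solving (K + lam * I) alpha = y."""
    K = [[gaussian_kernel(a, b, sigma) + (lam if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    return solve(K, y)

def predict_krr(X_train, alpha, x, sigma):
    """Predict the property of a new compound with representation x."""
    return sum(a * gaussian_kernel(xt, x, sigma) for a, xt in zip(alpha, X_train))
```

Each training compound would be passed through `average_representation` before fitting, so that the model sees one vector per compound rather than one per conformer.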