Thermodynamically Optimized Machine-learned Reaction Coordinates for Hydrophobic Ligand Dissociation
Ligand unbinding is mediated by the free energy change, which has intertwined
contributions from both energy and entropy. It is important but not easy to
quantify their individual contributions. We model hydrophobic ligand unbinding
for two systems, a methane particle and a C60 fullerene, both unbinding from
hydrophobic pockets in all-atom water. By using a modified deep learning
framework, we learn a thermodynamically optimized reaction coordinate to
describe hydrophobic ligand dissociation for both systems. Interpretation of
these reaction coordinates reveals the roles of entropic and enthalpic forces
as ligand and pocket sizes change. Irrespective of the contrasting roles of
energy and entropy, we also find that for both the systems the transition from
the bound to unbound states is driven primarily by solvation of the pocket and
ligand, independent of ligand size. Our framework thus gives useful
thermodynamic insight into hydrophobic ligand dissociation problems that are
otherwise difficult to glean.
Comment: 27 pages; 5 figures
Recent advances in describing and driving crystal nucleation using machine learning and artificial intelligence
With the advent of faster computer processors and especially graphics
processing units (GPUs) over the last few decades, the use of data-intensive
machine learning (ML) and artificial intelligence (AI) has increased greatly,
and the study of crystal nucleation has been one of the beneficiaries. In this
review, we outline how ML and AI have been applied to address four outstanding
difficulties of crystal nucleation: how to discover better reaction coordinates
(RCs) for describing accurately non-classical nucleation situations; the
development of more accurate force fields for describing the nucleation of
multiple polymorphs or phases for a single system; more robust identification
methods for determining crystal phases and structures; and as a method to yield
improved coarse-grained models for studying nucleation.
Comment: 15 pages; 1 figure
Driving and characterizing nucleation of urea and glycine polymorphs in water
Crystal nucleation is relevant across the domains of fundamental and applied
sciences. However, in many cases its mechanism remains unclear due to a lack of
temporal or spatial resolution. To gain insights into the molecular details of
nucleation, some form of molecular dynamics simulation is typically
performed, which are, in turn, limited by their ability to run long enough to
sample the nucleation event thoroughly. To overcome the timescale limits in
typical molecular dynamics simulations in a manner free of prior human bias,
here we employ the machine learning augmented molecular dynamics framework
"Reweighted Autoencoded Variational Bayes for enhanced sampling (RAVE)". We
study the two molecular systems, urea and glycine in explicit all-atom water,
due to their enrichment in polymorphic structures and common utility in
commercial applications. From our simulations, we observe back-and-forth
liquid-solid transitions of different polymorphs, correctly ranking polymorph
stabilities as A- I- B-urea and - -
-glycine. Finally, the machine learning based reaction coordinates
allow for an in-depth analysis of the nucleation mechanism for both molecules,
providing clear evidence of nonclassical two-step nucleation for both urea and
glycine nucleation in water.
Comment: 11 pages, 7 figures
Dinucleotides as simple models of the base stacking-unstacking component of DNA 'breathing' mechanisms
14 pages
Regulatory protein access to the DNA duplex 'interior' depends on local DNA 'breathing' fluctuations, and the most fundamental of these are thermally-driven base stacking-unstacking interactions. The smallest DNA unit that can undergo such transitions is the dinucleotide, whose structural and dynamic properties are dominated by stacking, while the ion condensation, cooperative stacking and inter-base hydrogen-bonding present in duplex DNA are not involved. We use dApdA to study stacking-unstacking at the dinucleotide level because the fluctuations observed are likely to resemble those of larger DNA molecules, but in the absence of constraints introduced by cooperativity are likely to be more pronounced, and thus more accessible to measurement. We study these fluctuations with a combination of Molecular Dynamics simulations on the microsecond timescale and Markov State Model analyses, and validate our results by calculations of circular dichroism (CD) spectra, with results that agree well with the experimental spectra. Our analyses show that the CD spectrum of dApdA is defined by two distinct chiral conformations that correspond, respectively, to a Watson-Crick form and a hybrid form with one base in a Hoogsteen configuration. We find also that ionic structure and water orientation around dApdA play important roles in controlling its breathing fluctuations.
This research was supported by a grant from the National Institute of Child Health and Human Development (5R01HD081362-05) awarded to L.S. and N.B.A. The funding sources had no role in the study design, data collection and analysis, or submission process.
Large Scale Benchmark of Materials Design Methods
Lack of rigorous reproducibility and validation is a major hurdle for
scientific development across many fields. Materials science in particular
encompasses a variety of experimental and theoretical approaches that require
careful benchmarking. Leaderboard efforts have been developed previously to
mitigate these issues. However, a comprehensive comparison and benchmarking on
an integrated platform with multiple data modalities with both perfect and
defect materials data is still lacking. This work introduces
JARVIS-Leaderboard, an open-source and community-driven platform that
facilitates benchmarking and enhances reproducibility. The platform allows
users to set up benchmarks with custom tasks and enables contributions in the
form of dataset, code, and meta-data submissions. We cover the following
materials design categories: Artificial Intelligence (AI), Electronic Structure
(ES), Force-fields (FF), Quantum Computation (QC) and Experiments (EXP). For
AI, we cover several types of input data, including atomic structures,
atomistic images, spectra, and text. For ES, we consider multiple ES
approaches, software packages, pseudopotentials, materials, and properties,
comparing results to experiment. For FF, we compare multiple approaches for
material property predictions. For QC, we benchmark Hamiltonian simulations
using various quantum algorithms and circuits. Finally, for experiments, we use
the inter-laboratory approach to establish benchmarks. There are 1281
contributions to 274 benchmarks using 152 methods with more than 8 million
data-points, and the leaderboard is continuously expanding. The
JARVIS-Leaderboard is available at the website:
https://pages.nist.gov/jarvis_leaderboar
Extensions of the Langevin Equation for Protein Dynamics for Modelling Equilibrium Fluctuations of Proteins
Proteins are not static structures; they must undergo conformational fluctuations about their folded state to function. Typically, the slow, near-equilibrium
conformational dynamics of proteins encode the functional motions; an accurate
description of these dynamics is useful for elucidating the functional motions of
proteins. Use of molecular dynamics (MD) simulations gives a physical model
of proteins' motions, but the dynamics are too high dimensional and coupled to
determine the functional motions purely from observation of the MD trajectory;
thus, methods to efficiently extract the slow conformational dynamics of proteins
from atomistic models are valuable.
This dissertation advances the Langevin equation for protein dynamics (LE4PD),
a diffusive, coarse-grained equation of motion for modeling protein dynamics adapted
from the field of polymer physics. The LE4PD is solved by an eigenvalue
decomposition into a set of normal mode coordinates, each of which encodes dynamics
on a specific time- and lengthscale. A discrete-state master equation approach,
Markov state modeling, is used to precisely determine the dynamics and kinetics
of conformational dynamics described by the slow LE4PD modes by analyzing a 1-
microsecond, folded simulation of the protein ubiquitin. The approach is able to
extract slow dynamics in important binding regions of ubiquitin. In chapter III,
Markov state models are used to determine the contributions of metastable states to
the circular dichroism spectrum of a dinucleotide system.
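
As a rough illustration of the Markov state modeling step mentioned above, the sketch below counts transitions in a discretized trajectory at a fixed lag, row-normalizes them into a transition matrix, and converts the second eigenvalue into an implied timescale. It is a minimal sketch and not the dissertation's workflow: the function names, the two-state toy trajectory, and the plain (non-reversible) count estimator are all assumptions made for illustration.

```python
import math


def transition_matrix(dtraj, n_states, lag):
    """Row-normalized transition counts between discrete states at a given lag (in frames)."""
    if lag < 1:
        raise ValueError("lag must be at least 1 frame")
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i][j] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def two_state_timescale(T, lag):
    """Implied timescale -lag / ln(lambda_2) for a 2x2 row-stochastic matrix."""
    # A 2x2 row-stochastic matrix has eigenvalues 1 and (trace - 1).
    lambda_2 = T[0][0] + T[1][1] - 1.0
    if lambda_2 <= 0.0:
        raise ValueError("second eigenvalue must be positive to define a timescale")
    return -lag / math.log(lambda_2)


# Hypothetical toy data: frames assigned to metastable state 0 or state 1.
dtraj = [0] * 8 + [1] * 4 + [0] * 8 + [1] * 4
T = transition_matrix(dtraj, n_states=2, lag=1)
timescale = two_state_timescale(T, lag=1)  # in units of frames
print(T, timescale)
```

In practice the states would come from clustering the slow mode coordinates, and the lag would be chosen where the implied timescales stop changing with lag; neither step is shown here.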
Because protein dynamics is inherently anisotropic, we develop an anisotropic
version of the LE4PD. When both hydrodynamic effects and free-energy barriers are
neglected, the model reduces to a principal component analysis of the alpha-carbon
coordinates; including both of these effects is important for quantitatively modelling
the decay of simulated autocorrelation functions.
Finally, we compare the LE4PD predictions from the ubiquitin simulation to
the slow modes extracted by a time-lagged independent component analysis of the
trajectory. We find both methods are able to extract the slow dynamics of the protein,
but the tICA compresses the information into a smaller number of modes; however,
for ubiquitin, the tICA modes cannot model the simulated autocorrelation functions
as effectively as the anisotropic LE4PD model.
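
The comparison above rests on how well each set of modes reproduces autocorrelation functions computed from the simulation. The sketch below shows one common way to estimate a normalized time autocorrelation function for a single mode coordinate. It is an illustrative assumption, not the estimator used in the dissertation: the function name and the square-wave test signals are hypothetical.

```python
def autocorrelation(x, max_lag):
    """Normalized autocorrelation C(lag) = <dx(t) dx(t + lag)> / <dx^2> for lag = 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dx = [v - mean for v in x]              # fluctuations about the mean
    var = sum(d * d for d in dx) / n
    acf = []
    for lag in range(max_lag + 1):
        pairs = n - lag
        c = sum(dx[i] * dx[i + lag] for i in range(pairs)) / pairs
        acf.append(c / var)
    return acf


# Hypothetical signals: a slowly switching coordinate and a rapidly switching one.
slow = ([1.0] * 25 + [-1.0] * 25) * 4
fast = [1.0, -1.0] * 100
acf_slow = autocorrelation(slow, max_lag=5)
acf_fast = autocorrelation(fast, max_lag=5)
print(acf_slow[1], acf_fast[1])
```

A slowly decaying autocorrelation function marks a slow coordinate; a time-lagged independent component analysis looks for linear combinations of the input coordinates whose autocorrelation at a chosen lag is as large as possible.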
This dissertation includes previously published and unpublished co-authored
material.
Thermodynamically Optimized Machine-Learned Reaction Coordinates for Hydrophobic Ligand Dissociation
Ligand unbinding is mediated by its free energy change, which has
intertwined contributions from both energy and entropy. It is important,
but not easy, to quantify their individual contributions to the free
energy profile. We model hydrophobic ligand unbinding for two systems,
a methane particle and a C60 fullerene, both unbinding from hydrophobic
pockets in all-atom water. Using a modified deep
learning framework, we learn a thermodynamically optimized reaction
coordinate to describe the hydrophobic ligand dissociation for both
systems. Interpretation of these reaction coordinates reveals the
roles of entropic and enthalpic forces as the ligand and pocket sizes
change. In both cases, we observe that the free-energy barrier to
unbinding is dominated by entropy considerations. Furthermore, the
process of methane unbinding is driven by methane solvation, while
fullerene unbinding is driven first by pocket wetting and then fullerene
wetting. For both solutes, the direct importance of the distance from
the binding pocket to the learned reaction coordinate is present,
but low. Our framework and subsequent feature importance analysis thus
give useful thermodynamic insight into hydrophobic ligand dissociation
problems that are otherwise difficult to glean.
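
To make the energy/entropy split concrete, the sketch below applies the standard relations dG = dH - T dS and dS = -d(dG)/dT to free energy differences at three nearby temperatures, using a central finite difference. This is a textbook decomposition given only for orientation and is not the machine-learned reaction coordinate approach of the abstract: the function name and all numerical values are hypothetical, and the estimate assumes dH and dS are roughly constant over the temperature window.

```python
def decompose_free_energy(dG_low, dG_mid, dG_high, T, dT):
    """Split dG(T) into enthalpic (dH) and entropic (-T*dS) parts.

    dG_low, dG_mid, dG_high are free energy differences at T - dT, T, and T + dT.
    """
    dS = -(dG_high - dG_low) / (2.0 * dT)   # dS = -d(dG)/dT, central difference
    minus_T_dS = -T * dS
    dH = dG_mid - minus_T_dS                # from dG = dH - T*dS
    return dH, minus_T_dS


# Hypothetical barrier heights in kJ/mol at 290 K, 300 K, and 310 K.
dH, minus_T_dS = decompose_free_energy(21.0, 20.0, 19.0, T=300.0, dT=10.0)
print(dH, minus_T_dS)
```

With these made-up inputs the two terms have opposite signs and partly cancel, which is one reason the individual contributions are harder to pin down than the free energy itself.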