Unbiased simulation of structural transitions in calmodulin
We introduce an approach for performing "very long" computer simulations of
the dynamics of simplified, folded proteins. Using an alpha-carbon protein
model and a fine grid to mimic continuum computations at increased speed, we
perform unbiased simulations which exhibit many large-scale conformational
transitions at low cost. In the case of the 72-residue N-terminal domain of
calmodulin, the approach yields structural transitions between the calcium-free
and calcium-bound structures at a rate of roughly one per day on a single Intel
processor. Stable intermediates can be clearly characterized. The model employs
Go-like interactions to stabilize two (or more) experimentally-determined
structures. The approach is trivially parallelizable and readily generalizes to
more complex potentials at minimal cost.
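As a rough illustration (not the authors' code), the sketch below shows one common way Go-like interactions can stabilize two experimentally determined structures in an alpha-carbon model: a contact-based well for each reference structure, combined into a smooth dual-basin energy. The function names, the Gaussian-well form, and the exponential mixing are illustrative assumptions only.

```python
import numpy as np

def go_energy(coords, ref_coords, contacts, eps=1.0, width=0.3):
    """Go-like contact energy: each native contact of one reference
    structure is rewarded when its current distance matches the
    reference distance (Gaussian well; illustrative functional form)."""
    e = 0.0
    for i, j in contacts:
        r = np.linalg.norm(coords[i] - coords[j])
        r0 = np.linalg.norm(ref_coords[i] - ref_coords[j])
        e -= eps * np.exp(-((r - r0) ** 2) / (2.0 * width ** 2))
    return e

def dual_basin_energy(coords, ref_a, contacts_a, ref_b, contacts_b, mix=1.0):
    """Smoothly combine two single-basin Go energies so that both
    experimental structures are stabilized (exponential mixing is one
    common choice; the paper's exact scheme may differ)."""
    ea = go_energy(coords, ref_a, contacts_a)
    eb = go_energy(coords, ref_b, contacts_b)
    return -mix * np.log(np.exp(-ea / mix) + np.exp(-eb / mix))
```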
Equilibrium Sampling in Biomolecular Simulation
Equilibrium sampling of biomolecules remains an unmet challenge after more
than 30 years of atomistic simulation. Efforts to enhance sampling capability,
which are reviewed here, range from the development of new algorithms to
parallelization to novel uses of hardware. Special focus is placed on
classifying algorithms -- most of which are underpinned by a few key ideas --
in order to understand their fundamental strengths and limitations. Although
algorithms have proliferated, progress resulting from novel hardware use
appears to be more clear-cut than from algorithms alone, partly due to the lack
of widely used sampling measures. Comment: submitted to Annual Review of Biophysics.
Heterogeneous path ensembles for conformational transitions in semi-atomistic models of adenylate kinase
We performed "weighted ensemble" path-sampling simulations of adenylate
kinase, using several semi-atomistic protein models. Our study investigated
both the biophysics of conformational transitions and the possibility of
increasing model accuracy without sacrificing good sampling. Biophysically, the
path ensembles show significant heterogeneity and the explicit possibility of
two principal pathways in the Open-Closed transition. We recently showed that,
under certain conditions, a "symmetry of heterogeneity" is expected between the
forward and the reverse transitions: the fraction of transitions taking a
specific pathway/channel will be the same in both directions. Our path
ensembles are analyzed in the light of the symmetry relation and its
conditions. In the realm of modeling, we employed an all-atom backbone with
various levels of residue interactions. Because reasonable path sampling
required only a few weeks of single-processor computing time with these models,
the addition of further chemical detail should be feasible.
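To make the symmetry relation concrete, the following minimal sketch computes the quantity it constrains: the fraction of transition paths assigned to each pathway/channel, evaluated separately for the forward and reverse path ensembles. The channel labels and values below are hypothetical.

```python
from collections import Counter

def channel_fractions(path_labels):
    """Fraction of transition paths assigned to each pathway/channel.
    `path_labels` holds one channel label per observed transition path."""
    counts = Counter(path_labels)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}

# Hypothetical labels for forward (Open->Closed) and reverse paths.
forward = ["A", "A", "B", "A", "B", "A"]
reverse = ["B", "A", "A", "A", "B", "A"]

print(channel_fractions(forward))  # e.g. {'A': 0.67, 'B': 0.33}
print(channel_fractions(reverse))  # symmetry relation predicts similar fractions
```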
Pathway Histogram Analysis of Trajectories: A general strategy for quantification of molecular mechanisms
A key overall goal of biomolecular simulations is the characterization of
"mechanism" -- the pathways through configuration space of processes such as
conformational transitions and binding. Some amount of heterogeneity is
intrinsic to the ensemble of pathways, in direct analogy to thermal
configurational ensembles. Quantification of that heterogeneity is essential to
a complete understanding of mechanism. We propose a general approach for
characterizing path ensembles based on mapping individual trajectories into
pathway classes whose populations and uncertainties can be analyzed as an
ordinary histogram, providing a quantitative "fingerprint" of mechanism. In
contrast to prior flux-based analyses used for discrete-state models,
stochastic deviations from average behavior are explicitly included via direct
classification of trajectories. The histogram approach, furthermore, is
applicable to analysis of continuous trajectories. It enables straightforward
comparison between ensembles produced by different methods or under different
conditions. To implement the formulation, we develop approaches for classifying
trajectories, including a clustering-based approach suitable for both
continuous-space (e.g., molecular dynamics) and discrete-state (e.g., Markov
state model) trajectories, as well as a "fundamental sequence" approach
tailored for discrete-state trajectories but also applicable to continuous
trajectories through a mapping process. We apply the pathway histogram analysis
to a toy model and an extremely long atomistic molecular dynamics trajectory of
protein folding.
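A minimal sketch of the histogram step, assuming trajectories have already been classified (e.g., by clustering or by mapping to fundamental sequences): class populations with simple bootstrap error bars. This illustrates the general idea only and is not the paper's specific implementation.

```python
import numpy as np

def pathway_histogram(class_labels, n_boot=1000, rng=None):
    """Populations of pathway classes with simple bootstrap error bars.
    `class_labels`: one class label per trajectory, obtained beforehand
    by some classification scheme (clustering, fundamental sequences, ...)."""
    rng = rng or np.random.default_rng(0)
    labels = np.asarray(class_labels)
    classes = np.unique(labels)
    pops = {c: float(np.mean(labels == c)) for c in classes}
    boots = {c: [] for c in classes}
    for _ in range(n_boot):
        resample = rng.choice(labels, size=labels.size, replace=True)
        for c in classes:
            boots[c].append(np.mean(resample == c))
    errs = {c: (np.percentile(boots[c], 2.5), np.percentile(boots[c], 97.5))
            for c in classes}
    return pops, errs
```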
Resolution exchange simulation with incremental coarsening
We previously developed an algorithm, called resolution exchange, which
improves canonical sampling of atomic resolution models by swapping
conformations between high- and low-resolution simulations [1]. Here, we
demonstrate a generally applicable incremental coarsening procedure and apply
the algorithm to a larger peptide, met-enkephalin. In addition, we demonstrate
a combination of resolution and temperature exchange, in which the coarser
simulations are also at elevated temperatures. Both simulations are implemented
in a ``top-down'' mode, to allow efficient allocation of CPU time among the
different replicas.
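For orientation, the sketch below shows the generic Metropolis criterion for accepting a swap between a high-resolution and a low-resolution replica at (possibly) different temperatures. The published resolution exchange move swaps only the coarse degrees of freedom; that coordinate bookkeeping is omitted here, so this is an assumption-laden simplification rather than the algorithm itself.

```python
import math
import random

def exchange_accept(u_hi_x, u_hi_y, beta_hi, u_lo_x, u_lo_y, beta_lo):
    """Metropolis acceptance for swapping configuration x (currently in the
    high-resolution replica) with configuration y (currently in the
    low-resolution replica).  u_hi_* are energies under the high-resolution
    potential, u_lo_* under the low-resolution one; beta = 1/kT per replica."""
    delta = beta_hi * (u_hi_y - u_hi_x) + beta_lo * (u_lo_x - u_lo_y)
    if delta <= 0.0:
        return True
    return random.random() < math.exp(-delta)
```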
Statistical uncertainty analysis for small-sample, high log-variance data: Cautions for bootstrapping and Bayesian bootstrapping
Recent advances in molecular simulations allow the evaluation of previously
unattainable observables, such as rate constants for protein folding. However,
these calculations are usually computationally expensive and even significant
computing resources may result in a small number of independent estimates
spread over many orders of magnitude. Such small-sample, high "log-variance"
data are not readily amenable to analysis using the standard uncertainty (i.e.,
"standard error of the mean") because unphysical negative limits of confidence
intervals result. Bootstrapping, a natural alternative guaranteed to yield a
confidence interval within the minimum and maximum values, also exhibits a
striking systematic bias of the lower confidence limit in log space. As we
show, bootstrapping artifactually assigns high probability to improbably low
mean values. A second alternative, the Bayesian bootstrap strategy, does not
suffer from the same deficit and is more logically consistent with the type of
confidence interval desired. The Bayesian bootstrap provides uncertainty
intervals that are more reliable than those from the standard bootstrap method,
but must be used with caution nevertheless. Neither standard nor Bayesian
bootstrapping can overcome the intrinsic challenge of under-estimating the mean
from small-size, high log-variance samples. Our conclusions are based on
extensive analysis of model distributions and re-analysis of multiple
independent atomistic simulations. Although we only analyze rate constants,
similar considerations will apply to related calculations, potentially
including highly non-linear averages like the Jarzynski relation. Comment: Added a whole new section on the analysis of continuous distributions.
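A minimal sketch contrasting the two estimators on hypothetical rate-constant data spread over orders of magnitude: the standard bootstrap resamples the observations, while the Bayesian bootstrap reweights them with Dirichlet weights. The data values are invented for illustration.

```python
import numpy as np

def standard_bootstrap_ci(samples, n_boot=5000, rng=None):
    """Standard bootstrap: resample observations with replacement,
    take the mean, and report a 95% percentile interval."""
    rng = rng or np.random.default_rng(0)
    samples = np.asarray(samples, dtype=float)
    means = [np.mean(rng.choice(samples, size=samples.size, replace=True))
             for _ in range(n_boot)]
    return np.percentile(means, [2.5, 97.5])

def bayesian_bootstrap_ci(samples, n_boot=5000, rng=None):
    """Bayesian bootstrap: keep the observed values fixed and draw
    Dirichlet(1,...,1) weights for them instead of resampling."""
    rng = rng or np.random.default_rng(0)
    samples = np.asarray(samples, dtype=float)
    means = [np.dot(rng.dirichlet(np.ones(samples.size)), samples)
             for _ in range(n_boot)]
    return np.percentile(means, [2.5, 97.5])

# Hypothetical rate-constant estimates spread over orders of magnitude.
rates = np.array([3e-6, 8e-5, 2e-4, 5e-7, 1e-4])
print(standard_bootstrap_ci(rates))
print(bayesian_bootstrap_ci(rates))
```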
The structural de-correlation time: A robust statistical measure of convergence of biomolecular simulations
Although atomistic simulations of proteins and other biological systems are
approaching microsecond timescales, the quality of trajectories has remained
difficult to assess. Such assessment is critical not only for establishing the
relevance of any individual simulation but also in the extremely active field
of developing computational methods. Here we map the trajectory assessment
problem onto a simple statistical calculation of the ``effective sample size''
-- i.e., the number of statistically independent configurations. The mapping is
achieved by asking the question, ``How much time must elapse between snapshots
included in a sample for that sample to exhibit the statistical properties
expected for independent and identically distributed configurations?'' The
resulting ``structural de-correlation time'' is robustly calculated using exact
properties deduced from our previously developed ``structural histograms,''
without any fitting parameters. We show the method is equally and directly
applicable to toy models, peptides, and a 72-residue protein model. Variants of
our approach can readily be applied to a wide range of physical and chemical
systems.
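The following is an illustrative approximation of the effective-sample-size idea, not the exact histogram-based procedure of the paper: subsample the trajectory of structural-histogram bin labels at increasing lag and compare the observed variance of bin populations across segments with the binomial variance expected for independent draws.

```python
import numpy as np

def decorrelation_time(bin_labels, lags, n_bins, n_splits=10):
    """Rough sketch: the smallest lag at which the observed variance of
    structural-bin populations (across independent trajectory segments)
    matches the IID binomial expectation is taken as the decorrelation
    time.  Illustrative approximation only."""
    labels = np.asarray(bin_labels)
    for lag in lags:
        sub = labels[::lag]
        segments = np.array_split(sub, n_splits)
        n_per = min(len(s) for s in segments)
        if n_per < 2:
            break
        ratios = []
        for b in range(n_bins):
            p_hat = np.mean(sub == b)
            if p_hat in (0.0, 1.0):
                continue
            obs_var = np.var([np.mean(s[:n_per] == b) for s in segments], ddof=1)
            iid_var = p_hat * (1.0 - p_hat) / n_per
            ratios.append(obs_var / iid_var)
        if ratios and np.mean(ratios) <= 1.0:
            return lag
    return None
```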
Optimizing weighted ensemble sampling of steady states
We propose parameter optimization techniques for weighted ensemble sampling
of Markov chains in the steady-state regime. Weighted ensemble consists of
replicas of a Markov chain, each carrying a weight, that are periodically
resampled according to their weights within each of a set of bins that
partition state space. We derive, from first principles, strategies for
optimizing the choices of weighted ensemble parameters, in particular the
choice of bins and the number of replicas to maintain in each bin. In a simple
numerical example, we compare our new strategies with more traditional ones and
with direct Monte Carlo. Comment: 28 pages, 5 figures.
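For context, here is a minimal sketch of the basic split/merge resampling step that weighted ensemble performs inside each bin, using a fixed target number of replicas per bin; the optimized bin and replica-number choices derived in the paper would replace this fixed allocation. Names and the particular split/merge rules are one standard implementation, assumed for illustration.

```python
import numpy as np

def resample_bin(walkers, target, rng=None):
    """One weighted-ensemble resampling step inside a single bin.
    `walkers` is a list of (state, weight) pairs; the routine returns a
    new list with `target` walkers whose weights sum to the same total."""
    rng = rng or np.random.default_rng(0)
    walkers = list(walkers)
    # Split: replace the heaviest walker by two copies with half the weight.
    while len(walkers) < target:
        i = max(range(len(walkers)), key=lambda k: walkers[k][1])
        state, w = walkers.pop(i)
        walkers += [(state, w / 2.0), (state, w / 2.0)]
    # Merge: combine the two lightest walkers, keeping one with the summed weight.
    while len(walkers) > target:
        walkers.sort(key=lambda sw: sw[1])
        (s1, w1), (s2, w2) = walkers[0], walkers[1]
        keep = s1 if rng.random() < w1 / (w1 + w2) else s2
        walkers = [(keep, w1 + w2)] + walkers[2:]
    return walkers
```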
A Second Look at Canonical Sampling of Biomolecules using Replica Exchange Simulation
Because of growing interest in temperature-based sampling methods like
replica exchange, this note aims to make some observations and raise some
potentially important questions which we have not seen addressed sufficiently
in the literature. Mainly, we wish to call attention to limits on the maximum
speed-up to be expected from temperature-based methods, and also note the need
for careful quantification of sampling efficiency. Because potentially lengthy
studies may be necessary to address these issues, we felt it would be useful to
bring them to the attention of the broader community. Here \emph{we are
strictly concerned with canonical sampling at a fixed temperature,} and
\emph{not} with conformational search.
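For reference only, the sketch below shows the standard textbook Metropolis criterion for accepting a temperature swap in replica exchange; it is generic background for the discussion, not a result of this note.

```python
import math
import random

def rex_accept(u_i, u_j, beta_i, beta_j):
    """Accept a swap of configurations between replicas i and j with
    probability min(1, exp[(beta_i - beta_j) * (u_i - u_j)]), where u_*
    are potential energies and beta = 1/kT."""
    delta = (beta_i - beta_j) * (u_i - u_j)
    if delta >= 0.0:
        return True
    return random.random() < math.exp(delta)
```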
Efficient use of non-equilibrium measurement to estimate free energy differences for molecular systems
A promising method for calculating free energy differences Delta F is to
generate non-equilibrium data via ``fast-growth'' simulations or experiments --
and then use Jarzynski's equality. However, a difficulty with using Jarzynski's
equality is that Delta F estimates converge very slowly and unreliably due to
the nonlinear nature of the calculation -- thus requiring large, costly data
sets. Here, we present new analyses of non-equilibrium data from various
simulated molecular systems exploiting statistical properties of Jarzynski's
equality. Using a fully automated procedure, with no user-input parameters, our
results suggest that good estimates of Delta F can be obtained using 6-15 fold
less data than was previously possible. Systematizing and extending previous
work [1], the new results exploit the systematic behavior of bias due to finite
sample size. A key innovation is better use of the more statistically reliable
information available from the raw data. Comment: 11 pages, 5 figures.
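A minimal sketch, under stated assumptions, of the direct Jarzynski estimator and of the finite-sample bias it exhibits: estimates from small subsets of the work values are systematically shifted relative to the full-data estimate, which is the kind of systematic behavior the analysis exploits. The block-resampling procedure and all names below are illustrative, not the authors' automated method.

```python
import numpy as np

def jarzynski_estimate(work, kT=1.0):
    """Direct Jarzynski estimator: Delta F = -kT * ln < exp(-W/kT) >,
    averaged over the non-equilibrium work values W."""
    work = np.asarray(work, dtype=float)
    return -kT * np.log(np.mean(np.exp(-work / kT)))

def block_estimates(work, block_sizes, kT=1.0, rng=None):
    """Illustration of finite-sample bias: estimates from small random
    subsets of the work values sit systematically above the full-data
    estimate, and their trend with subset size can be examined (sketch
    only, not the paper's extrapolation procedure)."""
    rng = rng or np.random.default_rng(0)
    work = np.asarray(work, dtype=float)
    out = {}
    for n in block_sizes:
        ests = [jarzynski_estimate(rng.choice(work, size=n, replace=False), kT)
                for _ in range(200)]
        out[n] = float(np.mean(ests))
    return out
```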
- …