
    Unbiased simulation of structural transitions in calmodulin

    We introduce an approach for performing "very long" computer simulations of the dynamics of simplified, folded proteins. Using an alpha-carbon protein model and a fine grid to mimic continuum computations at increased speed, we perform unbiased simulations that exhibit many large-scale conformational transitions at low cost. In the case of the 72-residue N-terminal domain of calmodulin, the approach yields structural transitions between the calcium-free and calcium-bound structures at a rate of roughly one per day on a single Intel processor. Stable intermediates can be clearly characterized. The model employs Go-like interactions to stabilize two (or more) experimentally determined structures. The approach is trivially parallelizable and readily generalizes to more complex potentials at minimal cost.
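    The abstract does not give the functional form of the model, but a minimal Python sketch of a dual-basin Go-like potential conveys the idea: native contacts of each experimentally determined reference structure are rewarded, and the two single-reference energies are blended with a smooth-minimum rule. The cutoff, mixing rule, and function names here are illustrative assumptions, not the authors' actual potential.

```python
import numpy as np

def go_contact_energy(coords, ref_coords, cutoff=8.0, eps=1.0):
    """Go-like energy: pairs that are native contacts (within cutoff) in the
    reference structure are given a well with its minimum at the native distance."""
    n = len(ref_coords)
    energy = 0.0
    for i in range(n):
        for j in range(i + 3, n):          # skip near-neighbors along the chain
            r_ref = np.linalg.norm(ref_coords[i] - ref_coords[j])
            if r_ref < cutoff:              # native contact in this reference
                r = np.linalg.norm(coords[i] - coords[j])
                energy += eps * ((r_ref / r) ** 12 - 2.0 * (r_ref / r) ** 6)
    return energy

def dual_basin_energy(coords, ref_a, ref_b, beta_mix=1.0):
    """Stabilize two experimental structures by combining the two
    single-reference Go energies with a smooth-minimum (log-sum-exp) rule."""
    e_a = go_contact_energy(coords, ref_a)
    e_b = go_contact_energy(coords, ref_b)
    return -np.log(np.exp(-beta_mix * e_a) + np.exp(-beta_mix * e_b)) / beta_mix
```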

    Equilibrium Sampling in Biomolecular Simulation

    Equilibrium sampling of biomolecules remains an unmet challenge after more than 30 years of atomistic simulation. Efforts to enhance sampling capability, which are reviewed here, range from the development of new algorithms to parallelization to novel uses of hardware. Special focus is placed on classifying algorithms -- most of which are underpinned by a few key ideas -- in order to understand their fundamental strengths and limitations. Although algorithms have proliferated, progress resulting from novel hardware use appears to be more clear-cut than from algorithms alone, partly due to the lack of widely used sampling measures. Comment: submitted to Annual Review of Biophysics.

    Heterogeneous path ensembles for conformational transitions in semi-atomistic models of adenylate kinase

    We performed "weighted ensemble" path-sampling simulations of adenylate kinase, using several semi-atomistic protein models. Our study investigated both the biophysics of conformational transitions as well as the possibility of increasing model accuracy without sacrificing good sampling. Biophysically, the path ensembles show significant heterogeneity and the explicit possibility of two principle pathways in the Open-Closed transition. We recently showed, under certain conditions, a "symmetry of hetereogeneity" is expected between the forward and the reverse transitions: the fraction of transitions taking a specific pathway/channel will be the same in both the directions. Our path ensembles are analyzed in the light of the symmetry relation and its conditions. In the realm of modeling, we employed an all-atom backbone with various levels of residue interactions. Because reasonable path sampling required only a few weeks of single-processor computing time with these models, the addition of further chemical detail should be feasible

    Pathway Histogram Analysis of Trajectories: A general strategy for quantification of molecular mechanisms

    A key overall goal of biomolecular simulations is the characterization of "mechanism" -- the pathways through configuration space of processes such as conformational transitions and binding. Some amount of heterogeneity is intrinsic to the ensemble of pathways, in direct analogy to thermal configurational ensembles. Quantification of that heterogeneity is essential to a complete understanding of mechanism. We propose a general approach for characterizing path ensembles based on mapping individual trajectories into pathway classes whose populations and uncertainties can be analyzed as an ordinary histogram, providing a quantitative "fingerprint" of mechanism. In contrast to prior flux-based analyses used for discrete-state models, stochastic deviations from average behavior are explicitly included via direct classification of trajectories. The histogram approach, furthermore, is applicable to analysis of continuous trajectories. It enables straightforward comparison between ensembles produced by different methods or under different conditions. To implement the formulation, we develop approaches for classifying trajectories, including a clustering-based approach suitable for both continuous-space (e.g., molecular dynamics) and discrete-state (e.g., Markov state model) trajectories, as well as a "fundamental sequence" approach tailored for discrete-state trajectories but also applicable to continuous trajectories through a mapping process. We apply the pathway histogram analysis to a toy model and an extremely long atomistic molecular dynamics trajectory of protein folding.
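    One simple way to classify discrete-state trajectories into pathway classes is loop erasure: each revisit to a state deletes the intervening excursion, leaving a non-repeating sequence that can then be histogrammed. The sketch below is one plausible reading of the "fundamental sequence" idea, not necessarily the paper's exact definition or its clustering-based classifier, and the example trajectories are made up.

```python
from collections import Counter

def fundamental_sequence(traj):
    """Loop-erased sequence of a discrete-state trajectory: whenever a state
    is revisited, the intervening loop is removed."""
    seq = []
    for state in traj:
        if state in seq:
            seq = seq[: seq.index(state) + 1]   # erase the loop
        else:
            seq.append(state)
    return tuple(seq)

def pathway_histogram(trajectories):
    """Histogram of pathway classes, one class per fundamental sequence."""
    counts = Counter(fundamental_sequence(t) for t in trajectories)
    total = sum(counts.values())
    return {path: n / total for path, n in counts.items()}

trajs = [
    [0, 1, 2, 1, 2, 3],        # reduces to (0, 1, 2, 3)
    [0, 2, 3],                 # reduces to (0, 2, 3)
    [0, 1, 0, 1, 2, 3],        # reduces to (0, 1, 2, 3)
]
print(pathway_histogram(trajs))
```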

    Resolution exchange simulation with incremental coarsening

    We previously developed an algorithm, called resolution exchange, which improves canonical sampling of atomic resolution models by swapping conformations between high- and low-resolution simulations [1]. Here, we demonstrate a generally applicable incremental coarsening procedure and apply the algorithm to a larger peptide, met-enkephalin. In addition, we demonstrate a combination of resolution and temperature exchange, in which the coarser simulations are also at elevated temperatures. Both simulations are implemented in a "top-down" mode, to allow efficient allocation of CPU time among the different replicas.
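    For orientation, the swap step can be written as a Metropolis-style acceptance test between two replicas running under different potentials and temperatures. The sketch below assumes full configurations can be evaluated under both potentials; in the actual method only the shared coarse degrees of freedom are exchanged, and the incremental-coarsening and top-down scheduling details are not shown.

```python
import math
import random

def swap_accepted(u_fine, u_coarse, x_fine, x_coarse, beta_fine, beta_coarse):
    """Metropolis acceptance test for exchanging configurations between a
    fine-resolution replica (potential u_fine, inverse temperature beta_fine)
    and a coarse-resolution replica (u_coarse, beta_coarse)."""
    delta = (beta_fine * (u_fine(x_coarse) - u_fine(x_fine))
             + beta_coarse * (u_coarse(x_fine) - u_coarse(x_coarse)))
    return delta <= 0 or random.random() < math.exp(-delta)
```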

    Statistical uncertainty analysis for small-sample, high log-variance data: Cautions for bootstrapping and Bayesian bootstrapping

    Recent advances in molecular simulations allow the evaluation of previously unattainable observables, such as rate constants for protein folding. However, these calculations are usually computationally expensive and even significant computing resources may result in a small number of independent estimates spread over many orders of magnitude. Such small-sample, high "log-variance" data are not readily amenable to analysis using the standard uncertainty (i.e., "standard error of the mean") because unphysical negative limits of confidence intervals result. Bootstrapping, a natural alternative guaranteed to yield a confidence interval within the minimum and maximum values, also exhibits a striking systematic bias of the lower confidence limit in log space. As we show, bootstrapping artifactually assigns high probability to improbably low mean values. A second alternative, the Bayesian bootstrap strategy, does not suffer from the same deficit and is more logically consistent with the type of confidence interval desired. The Bayesian bootstrap provides uncertainty intervals that are more reliable than those from the standard bootstrap method, but must be used with caution nevertheless. Neither standard nor Bayesian bootstrapping can overcome the intrinsic challenge of underestimating the mean from small-sample, high log-variance data. Our conclusions are based on extensive analysis of model distributions and re-analysis of multiple independent atomistic simulations. Although we only analyze rate constants, similar considerations will apply to related calculations, potentially including highly non-linear averages like the Jarzynski relation. Comment: Added a whole new section on the analysis of continuous distributions.
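    The two estimators being compared differ only in how they reweight the observed values: the standard bootstrap resamples with replacement, while the Bayesian bootstrap draws Dirichlet(1, ..., 1) weights over the fixed sample. A minimal sketch of both confidence-interval calculations follows; the rate-constant values are hypothetical and the percentile interval is only one of several possible interval constructions.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_ci(samples, n_boot=10000, level=0.95):
    """Standard bootstrap: resample the data with replacement, take the mean."""
    means = [rng.choice(samples, size=len(samples), replace=True).mean()
             for _ in range(n_boot)]
    lo, hi = np.percentile(means, [(1 - level) / 2 * 100, (1 + level) / 2 * 100])
    return lo, hi

def bayesian_bootstrap_ci(samples, n_boot=10000, level=0.95):
    """Bayesian bootstrap: keep the observed values fixed and draw
    Dirichlet(1, ..., 1) weights for the mean instead of resampling."""
    samples = np.asarray(samples, dtype=float)
    weights = rng.dirichlet(np.ones(len(samples)), size=n_boot)
    means = weights @ samples
    lo, hi = np.percentile(means, [(1 - level) / 2 * 100, (1 + level) / 2 * 100])
    return lo, hi

# Hypothetical small sample of rate-constant estimates spanning orders of magnitude.
rates = [1e-6, 3e-5, 2e-4, 8e-4, 5e-3]
print("standard bootstrap: ", bootstrap_ci(rates))
print("Bayesian bootstrap: ", bayesian_bootstrap_ci(rates))
```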

    The structural de-correlation time: A robust statistical measure of convergence of biomolecular simulations

    Although atomistic simulations of proteins and other biological systems are approaching microsecond timescales, the quality of trajectories has remained difficult to assess. Such assessment is critical not only for establishing the relevance of any individual simulation but also in the extremely active field of developing computational methods. Here we map the trajectory assessment problem onto a simple statistical calculation of the "effective sample size" - i.e., the number of statistically independent configurations. The mapping is achieved by asking the question, "How much time must elapse between snapshots included in a sample for that sample to exhibit the statistical properties expected for independent and identically distributed configurations?" The resulting "structural de-correlation time" is robustly calculated using exact properties deduced from our previously developed "structural histograms," without any fitting parameters. We show the method is equally and directly applicable to toy models, peptides, and a 72-residue protein model. Variants of our approach can readily be applied to a wide range of physical and chemical systems.
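    To make the "effective sample size" notion concrete, the sketch below estimates a decorrelation time for a scalar time series and divides the trajectory length by it. This is only a stand-in: the paper's method operates on configurations via structural histograms with no hand-picked observable and no fitted time constant, whereas the autocorrelation threshold used here is an assumption for illustration.

```python
import numpy as np

def decorrelation_time(series, max_lag=None):
    """Crude decorrelation time of a scalar time series: the smallest lag at
    which the normalized autocorrelation drops below 1/e."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    var = x.var()
    n = len(x)
    max_lag = max_lag or n // 2
    for lag in range(1, max_lag):
        c = np.mean(x[:-lag] * x[lag:]) / var
        if c < np.exp(-1):
            return lag
    return max_lag

def effective_sample_size(series):
    """Number of (approximately) independent snapshots in the series."""
    return len(series) // decorrelation_time(series)
```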

    Optimizing weighted ensemble sampling of steady states

    We propose parameter optimization techniques for weighted ensemble sampling of Markov chains in the steady-state regime. Weighted ensemble consists of replicas of a Markov chain, each carrying a weight, that are periodically resampled according to their weights within each of a number of bins that partition state space. We derive, from first principles, strategies for optimizing the choices of weighted ensemble parameters, in particular the choice of bins and the number of replicas to maintain in each bin. In a simple numerical example, we compare our new strategies with more traditional ones and with direct Monte Carlo. Comment: 28 pages, 5 figures.
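    The resampling step that the bin and replica-count choices control can be sketched as follows: within one bin, walkers are split or merged so that a target number remains while the total weight in the bin is conserved. This is a minimal, weight-proportional resampling rule for illustration, not the optimized strategies derived in the paper.

```python
import numpy as np

def resample_bin(walkers, target_count, rng):
    """Resample the (position, weight) walkers in one bin so that exactly
    `target_count` walkers remain and the bin's total weight is conserved."""
    positions = [w[0] for w in walkers]
    weights = np.array([w[1] for w in walkers], dtype=float)
    total = weights.sum()
    # Draw new walkers with probability proportional to weight; each new
    # walker carries an equal share of the bin's total weight.
    idx = rng.choice(len(walkers), size=target_count, p=weights / total)
    return [(positions[i], total / target_count) for i in idx]

rng = np.random.default_rng(1)
bin_walkers = [("x1", 0.5), ("x2", 0.3), ("x3", 0.2)]
print(resample_bin(bin_walkers, target_count=4, rng=rng))
```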

    A Second Look at Canonical Sampling of Biomolecules using Replica Exchange Simulation

    Because of growing interest in temperature-based sampling methods like replica exchange, this note aims to make some observations and raise some potentially important questions which we have not seen addressed sufficiently in the literature. Mainly, we wish to call attention to limits on the maximum speed-up to be expected from temperature-based methods, and also note the need for careful quantification of sampling efficiency. Because potentially lengthy studies may be necessary to address these issues, we felt it would be useful to bring them to the attention of the broader community. Here we are strictly concerned with canonical sampling at a fixed temperature, and not with conformational search.

    Efficient use of non-equilibrium measurement to estimate free energy differences for molecular systems

    A promising method for calculating free energy differences ΔF is to generate non-equilibrium data via "fast-growth" simulations or experiments -- and then use Jarzynski's equality. However, a difficulty with using Jarzynski's equality is that ΔF estimates converge very slowly and unreliably due to the nonlinear nature of the calculation -- thus requiring large, costly data sets. Here, we present new analyses of non-equilibrium data from various simulated molecular systems exploiting statistical properties of Jarzynski's equality. Using a fully automated procedure, with no user-input parameters, our results suggest that good estimates of ΔF can be obtained using 6-15 fold less data than was previously possible. Systematizing and extending previous work [1], the new results exploit the systematic behavior of bias due to finite sample size. A key innovation is better use of the more statistically reliable information available from the raw data. Comment: 11 pages, 5 figures.
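    The underlying estimator is Jarzynski's exponential work average, ΔF = -kT ln ⟨exp(-W/kT)⟩, whose direct evaluation is dominated by rare low-work values and is therefore biased for small samples. A minimal, numerically stable implementation of the direct estimate is sketched below; it does not reproduce the automated bias-correction procedure of the paper, and the work values are hypothetical.

```python
import numpy as np

def jarzynski_delta_f(work_values, kT=1.0):
    """Direct Jarzynski estimate: Delta F = -kT * ln< exp(-W/kT) >.
    Uses log-sum-exp for numerical stability; the result is systematically
    biased high for small sample sizes."""
    w = np.asarray(work_values, dtype=float) / kT
    n = len(w)
    return -kT * (np.logaddexp.reduce(-w) - np.log(n))

# Hypothetical nonequilibrium work values, in units of kT.
work = np.array([2.1, 3.4, 1.8, 5.0, 2.7])
print(jarzynski_delta_f(work))
```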