18 research outputs found

    Reversible Markov chain estimation using convex-concave programming

    We present a convex-concave reformulation of the reversible Markov chain estimation problem and outline an efficient numerical scheme for the solution of the resulting problem based on a primal-dual interior point method for monotone variational inequalities. Extensions to situations in which information about the stationary vector is available can also be solved via the convex-concave reformulation. The method can be generalized and applied to the discrete transition matrix reweighting analysis method to perform inference from independent chains with specified couplings between the stationary probabilities. The proposed approach offers a significant speed-up compared to a fixed-point iteration for a number of relevant applications. Comment: 17 pages, 2 figures
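For orientation, the fixed-point iteration that the abstract names as its baseline can be sketched as follows. This is not the convex-concave interior point method of the paper. It is a minimal sketch of the commonly used self-consistent update for the reversible maximum-likelihood estimate, assuming a count matrix in which every state has at least one outgoing count. The function name, tolerance, and example counts are illustrative assumptions.

```python
def reversible_mle_fixed_point(counts, n_iter=10000, tol=1e-12):
    """Reversible transition-matrix estimate via a fixed-point iteration.

    Assumption: `counts` is a square list of lists of non-negative
    transition counts, and every row has a positive sum.
    """
    n = len(counts)
    c_row = [sum(row) for row in counts]
    # Symmetric starting point x_ij = c_ij + c_ji.
    x = [[float(counts[i][j] + counts[j][i]) for j in range(n)] for i in range(n)]
    for _ in range(n_iter):
        x_row = [sum(row) for row in x]
        x_new = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                s = counts[i][j] + counts[j][i]
                if s > 0:
                    # x_ij <- (c_ij + c_ji) / (c_i / x_i + c_j / x_j)
                    x_new[i][j] = s / (c_row[i] / x_row[i] + c_row[j] / x_row[j])
        delta = max(abs(x_new[i][j] - x[i][j]) for i in range(n) for j in range(n))
        x = x_new
        if delta < tol:
            break
    x_row = [sum(row) for row in x]
    total = sum(x_row)
    T = [[x[i][j] / x_row[i] for j in range(n)] for i in range(n)]
    pi = [r / total for r in x_row]
    return T, pi


# Hypothetical 3-state count matrix, for illustration only.
example_counts = [[5, 2, 0], [1, 6, 3], [0, 2, 8]]
T, pi = reversible_mle_fixed_point(example_counts)
```

Because the update keeps `x` symmetric, the returned `T` and `pi` satisfy detailed balance, `pi[i] * T[i][j] == pi[j] * T[j][i]`, up to rounding.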

    Kinetic distance and kinetic maps from molecular dynamics simulation

    Characterizing macromolecular kinetics from molecular dynamics (MD) simulations requires a distance metric that can distinguish slowly-interconverting states. Here we build upon diffusion map theory and define a kinetic distance for irreducible Markov processes that quantifies how slowly molecular conformations interconvert. The kinetic distance can be computed given a model that approximates the eigenvalues and eigenvectors (reaction coordinates) of the MD Markov operator. Here we employ the time-lagged independent component analysis (TICA). The TICA components can be scaled to provide a kinetic map in which the Euclidean distance corresponds to the kinetic distance. As a result, the question of how many TICA dimensions should be kept in a dimensionality reduction approach becomes obsolete, and one fewer parameter needs to be specified in the kinetic model construction. We demonstrate the approach using TICA and Markov state model (MSM) analyses for illustrative models, protein conformation dynamics in bovine pancreatic trypsin inhibitor, and protein-inhibitor association in trypsin and benzamidine.
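The scaling described above can be illustrated with a short sketch. It assumes that each component is multiplied by its eigenvalue at the chosen lag time, so that the squared distance is the eigenvalue-weighted sum of squared component differences. That reading of "scaled" is an assumption, since the abstract does not state the scaling factor, and the function names and numbers below are hypothetical.

```python
import math


def kinetic_map(eigenvalues, frames):
    """Scale each component of each frame by its eigenvalue (assumed scaling)."""
    return [[lam * c for lam, c in zip(eigenvalues, frame)] for frame in frames]


def kinetic_distance(eigenvalues, a, b):
    """Distance between two frames a and b given in unscaled components."""
    return math.sqrt(sum((lam * (ai - bi)) ** 2
                         for lam, ai, bi in zip(eigenvalues, a, b)))


# Hypothetical eigenvalues and two frames in a 2-component space.
eigs = [0.9, 0.5]
frame_a, frame_b = [1.0, 2.0], [0.0, 0.0]

mapped_a, mapped_b = kinetic_map(eigs, [frame_a, frame_b])
euclidean_in_map = math.dist(mapped_a, mapped_b)
d_kin = kinetic_distance(eigs, frame_a, frame_b)
```

Components with small eigenvalues contribute little to the distance, which is why no hard cutoff on the number of retained dimensions is needed in this picture.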

    PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov Models

    Markov (state) models (MSMs) and related models of molecular kinetics have recently received a surge of interest as they can systematically reconcile simulation data from either a few long or many short simulations and allow us to analyze the essential metastable structures, thermodynamics, and kinetics of the molecular system under investigation. However, the estimation, validation, and analysis of such models are far from trivial and involve sophisticated and often numerically sensitive methods. In this work we present the open-source Python package PyEMMA (http://pyemma.org) that provides accurate and efficient algorithms for kinetic model construction. PyEMMA can read all common molecular dynamics data formats, helps in the selection of input features, provides easy access to dimension reduction algorithms such as principal component analysis (PCA) and time-lagged independent component analysis (TICA) and clustering algorithms such as k-means, and contains estimators for MSMs, hidden Markov models, and several other models. Systematic model validation and error calculation methods are provided. PyEMMA offers a wealth of analysis functions such that the user can conveniently compute molecular observables of interest. We have derived a systematic and accurate way to coarse-grain MSMs to few states and to illustrate the structures of the metastable states of the system. Plotting functions to produce a manuscript-ready presentation of the results are available. In this work, we demonstrate the features of the software and show new methodological concepts and results produced by PyEMMA.
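To make the MSM estimation step concrete, the sketch below counts transitions in a discretized trajectory at a fixed lag time and row-normalizes the counts. It deliberately does not use PyEMMA's API. It is a plain-Python illustration of the simplest (non-reversible) estimator, and the function names and the toy trajectory are assumptions made for this example.

```python
def count_matrix(dtraj, n_states, lag=1):
    """Count transitions i -> j separated by `lag` steps (sliding window)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for t in range(len(dtraj) - lag):
        counts[dtraj[t]][dtraj[t + lag]] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize counts; assumes every row has at least one count."""
    return [[c / sum(row) for c in row] for row in counts]


# Toy discrete trajectory over two states, e.g. cluster indices per frame.
dtraj = [0, 0, 1, 1, 0]
C = count_matrix(dtraj, n_states=2, lag=1)
P = transition_matrix(C)
```

In a full workflow the discrete trajectory would come from featurization, dimension reduction, and clustering, and the estimate would then be validated at several lag times.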

    Machine Learning for Molecular Dynamics on Long Timescales

    Molecular dynamics (MD) simulation is widely used to analyze the properties of molecules and materials. Most practical applications, such as comparison with experimental measurements, designing drug molecules, or optimizing materials, rely on statistical quantities, which may be prohibitively expensive to compute from direct long-time MD simulations. Classical machine learning (ML) techniques have already had a profound impact on the field, especially for learning low-dimensional models of the long-time dynamics and for devising more efficient sampling schemes for computing long-time statistics. Novel ML methods have the potential to revolutionize long timescale MD and to obtain interpretable models. ML concepts such as statistical estimator theory, end-to-end learning, representation learning, and active learning are highly interesting for the MD researcher and will help to develop new solutions to hard MD problems. With the aim of better connecting the MD and ML research areas and spawning new research on this interface, we define the learning problems in long timescale MD, present successful approaches, and outline some of the unsolved ML problems in this application field.