Reversible Markov chain estimation using convex-concave programming
We present a convex-concave reformulation of the reversible Markov chain
estimation problem and outline an efficient numerical scheme for the solution
of the resulting problem based on a primal-dual interior point method for
monotone variational inequalities. Extensions to situations in which
information about the stationary vector is available can also be solved via the
convex-concave reformulation. The method can be generalized and applied to the
discrete transition matrix reweighting analysis method to perform inference
from independent chains with specified couplings between the stationary
probabilities. The proposed approach offers a significant speed-up compared to
a fixed-point iteration for a number of relevant applications. Comment: 17 pages, 2 figures
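For orientation, the fixed-point iteration that serves as the baseline above can be sketched as follows. The abstract does not spell the update rule out, so this is a minimal sketch under an assumption: it uses the commonly cited self-consistent update x_ij = (c_ij + c_ji) / (c_i / x_i + c_j / x_j) for the reversible maximum-likelihood estimate. The function name `reversible_mle_fixed_point`, its parameters, and the example count matrix are hypothetical, and the convex-concave interior point scheme of the paper is not reproduced here.

```python
import numpy as np

def reversible_mle_fixed_point(C, n_iter=10000, tol=1e-12):
    """Estimate a reversible transition matrix from a count matrix C.

    Assumed update (not taken from the abstract):
        x_ij <- (c_ij + c_ji) / (c_i / x_i + c_j / x_j)
    with row sums c_i = sum_j c_ij and x_i = sum_j x_ij.
    Every row of C must contain at least one count.
    """
    C = np.asarray(C, dtype=float)
    c_sym = C + C.T              # symmetrized counts c_ij + c_ji
    c_row = C.sum(axis=1)        # row counts c_i
    X = c_sym / c_sym.sum()      # symmetric initial guess
    for _ in range(n_iter):
        q = c_row / X.sum(axis=1)                 # c_i / x_i
        X_new = c_sym / (q[:, None] + q[None, :])
        X_new /= X_new.sum()                      # the update is scale invariant
        converged = np.max(np.abs(X_new - X)) < tol
        X = X_new
        if converged:
            break
    pi = X.sum(axis=1)           # stationary vector (X sums to one)
    T = X / pi[:, None]          # row-stochastic transition matrix
    return T, pi

# Hypothetical 3-state count matrix, for illustration only.
C_example = np.array([[90.0, 10.0, 0.0],
                      [8.0, 80.0, 12.0],
                      [1.0, 9.0, 90.0]])
T_example, pi_example = reversible_mle_fixed_point(C_example)
```

Because the iterate X stays symmetric, the returned pair satisfies detailed balance, pi_i T_ij = pi_j T_ji, at every step. The iteration count needed to reach a given tolerance depends on the data.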
Kinetic distance and kinetic maps from molecular dynamics simulation
Characterizing macromolecular kinetics from molecular dynamics (MD)
simulations requires a distance metric that can distinguish
slowly-interconverting states. Here we build upon diffusion map theory and
define a kinetic distance for irreducible Markov processes that quantifies how
slowly molecular conformations interconvert. The kinetic distance can be
computed given a model that approximates the eigenvalues and eigenvectors
(reaction coordinates) of the MD Markov operator. Here we employ the
time-lagged independent component analysis (TICA). The TICA components can be
scaled to provide a kinetic map in which the Euclidean distance corresponds to
the kinetic distance. As a result, the question of how many TICA dimensions
should be kept in a dimensionality reduction approach becomes obsolete, and one
fewer parameter needs to be specified in the kinetic model construction. We
demonstrate the approach using TICA and Markov state model (MSM) analyses for
illustrative models, protein conformation dynamics in bovine pancreatic trypsin
inhibitor and protein-inhibitor association in trypsin and benzamidine.
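The scaling idea can be illustrated on a small discrete Markov chain rather than on TICA output. The sketch below is an assumption-laden illustration, not the authors' implementation: it assumes a reversible transition matrix, takes the kinetic distance to be D^2(x, y) = sum over i >= 2 of lambda_i^2 (psi_i(x) - psi_i(y))^2 with eigenvectors normalized under the stationary weights, and obtains the kinetic map by scaling each eigenvector by its eigenvalue. The function name `kinetic_map` and the 3-state example are hypothetical.

```python
import numpy as np

def kinetic_map(T, pi):
    """Return eigenvalue-scaled eigenvector coordinates of a reversible chain.

    Assumes T is row-stochastic and reversible with stationary vector pi,
    so that diag(sqrt(pi)) T diag(1/sqrt(pi)) is symmetric.
    """
    T = np.asarray(T, dtype=float)
    sqrt_pi = np.sqrt(np.asarray(pi, dtype=float))
    S = sqrt_pi[:, None] * T / sqrt_pi[None, :]
    S = 0.5 * (S + S.T)                    # remove round-off asymmetry
    evals, evecs = np.linalg.eigh(S)
    order = np.argsort(-evals)             # stationary eigenvalue 1 comes first
    evals, evecs = evals[order], evecs[:, order]
    psi = evecs / sqrt_pi[:, None]         # eigenvectors of T, orthonormal under pi
    return psi[:, 1:] * evals[1:]          # drop the constant eigenvector, then scale

# Hypothetical reversible 3-state chain with stationary vector (0.25, 0.5, 0.25).
T_demo = np.array([[0.90, 0.10, 0.00],
                   [0.05, 0.90, 0.05],
                   [0.00, 0.10, 0.90]])
pi_demo = np.array([0.25, 0.50, 0.25])
Y_demo = kinetic_map(T_demo, pi_demo)

# Squared Euclidean distance between states 0 and 2 in the kinetic map.
d2_map = np.sum((Y_demo[0] - Y_demo[2]) ** 2)
```

Under these assumptions the Euclidean distance in the map coincides with the diffusion-type distance sum over z of (T[x, z] - T[y, z])^2 / pi[z], which gives a convenient consistency check. Because all eigenvector columns are kept and merely down-weighted by their eigenvalues, no hard cutoff on the number of dimensions is needed in this toy setting.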
PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov Models
Markov (state) models (MSMs) and related models of molecular kinetics have recently received a surge of interest as they can systematically reconcile simulation data from either a few long or many short simulations and allow us to analyze the essential metastable structures, thermodynamics, and kinetics of the molecular system under investigation. However, the estimation, validation, and analysis of such models are far from trivial and involve sophisticated and often numerically sensitive methods. In this work we present the open-source Python package PyEMMA (http://pyemma.org) that provides accurate and efficient algorithms for kinetic model construction. PyEMMA can read all common molecular dynamics data formats, helps in the selection of input features, provides easy access to dimension reduction algorithms such as principal component analysis (PCA) and time-lagged independent component analysis (TICA) and clustering algorithms such as k-means, and contains estimators for MSMs, hidden Markov models, and several other models. Systematic model validation and error calculation methods are provided. PyEMMA offers a wealth of analysis functions such that the user can conveniently compute molecular observables of interest. We have derived a systematic and accurate way to coarse-grain MSMs to few states and to illustrate the structures of the metastable states of the system. Plotting functions to produce a manuscript-ready presentation of the results are available. In this work, we demonstrate the features of the software and show new methodological concepts and results produced by PyEMMA.
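To make the MSM estimation step concrete, the following is a minimal plain-NumPy sketch of what such an estimator does after clustering: count transitions at a lag time, row-normalize the counts, and convert eigenvalues into implied timescales via t_i = -lag / ln|lambda_i|. It deliberately does not use or imitate PyEMMA's API; the function names `count_matrix` and `implied_timescales` and the toy discrete trajectories are hypothetical, and the simple non-reversible row normalization is an assumption chosen for brevity.

```python
import numpy as np

def count_matrix(dtrajs, n_states, lag):
    """Count transitions i -> j separated by `lag` steps (lag >= 1)."""
    C = np.zeros((n_states, n_states))
    for d in dtrajs:
        d = np.asarray(d, dtype=int)
        # add.at accumulates correctly when the same (i, j) pair repeats
        np.add.at(C, (d[:-lag], d[lag:]), 1.0)
    return C

def implied_timescales(T, lag):
    """Implied timescales of the non-stationary eigenvalues, in trajectory steps.

    Assumes a connected chain, so that only one eigenvalue has modulus 1.
    """
    evals = np.sort(np.abs(np.linalg.eigvals(T)))[::-1]
    return -lag / np.log(evals[1:])

# Toy discrete trajectories over two cluster indices (illustration only).
dtrajs_demo = [[0, 0, 1, 1, 0], [1, 1, 1, 0]]
C_demo = count_matrix(dtrajs_demo, n_states=2, lag=1)
T_msm = C_demo / C_demo.sum(axis=1, keepdims=True)   # row-normalized estimate
its_demo = implied_timescales(T_msm, lag=1)
```

In practice one would repeat the estimate for several lag times and check that the implied timescales level off before trusting the model, which is one of the validation steps a dedicated package automates.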
Machine Learning for Molecular Dynamics on Long Timescales
Molecular dynamics (MD) simulation is widely used to analyze the properties of molecules and materials. Most practical applications, such as comparison with experimental measurements, designing drug molecules, or optimizing materials, rely on statistical quantities, which may be prohibitively expensive to compute from direct long-time MD simulations. Classical machine learning (ML) techniques have already had a profound impact on the field, especially for learning low-dimensional models of the long-time dynamics and for devising more efficient sampling schemes for computing long-time statistics. Novel ML methods have the potential to revolutionize long timescale MD and to obtain interpretable models. ML concepts such as statistical estimator theory, end-to-end learning, representation learning, and active learning are highly interesting for the MD researcher and will help to develop new solutions to hard MD problems. With the aim of better connecting the MD and ML research areas and spawning new research on this interface, we define the learning problems in long timescale MD, present successful approaches, and outline some of the unsolved ML problems in this application field.