    A motion planning approach to protein folding

    Protein folding is considered to be one of the grand challenge problems in biology. Protein folding refers to how a protein's amino acid sequence, under certain physiological conditions, folds into a stable close-packed three-dimensional structure known as the native state. There are two major problems in protein folding. One, usually called protein structure prediction, is to predict the structure of the protein's native state given only the amino acid sequence. Another important and strongly related problem, often called protein folding, is to study how the amino acid sequence dynamically transitions from an unstructured state to the native state. In this dissertation, we concentrate on the second problem. There are several approaches that have been applied to the protein folding problem, including molecular dynamics, Monte Carlo methods, statistical mechanical models, and lattice models. However, most of these approaches suffer from either overly-detailed simulations, requiring impractical computation times, or overly-simplified models, resulting in unrealistic solutions. In this work, we present a novel motion planning based framework for studying protein folding. We describe how it can be used to approximately map a protein's energy landscape, and then discuss how to find approximate folding pathways and kinetics on this approximate energy landscape. In particular, our technique can produce potential energy landscapes, free energy landscapes, and many folding pathways all from a single roadmap. The roadmap can be computed in a few hours on a desktop PC using a coarse potential energy function. In addition, our motion planning based approach is the first simulation method that enables the study of protein folding kinetics at a level of detail that is appropriate (i.e., not too detailed or too coarse) for capturing possible 2-state and 3-state folding kinetics that may coexist in one protein. 
    Indeed, the unique ability of our method to produce large sets of unrelated folding pathways may provide crucial insight into some aspects of folding kinetics that are not available to other theoretical techniques.
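The roadmap idea described in this abstract can be illustrated with a minimal sketch. This is not the dissertation's implementation: the conformation names, energies, the `kT` value, and the `edge_weight` cost (the negative log of a Metropolis-like transition probability) are all assumptions chosen for illustration. The sketch builds a small graph over hypothetical conformations and extracts the lowest-cost pathway to the native state with Dijkstra's algorithm.

```python
import heapq
import math

def edge_weight(e_from, e_to, kT=0.6):
    """Assumed edge cost: -log of a Metropolis-like transition probability."""
    delta = e_to - e_from
    prob = 1.0 if delta <= 0 else math.exp(-delta / kT)
    return -math.log(prob) + 1e-9  # small constant keeps every weight positive

def build_roadmap(energies, neighbors):
    """Directed weighted graph: roadmap[u][v] is the cost of moving u -> v."""
    roadmap = {u: {} for u in energies}
    for u, vs in neighbors.items():
        for v in vs:
            roadmap[u][v] = edge_weight(energies[u], energies[v])
            roadmap[v][u] = edge_weight(energies[v], energies[u])
    return roadmap

def lowest_cost_path(roadmap, start, goal):
    """Dijkstra's algorithm; returns (total_cost, path). Assumes goal is reachable."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in roadmap[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[goal], path[::-1]

# Hypothetical conformations with made-up energies (arbitrary units).
energies = {"unfolded": 5.0, "A": 3.0, "B": 6.0, "C": 1.0, "native": 0.0}
neighbors = {"unfolded": ["A", "B"], "A": ["C"], "B": ["native"], "C": ["native"]}

roadmap = build_roadmap(energies, neighbors)
cost, path = lowest_cost_path(roadmap, "unfolded", "native")
print(path)
```

In this toy landscape the route through `A` and `C` is downhill at every step, so it is preferred over the shorter route through the higher-energy node `B`. Because all pathways are read off the same graph, many start conformations can be queried without rebuilding it.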

    Techniques for modeling and analyzing RNA and protein folding energy landscapes

    RNA and protein molecules undergo a dynamic folding process that is important to their function. Computational methods are critical for studying this folding process because it is difficult to observe experimentally. In this work, we introduce new computational techniques to study RNA and protein energy landscapes, including a method to approximate an RNA energy landscape with a coarse graph (map) and new tools for analyzing graph-based approximations of RNA and protein energy landscapes. These analysis techniques can be used to study RNA and protein folding kinetics such as population kinetics, folding rates, and the folding of particular subsequences. In particular, a map-based Master Equation (MME) method can be used to analyze the population kinetics of the maps, while another map analysis tool, map-based Monte Carlo (MMC) simulation, can extract stochastic folding pathways from the map. To validate the results, I compare our methods with other computational methods and with experimental studies of RNA and protein. I first compare our MMC and MME methods for RNA with other computational methods working on the complete energy landscape and show that the approximate map captures the major features of a much larger (e.g., by orders of magnitude) complete energy landscape. Moreover, I show that the methods scale well to large molecules, e.g., RNA with 200+ nucleotides. Then, I correlate the computational results with experimental findings. I present comparisons with two experimental cases to show how I can predict kinetics-based functional rates of ColE1 RNAII and MS2 phage RNA and their mutants using our MME and MMC tools, respectively. I also show that the MME and MMC tools can be applied to map-based approximations of protein energy landscapes and present kinetics analysis results for several proteins.
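The two map analyses named in this abstract can be sketched on a toy three-state map. This is only an illustration under stated assumptions, not the thesis code: the states `U`, `I`, `N` and their rates are hypothetical, the master equation dp_i/dt = sum_j (k_ji p_j - k_ij p_i) is integrated with a simple explicit Euler step rather than solved by eigen-decomposition, and the Monte Carlo walk records only the sequence of states (waiting times are omitted).

```python
import random

# Hypothetical map: rates[i][j] is an assumed transition rate from state i to j.
rates = {
    "U": {"I": 1.0},
    "I": {"U": 0.2, "N": 0.5},
    "N": {"I": 0.01},
}

def master_equation(rates, p0, dt=0.01, steps=2000):
    """Explicit Euler integration of dp_i/dt = sum_j (k_ji p_j - k_ij p_i)."""
    p = dict(p0)
    for _ in range(steps):
        dp = {s: 0.0 for s in p}
        for i, outs in rates.items():
            for j, k in outs.items():
                flux = k * p[i] * dt  # population moved from i to j in one step
                dp[i] -= flux
                dp[j] += flux
        for s in p:
            p[s] += dp[s]
    return p

def monte_carlo_path(rates, start, goal, rng, max_steps=10_000):
    """Random walk on the map; the next state is drawn with probability
    proportional to its outgoing rate."""
    path = [start]
    while path[-1] != goal and len(path) < max_steps:
        outs = rates[path[-1]]
        states = list(outs)
        weights = [outs[s] for s in states]
        path.append(rng.choices(states, weights=weights)[0])
    return path

# Population kinetics starting from a fully unfolded ensemble.
populations = master_equation(rates, {"U": 1.0, "I": 0.0, "N": 0.0})

# One stochastic folding pathway; the seed only makes the run repeatable.
path = monte_carlo_path(rates, "U", "N", random.Random(0))
print(populations, path)
```

The master-equation run gives ensemble-level population curves, while repeated Monte Carlo walks with different seeds give individual pathways on the same map, which is the division of labor the abstract describes for MME and MMC.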

    Using motion planning to map protein folding landscapes and analyze folding kinetics of known native structures


    Major Subject: Computer Science

    Techniques for modeling and analyzing RNA and protein folding energy landscapes

    Major Subject: Computer Science

    The difficulty of folding self-folding origami

    Why is it difficult to refold a previously folded sheet of paper? We show that even crease patterns with only one designed folding motion inevitably contain an exponential number of 'distractor' folding branches accessible from a bifurcation at the flat state. Consequently, refolding a sheet requires finding the ground state in a glassy energy landscape with an exponential number of other attractors of higher energy, much like in models of protein folding (Levinthal's paradox) and other NP-hard satisfiability (SAT) problems. As in these problems, we find that refolding a sheet requires actuation at multiple carefully chosen creases. We show that seeding successful folding in this way can be understood in terms of sub-patterns that fold when cut out ('folding islands'). Besides providing guidelines for the placement of active hinges in origami applications, our results point to fundamental limits on the programmability of energy landscapes in sheets.

    Path Similarity Analysis: a Method for Quantifying Macromolecular Pathways

    Diverse classes of proteins function through large-scale conformational changes; sophisticated enhanced sampling methods have been proposed to generate these macromolecular transition paths. As such paths are curves in a high-dimensional space, they have been difficult to compare quantitatively, a prerequisite for assessing, for instance, the quality of different sampling algorithms. The Path Similarity Analysis (PSA) approach alleviates these difficulties by utilizing the full information in 3N-dimensional trajectories in configuration space. PSA employs the Hausdorff or Fréchet path metrics (adopted from computational geometry), enabling us to quantify path (dis)similarity, while the new concept of a Hausdorff-pair map permits the extraction of atomic-scale determinants responsible for path differences. Combined with clustering techniques, PSA facilitates the comparison of many paths, including collections of transition ensembles. We use the closed-to-open transition of the enzyme adenylate kinase (AdK), a commonly used testbed for the assessment of enhanced sampling algorithms, to examine multiple microsecond equilibrium molecular dynamics (MD) transitions of AdK in its substrate-free form alongside transition ensembles from the MD-based dynamic importance sampling (DIMS-MD) and targeted MD (TMD) methods, and a geometrical targeting algorithm (FRODA). A Hausdorff pairs analysis of these ensembles revealed, for instance, that differences in DIMS-MD and FRODA paths were mediated by a set of conserved salt bridges whose charge-charge interactions are fully modeled in DIMS-MD but not in FRODA.
    We also demonstrate how existing trajectory analysis methods relying on pre-defined collective variables, such as native contacts or geometric quantities, can be used synergistically with PSA, as well as the application of PSA to more complex systems such as membrane transporter proteins.
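The two path metrics named in this abstract can be sketched for small discrete paths. This is a minimal illustration, not the PSA implementation: it assumes each path is a list of points compared with plain Euclidean distance (a stand-in for a structural distance between conformations), and it uses the standard dynamic-programming form of the discrete Fréchet distance.

```python
import math

def hausdorff(P, Q):
    """Symmetric Hausdorff distance between two point sequences."""
    def directed(A, B):
        # Largest distance from a point in A to its nearest point in B.
        return max(min(math.dist(a, b) for b in B) for a in A)
    return max(directed(P, Q), directed(Q, P))

def discrete_frechet(P, Q):
    """Discrete Frechet distance by dynamic programming.

    ca[i][j] is the smallest achievable 'leash length' for walking
    P[0..i] and Q[0..j] monotonically from start to end.
    """
    n, m = len(P), len(Q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[n - 1][m - 1]

# Toy 2D paths: Q is P shifted by one unit, R is P traversed backwards.
P = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
Q = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
R = list(reversed(P))

print(hausdorff(P, Q), discrete_frechet(P, Q))
print(hausdorff(P, R), discrete_frechet(P, R))
```

The reversed path shows why the choice of metric matters: the Hausdorff distance treats the paths as point sets and ignores direction, whereas the Fréchet distance respects the order in which the points are visited.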

    REinforcement learning based Adaptive samPling: REAPing Rewards by Exploring Protein Conformational Landscapes

    One of the key limitations of Molecular Dynamics simulations is the computational intractability of sampling protein conformational landscapes associated with either large system size or long timescales. To overcome this bottleneck, we present the REinforcement learning based Adaptive samPling (REAP) algorithm that aims to efficiently sample conformational space by learning the relative importance of each reaction coordinate as it samples the landscape. To achieve this, the algorithm uses concepts from the field of reinforcement learning, a subset of machine learning, which rewards sampling along important degrees of freedom and disregards others that do not facilitate exploration or exploitation. We demonstrate the effectiveness of REAP by comparing the sampling to long continuous MD simulations and least-counts adaptive sampling on two model landscapes (L-shaped and circular), and realistic systems such as alanine dipeptide and Src kinase. In all four systems, the REAP algorithm consistently demonstrates its ability to explore conformational space faster than the other two methods when comparing the expected values of the landscape discovered for a given amount of time. The key advantage of REAP is on-the-fly estimation of the importance of collective variables, which makes it particularly useful for systems with limited structural information.
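One round of the reward-and-reweight loop described in this abstract might look like the sketch below. It rests on several assumptions that the abstract does not state: the reward is taken to be a weighted sum of standardized deviations of each collective variable from its running mean, the `update_weights` function is a greedy stand-in for whatever constrained optimization the authors use, and the sample points and candidate states are made-up values.

```python
import statistics

def reward(point, weights, means, stdevs):
    """Assumed reward: weighted, standardized distance from the data mean."""
    return sum(w * abs(x - m) / s
               for w, x, m, s in zip(weights, point, means, stdevs))

def update_weights(candidates, weights, means, stdevs, delta=0.1):
    """Greedy stand-in for the weight update: try shifting `delta` of weight
    toward each collective variable, renormalize, and keep the trial that
    gives the largest total reward over the candidate states."""
    best = list(weights)
    best_total = sum(reward(c, best, means, stdevs) for c in candidates)
    for i in range(len(weights)):
        trial = list(weights)
        trial[i] += delta
        norm = sum(trial)
        trial = [w / norm for w in trial]
        total = sum(reward(c, trial, means, stdevs) for c in candidates)
        if total > best_total:
            best, best_total = trial, total
    return best

# Hypothetical samples in two collective variables (CV1, CV2).
samples = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.0), (3.0, 0.1), (4.0, 0.0)]
columns = list(zip(*samples))
means = [statistics.fmean(col) for col in columns]
stdevs = [statistics.pstdev(col) for col in columns]

# Hypothetical representatives of the least-populated clusters.
candidates = [(4.0, 0.0), (0.0, 0.0), (2.0, 0.1)]

weights = [0.5, 0.5]
new_weights = update_weights(candidates, weights, means, stdevs)

# The next simulation would be seeded from the highest-reward candidate.
chosen = max(candidates, key=lambda c: reward(c, new_weights, means, stdevs))
print(new_weights, chosen)
```

In a full adaptive-sampling loop, new trajectories started from `chosen` would be added to `samples`, the means and standard deviations recomputed, and the weights updated again, so that the weights track which collective variables are currently most useful for exploration.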