20 research outputs found

    Techniques for modeling and analyzing RNA and protein folding energy landscapes

    RNA and protein molecules undergo a dynamic folding process that is important to their function. Computational methods are critical for studying this folding process because it is difficult to observe experimentally. In this work, we introduce new computational techniques to study RNA and protein energy landscapes, including a method to approximate an RNA energy landscape with a coarse graph (map) and new tools for analyzing graph-based approximations of RNA and protein energy landscapes. These analysis techniques can be used to study RNA and protein folding kinetics such as population kinetics, folding rates, and the folding of particular subsequences. In particular, a map-based Master Equation (MME) method can be used to analyze the population kinetics of the maps, while another map analysis tool, map-based Monte Carlo (MMC) simulation, can extract stochastic folding pathways from the map. To validate the results, I compare our methods with other computational methods and with experimental studies of RNA and protein. I first compare our MMC and MME methods for RNA with other computational methods working on the complete energy landscape and show that the approximate map captures the major features of a much larger (e.g., by orders of magnitude) complete energy landscape. Moreover, I show that the methods scale well to large molecules, e.g., RNA with 200+ nucleotides. Then, I correlate the computational results with experimental findings. I present comparisons with two experimental cases to show how I can predict kinetics-based functional rates of ColE1 RNAII and MS2 phage RNA and their mutants using our MME and MMC tools, respectively. I also show that the MME and MMC tools can be applied to map-based approximations of protein energy landscapes and present kinetics analysis results for several proteins.
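To make the idea of master-equation population kinetics on a map concrete, the following is a minimal sketch, not the thesis implementation. It assumes a hypothetical three-state map (unfolded, intermediate, folded), Metropolis-style transition rates, and a simple explicit Euler integrator; the energies, `kT`, and function names are all illustrative assumptions.

```python
import math

def transition_rates(energies, edges, kT=0.6):
    """Assumed Metropolis-style rate for each direction of every map edge."""
    rates = {}
    for i, j in edges:
        for a, b in ((i, j), (j, i)):
            rates[(a, b)] = min(1.0, math.exp(-(energies[b] - energies[a]) / kT))
    return rates

def solve_master_equation(energies, edges, p0, t_end=50.0, dt=0.01, kT=0.6):
    """Integrate dp_i/dt = sum_j (k_ji * p_j - k_ij * p_i) with explicit Euler steps."""
    rates = transition_rates(energies, edges, kT)
    p = list(p0)
    for _ in range(int(round(t_end / dt))):
        dp = [0.0] * len(p)
        for (a, b), k in rates.items():
            flow = k * p[a] * dt   # population moving from state a to state b
            dp[a] -= flow
            dp[b] += flow
        p = [x + d for x, d in zip(p, dp)]
    return p

# Hypothetical map: state 0 = unfolded, 1 = intermediate, 2 = folded.
energies = [0.0, -1.0, -3.0]
edges = [(0, 1), (1, 2)]
final = solve_master_equation(energies, edges, p0=[1.0, 0.0, 0.0])
print(final)
```

Because the assumed rates satisfy detailed balance, the populations relax toward the Boltzmann distribution over the map states, so most of the population ends up in the lowest-energy state.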

    Computing the Partition Function for Kinetically Trapped RNA Secondary Structures

    An RNA secondary structure is locally optimal if there is no lower energy structure that can be obtained by the addition or removal of a single base pair, where energy is defined according to the widely accepted Turner nearest neighbor model. Locally optimal structures form kinetic traps, since any evolution away from a locally optimal structure must involve energetically unfavorable folding steps. Here, we present a novel, efficient algorithm to compute the partition function over all locally optimal secondary structures of a given RNA sequence. Our software, RNAlocopt, runs in time and space. Additionally, RNAlocopt samples a user-specified number of structures from the Boltzmann subensemble of all locally optimal structures. We apply RNAlocopt to show that (1) the number of locally optimal structures is far smaller than the total number of structures; indeed, the number of locally optimal structures is approximately equal to the square root of the number of all structures, (2) the structural diversity of this subensemble may be either similar to or quite different from the structural diversity of the entire Boltzmann ensemble, a situation that depends on the type of input RNA, and (3) the (modified) maximum expected accuracy structure, computed by taking into account base pairing frequencies of locally optimal structures, is a more accurate prediction of the native structure than other current thermodynamics-based methods. The software RNAlocopt constitutes a technical breakthrough in our study of the folding landscape for RNA secondary structures. For the first time, locally optimal structures (kinetic traps in the Turner energy model) can be rapidly generated for long RNA sequences, previously impossible with methods that involved exhaustive enumeration. Use of locally optimal structures leads to state-of-the-art secondary structure prediction, as benchmarked against methods involving the computation of minimum free energy and of maximum expected accuracy.
Web server and source code available at http://bioinformatics.bc.edu/clotelab/RNAlocopt/
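The definition of local optimality above can be illustrated with a brute-force check. This is only a sketch and not the RNAlocopt algorithm: it swaps the Turner nearest neighbor model for a toy energy of -1 per base pair (an assumption made purely for brevity), and the names `can_add`, `energy`, and `is_locally_optimal` are hypothetical.

```python
# Toy assumptions: canonical and wobble pairs, hairpin loops of at least 3 bases.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3

def can_add(seq, structure, i, j):
    """True if base pair (i, j), with i < j, can be added without breaking validity."""
    if j - i <= MIN_LOOP or (seq[i], seq[j]) not in PAIRS:
        return False
    for k, l in structure:
        if len({i, j, k, l}) < 4:            # a base would be paired twice
            return False
        if k < i < l < j or i < k < j < l:   # the new pair would cross (k, l)
            return False
    return True

def energy(structure):
    """Toy energy (NOT the Turner model): each base pair contributes -1."""
    return -len(structure)

def is_locally_optimal(seq, structure):
    """No single base-pair removal or addition yields a lower-energy structure."""
    current = energy(structure)
    for pair in structure:
        if energy(structure - {pair}) < current:
            return False
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            if can_add(seq, structure, i, j) and energy(structure | {(i, j)}) < current:
                return False
    return True

seq = "GGGAAAUCCC"
hairpin = frozenset({(0, 9), (1, 8), (2, 7)})
print(is_locally_optimal(seq, hairpin))      # full hairpin stem
print(is_locally_optimal(seq, frozenset()))  # open chain
```

Under this toy energy a structure is locally optimal exactly when no further base pair can be added; with a realistic energy model, removals can also lower the energy, which is why the check tests both kinds of move.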

    Techniques for Modeling and Analyzing RNA and Protein Folding Energy Landscapes

    Major Subject: Computer Science

    Transat—A Method for Detecting the Conserved Helices of Functional RNA Structures, Including Transient, Pseudo-Knotted and Alternative Structures

    The prediction of functional RNA structures has attracted increased interest, as it allows us to study the potential functional roles of many genes. RNA structure prediction methods, however, assume that there is a unique functional RNA structure and also do not predict functional features required for in vivo folding. In order to understand how functional RNA structures form in vivo, we require sophisticated experiments or reliable prediction methods. So far, there exist only a few experimentally validated transient RNA structures. On the computational side, there exist several computer programs which aim to predict the co-transcriptional folding pathway in vivo, but these make a range of simplifying assumptions and do not capture all features known to influence RNA folding in vivo. We want to investigate whether evolutionarily related RNA genes fold in a similar way in vivo. To this end, we have developed a new computational method, Transat, which detects conserved helices of high statistical significance. We introduce the method, present a comprehensive performance evaluation and show that Transat is able to predict the structural features of known reference structures, including pseudo-knotted ones, as well as those of known alternative structural configurations. Transat can also identify unstructured sub-sequences bound by other molecules and provides evidence for new helices which may define folding pathways, supporting the notion that homologous RNA sequences not only assume a similar reference RNA structure, but also fold similarly. Finally, we show that the structural features predicted by Transat differ from those assuming thermodynamic equilibrium. Unlike the existing methods for predicting folding pathways, our method works in a comparative way. This has the disadvantage of not being able to predict features as a function of time, but has the considerable advantage of highlighting conserved features and of not requiring a detailed knowledge of the cellular environment.

    Multi-directional Rapidly Exploring Random Graph (mRRG) for Motion Planning

    The motion planning problem in robotics is to find a valid sequence of motions taking some movable object from a start configuration to a goal configuration in an environment. Sampling-based path planners are very popular for high-dimensional motion planning in complex environments. These planners build a graph (roadmap) by generating robot configurations (vertices), and connecting nearby pairs of configurations according to their transition feasibility. Tree-based sampling-based planners (e.g., Rapidly-Exploring Random Tree, or RRT) start growing a tree outward from an initial configuration of the robot. In this work, we propose a multi-directional Rapidly-Exploring Random Graph (mRRG) for robotic motion planning, a variant of the Rapidly-Exploring Random Graph (RRG). Instead of expanding a vertex in the tree in a single random direction during each iteration, mRRG expands in m random directions. Our results show that growing in multiple directions in this way produces roadmaps with more topologically distinct paths than previous methods. In an environment with dynamic obstacles, moving or new obstacles may invalidate a path from the start to the goal. Hence, roadmaps containing alternative pathways can be beneficial as they may avoid recalculation of new valid paths. One of the important phases in sampling-based methods involves finding candidate nearest neighbors to attempt to connect to a node. Generally, the entire graph is considered to search for the nearest neighbors. In this thesis, we propose a heuristic method for finding nearest neighbors based on the hop limit, i.e., the maximum number of edges allowed in the path from a vertex to its neighbor. The candidate nearest neighbors are found by considering only those vertices within the hop limit. We experimentally show that our hop limit neighbor finder significantly reduces neighbor searching time compared with the standard brute force approach when constructing roadmaps.
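The hop-limit idea can be sketched as a breadth-first search that stops expanding once the hop limit is reached, followed by a distance sort over only the vertices it visited. This is an illustrative sketch rather than the thesis code; it assumes the query vertex is already in the roadmap, that configurations are points compared by Euclidean distance, and the name `hop_limited_neighbors` is hypothetical.

```python
import math
from collections import deque

def hop_limited_neighbors(graph, positions, source, hop_limit, k):
    """Return up to k vertices closest to `source` among those within `hop_limit` edges."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if hops[v] == hop_limit:
            continue                      # do not expand beyond the hop limit
        for w in graph[v]:
            if w not in hops:
                hops[w] = hops[v] + 1
                queue.append(w)
    candidates = [v for v in hops if v != source]
    candidates.sort(key=lambda v: math.dist(positions[v], positions[source]))
    return candidates[:k]

# Hypothetical roadmap: a path 0-1-2-3-4 whose last vertex lies close to vertex 0.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
positions = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.5), 4: (0.0, 0.5)}
print(hop_limited_neighbors(graph, positions, source=0, hop_limit=2, k=2))
print(hop_limited_neighbors(graph, positions, source=0, hop_limit=4, k=2))
```

With a small hop limit, vertex 4 is never examined even though it is geometrically closest to vertex 0. That is the trade-off of the heuristic: fewer distance computations in exchange for possibly missing a nearby vertex that is far away in the graph.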

    Uniform Sampling Framework for Sampling Based Motion Planning and Its Applications to Robotics and Protein Ligand Binding

    Sampling-based motion planning aims to find a valid path from a start to a goal by sampling in the planning space. Planning on surfaces is an important problem in many research areas, including traditional robotics and computational biology. It is also a difficult research question to plan on surfaces, as the surface is only a small subspace of the entire planning space. For example, robots are currently widely used for product assembly. Contact between the robot manipulator and the product is required to assemble each piece precisely. The configurations in which the robot fingers are in contact with the object form a surface in the planning space. However, these configurations are only a small proportion of all possible robot configurations. Several sampling-based motion planners aim to bias sampling to specific surfaces, such as Cobst surfaces, as needed for tasks requiring contact, or along the medial axis, which maximizes clearance. While some of these methods work well in practice, none of them are able to provide any information regarding the distribution of the samples they generate. It would be interesting and useful to know, for example, that a particular surface has been sampled uniformly so that one could argue regarding the probability of finding a path on that surface. Unfortunately, despite great interest for nearly two decades, it has remained an open problem to develop a method for sampling on such surfaces that can provide any information regarding the distribution of the resulting samples. Our research focuses on solving this open problem and introduces a framework that is guaranteed to uniformly sample any surface in Cspace. Instead of explicitly constructing the target surfaces, which is generally intractable, our uniform sampling framework only requires detecting intersections between a line segment and the target surface, which can often be done efficiently. Intuitively, since we uniformly distribute the line segments, the intersections between the segments and the surfaces will also be uniformly distributed. We present two particular instances of the framework: Uniform Obstacle-based PRM (UOBPRM), which uniformly samples Cobst surfaces, and Uniform Medial-Axis PRM (UMAPRM), which uniformly samples the Cspace medial axis. We provide a theoretical analysis for this framework that establishes uniformity and probabilistic completeness and also the probability of sampling in narrow passages. We show applications of this uniform sampling framework in robotics (both UOBPRM and UMAPRM) and in biology (UOBPRM). We are able to solve some difficult motion planning problems more efficiently than other sampling methods, including PRM, OBPRM, Gaussian PRM, Bridge Test PRM, and MAPRM. Moreover, we show that UOBPRM and UMAPRM have computational overhead similar to that of other approaches. UOBPRM is used to study the ligand binding affinity ranking problem in computational biology. Our experimental results show that UOBPRM is a potential technique for ranking ligand binding affinity, which could be further applied as a cost-saving tool for pharmaceutical companies to narrow the search for drug candidates.
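The segment-intersection idea can be sketched in a toy setting. The code below is not UOBPRM itself; it assumes a 2D point robot in the unit square, a single circular obstacle, fixed-length segments with uniformly random midpoint and direction, and a validity test refined by bisection. All names and parameter values are illustrative assumptions.

```python
import math
import random

CENTER, RADIUS = (0.5, 0.5), 0.2   # hypothetical circular obstacle

def in_obstacle(p):
    """Validity test: True when point p lies inside the obstacle."""
    return math.dist(p, CENTER) < RADIUS

def boundary_crossings(a, b, inside, steps=20, tol=1e-6):
    """Points on segment a-b where `inside` flips, each refined by bisection."""
    def point(t):
        return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
    hits = []
    prev_t, prev_in = 0.0, inside(point(0.0))
    for s in range(1, steps + 1):
        t = s / steps
        cur_in = inside(point(t))
        if cur_in != prev_in:
            lo, hi = prev_t, t            # the flip lies between lo and hi
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if inside(point(mid)) == prev_in:
                    lo = mid
                else:
                    hi = mid
            hits.append(point(0.5 * (lo + hi)))
        prev_t, prev_in = t, cur_in
    return hits

def sample_obstacle_surface(n_segments, length=0.3, seed=0):
    """Drop uniformly distributed segments and keep their obstacle-boundary crossings."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_segments):
        # Midpoints are drawn from a box enlarged by length/2 on every side.
        mx = rng.uniform(-length / 2, 1 + length / 2)
        my = rng.uniform(-length / 2, 1 + length / 2)
        theta = rng.uniform(0.0, math.pi)
        dx, dy = 0.5 * length * math.cos(theta), 0.5 * length * math.sin(theta)
        samples.extend(boundary_crossings((mx - dx, my - dy), (mx + dx, my + dy), in_obstacle))
    return samples

samples = sample_obstacle_surface(2000)
print(len(samples), samples[:3])
```

Note that the obstacle boundary is never constructed explicitly: the sampler only calls the validity test along each segment, which mirrors the requirement stated above that intersections between a segment and the target surface be detectable.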

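    Prédiction structurale de biomolécules à l'aide d'une construction d'automates cellulaires simulant la dynamique moléculaire [Structural prediction of biomolecules using a construction of cellular automata simulating molecular dynamics]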

    Thesis digitized by the Division de la gestion de documents et des archives of the Université de Montréal

    Intelligent Motion Planning and Analysis with Probabilistic Roadmap Methods for the Study of Complex and High-Dimensional Motions

    At first glance, robots and proteins have little in common. Robots are commonly thought of as tools that perform tasks such as vacuuming the floor, while proteins play essential roles in many biochemical processes. However, the functionality of both robots and proteins is highly dependent on their motions. In order to study motions in these two divergent domains, the same underlying algorithmic framework can be applied. This method is derived from probabilistic roadmap methods (PRMs) originally developed for robotic motion planning. It builds a graph, or roadmap, where configurations are represented as vertices and transitions between configurations are edges. The contribution of this work is a set of intelligent methods applied to PRMs. These methods facilitate both the modeling and analysis of motions, and have enabled the study of complex and high-dimensional problems in both robotic and molecular domains. In order to efficiently study biologically relevant molecular folding behaviors we have developed new techniques based on Monte Carlo solution, master equation calculation, and non-linear dimensionality reduction to run simulations and analysis on the roadmap. The first method, Map-based master equation calculation (MME), extracts global properties of the folding landscape such as global folding rates. On the other hand, another method, Map-based Monte Carlo solution (MMC), can be used to extract microscopic features of the folding process. Also, the application of dimensionality reduction returns a lower-dimensional representation that still retains the principal features while facilitating both modeling and analysis of motion landscapes. A key contribution of our methods is the flexibility to study larger and more complex structures, e.g., 372 residue Alpha-1 antitrypsin and 200 nucleotide ColE1 RNAII. We also applied intelligent roadmap-based techniques to the area of robotic motion. These methods take advantage of unsupervised learning methods at all stages of the planning process and produce solutions in complex spaces with little cost and less manual intervention compared to other adaptive methods. Our results show that our methods have low overhead and that they outperform two existing adaptive methods in all complex cases studied.
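As an illustration of how a Monte Carlo walk on a roadmap can yield an individual folding pathway, here is a minimal sketch under stated assumptions; it is not the MMC implementation from this work. The three-state roadmap, its energies, the Metropolis acceptance rule, and the name `mmc_pathway` are all hypothetical.

```python
import math
import random

def mmc_pathway(energies, neighbors, start, goal, kT=0.6, max_steps=10000, seed=0):
    """Random walk on the roadmap; returns the sequence of visited states."""
    rng = random.Random(seed)
    state = start
    path = [state]
    for _ in range(max_steps):
        if state == goal:
            break
        candidate = rng.choice(neighbors[state])
        delta = energies[candidate] - energies[state]
        # Assumed Metropolis rule: always accept downhill moves, sometimes uphill ones.
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            state = candidate
            path.append(state)
    return path

# Hypothetical roadmap: state 0 = unfolded, 1 = intermediate, 2 = folded.
energies = {0: 0.0, 1: -1.0, 2: -3.0}
neighbors = {0: [1], 1: [0, 2], 2: [1]}
path = mmc_pathway(energies, neighbors, start=0, goal=2)
print(path)
```

Running the walk with different seeds gives different pathways, which is the kind of microscopic, per-trajectory information that a population-level master equation calculation averages away.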