The RNA Newton Polytope and Learnability of Energy Parameters
Despite nearly two score years of research on RNA secondary structure and RNA-RNA
interaction prediction, the accuracy of state-of-the-art algorithms is
still far from satisfactory. Researchers have proposed increasingly complex
energy models and improved parameter estimation methods in anticipation of
endowing their methods with enough power to solve the problem. The result has
been disappointing: only modest improvements that fall short of expectations.
Even recent massively featured machine learning approaches were not able to
break the barrier. In this paper, we introduce the notion of learnability of
the parameters of an energy model as a measure of its inherent capability. We
say that the parameters of an energy model are learnable iff there exists at
least one set of such parameters that renders every known RNA structure to date
the minimum free energy structure. We derive a necessary condition for the
learnability and give a dynamic programming algorithm to assess it. Our
algorithm computes the convex hull of the feature vectors of all feasible
structures in the ensemble of a given input sequence. Interestingly, that
convex hull coincides with the Newton polytope of the partition function as a
polynomial in energy parameters. We demonstrate the application of our theory
to a simple energy model consisting of a weighted count of A-U and C-G base
pairs. Our results show that this simple energy model satisfies the necessary
condition for fewer than one third of the input unpseudoknotted
sequence-structure pairs chosen from the RNA STRAND v2.0 database. For another
one third, the necessary condition is barely violated, which suggests that
augmenting this simple energy model with more features such as the Turner loops
may solve the problem. The necessary condition is severely violated for 8%,
which provides a small set of hard cases that require further investigation.
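The convex hull of feature vectors described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it enumerates the set of feature vectors by a simple interval recursion and then takes a 2D monotone-chain hull. It assumes two features (A-U pair count, C-G pair count) and a minimum hairpin loop of three unpaired bases (`MIN_LOOP`); all function names are hypothetical.

```python
from functools import lru_cache

# Feature vector contributed by one base pair: (A-U count, C-G count).
PAIR_FEATURE = {("A", "U"): (1, 0), ("U", "A"): (1, 0),
                ("C", "G"): (0, 1), ("G", "C"): (0, 1)}
MIN_LOOP = 3  # assumption: at least 3 unpaired bases inside any pair


def feature_vectors(seq):
    """Set of (AU, CG) counts over all unpseudoknotted structures of seq."""
    @lru_cache(maxsize=None)
    def F(i, j):  # structures on the closed interval seq[i..j]
        if j - i < MIN_LOOP + 1:
            return frozenset({(0, 0)})  # too short for any pair
        out = set(F(i, j - 1))  # case: position j is unpaired
        for k in range(i, j - MIN_LOOP):  # case: j pairs with k
            f = PAIR_FEATURE.get((seq[k], seq[j]))
            if f is None:
                continue
            for a in F(i, k - 1):          # structures left of k
                for b in F(k + 1, j - 1):  # structures enclosed by (k, j)
                    out.add((a[0] + b[0] + f[0], a[1] + b[1] + f[1]))
        return frozenset(out)

    return F(0, len(seq) - 1)


def convex_hull(points):
    """Vertices of the 2D convex hull (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


vectors = feature_vectors("GGGAAACCC")
hull = convex_hull(vectors)
```

A linear function over a polytope attains its minimum on the boundary, so a structure whose feature vector lies strictly inside this hull cannot be made the minimum free energy structure by any nonzero choice of the two weights.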
Thermodynamic Analysis of Interacting Nucleic Acid Strands
Motivated by the analysis of natural and engineered DNA and RNA systems, we present the first algorithm for calculating the partition function of an unpseudoknotted complex of multiple interacting nucleic acid strands. This dynamic program is based on a rigorous extension of secondary structure models to the multistranded case, addressing representation and distinguishability issues that do not arise for single-stranded structures. We then derive the form of the partition function for a fixed volume containing a dilute solution of nucleic acid complexes. This expression can be evaluated explicitly for small numbers of strands, allowing the calculation of the equilibrium population distribution for each species of complex. Alternatively, for large systems (e.g., a test tube), we show that the unique complex concentrations corresponding to thermodynamic equilibrium can be obtained by solving a convex programming problem. Partition function and concentration information can then be used to calculate equilibrium base-pairing observables. The underlying physics and mathematical formulation of these problems lead to an interesting blend of approaches, including ideas from graph theory, group theory, dynamic programming, combinatorics, convex optimization, and Lagrange duality.
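The convex program mentioned in this abstract can be sketched for the simplest case of a single strand species. The sketch assumes the objective takes the usual dilute-solution form, minimizing sum_j x_j (log x_j - log Q_j - 1) subject to the conservation constraint sum_j a_j x_j = x0, where a_j is the number of strand copies in complex j. Under that assumption the Lagrange dual has one variable, and the optimum satisfies x_j = Q_j exp(a_j * lam). The function name, the unit-free inputs, and the bisection solver are illustrative choices, not taken from the paper.

```python
import math


def equilibrium_concentrations(Q, a, x0, iters=200):
    """Equilibrium complex concentrations for one strand species.

    Q[j]: partition function of complex j (assumed given, unit-free here)
    a[j]: number of strand copies in complex j (positive integers)
    x0:   total strand concentration (positive)
    """
    def residual(lam):
        # Conservation residual; strictly increasing in lam.
        return sum(aj * qj * math.exp(aj * lam) for aj, qj in zip(a, Q)) - x0

    lo, hi = -1.0, 1.0
    while residual(lo) > 0:  # widen the bracket downward
        lo *= 2
    while residual(hi) < 0:  # widen the bracket upward
        hi *= 2
    for _ in range(iters):   # bisection on the monotone residual
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            hi = mid
        else:
            lo = mid
    lam = 0.5 * (lo + hi)
    return [qj * math.exp(aj * lam) for aj, qj in zip(a, Q)]


# Monomer and dimer of one strand: Q = [1, 2], copies = [1, 2].
monomer, dimer = equilibrium_concentrations([1.0, 2.0], [1, 2], x0=1.0)
```

With several strand species the dual has one multiplier per species, and a general-purpose convex solver would replace the bisection.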
A new procedure to analyze RNA non-branching structures
RNA structure prediction and structural motif analysis are challenging tasks in the investigation of RNA function. We propose a novel procedure to detect structural motifs shared between two RNAs (a reference and a target). In particular, we developed two core modules: (i) nbRSSP_extractor, to assign a unique structure to the reference RNA encoded by a set of non-branching structures; (ii) SSD_finder, to detect structural motifs that the target RNA shares with the reference, by means of a new score function that rewards the relative distance of the target non-branching structures compared to the reference ones. We integrated these algorithms with existing software into a coherent pipeline that performs two main tasks: prediction of RNA structures (integration of RNALfold and nbRSSP_extractor) and search for chains of matches (integration of Structator and SSD_finder).
A Graph Grammar for Modelling RNA Folding
We propose a new approach for modelling the process of RNA folding as a graph
transformation guided by the global value of free energy. Since the folding
process evolves towards a configuration in which the free energy is minimal,
the global behaviour resembles that of a self-adaptive system. Each RNA
configuration is a graph and the evolution of configurations is constrained by
precise rules that can be described by a graph grammar.
Comment: In Proceedings GaM 2016, arXiv:1612.0105
Paradigms for computational nucleic acid design
The design of DNA and RNA sequences is critical for many endeavors, from DNA nanotechnology, to PCR‐based applications, to DNA hybridization arrays. Results in the literature rely on a wide variety of design criteria adapted to the particular requirements of each application. Using an extensively studied thermodynamic model, we perform a detailed study of several criteria for designing sequences intended to adopt a target secondary structure. We conclude that superior design methods should explicitly implement both a positive design paradigm (optimize affinity for the target structure) and a negative design paradigm (optimize specificity for the target structure). The commonly used approaches of sequence symmetry minimization and minimum free‐energy satisfaction primarily implement negative design and can be strengthened by introducing a positive design component. Surprisingly, our findings hold for a wide range of secondary structures and are robust to modest perturbation of the thermodynamic parameters used for evaluating sequence quality, suggesting the feasibility and ongoing utility of a unified approach to nucleic acid design as parameter sets are refined further. Finally, we observe that designing for thermodynamic stability does not determine folding kinetics, emphasizing the opportunity for extending design criteria to target kinetic features of the energy landscape.
Prediction of RNA pseudoknots by Monte Carlo simulations
In this paper we consider the problem of RNA folding with pseudoknots. We use
a graphical representation in which the secondary structures are described by
planar diagrams. Pseudoknots are identified as non-planar diagrams. We analyze
the non-planar topologies of RNA structures and propose a classification of RNA
pseudoknots according to the minimal genus of the surface on which the RNA
structure can be embedded. This classification provides a simple and natural
way to tackle the problem of RNA folding prediction in the presence of pseudoknots.
Based on that approach, we describe a Monte Carlo algorithm for the prediction
of pseudoknots in an RNA molecule.
Comment: 22 pages, 14 figures
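The genus classification described in this abstract can be illustrated with a short Python sketch. It is a standard combinatorial computation, not the paper's Monte Carlo algorithm: the backbone is closed into a circle and treated as a single fat vertex, each base pair is an edge, and the faces are the cycles of the map "go to the partner, then step to the next paired position". Euler's formula 1 - P + F = 2 - 2g then gives the genus g for P pairs and F faces. The function name is hypothetical, and the input is assumed to be a list of (i, j) pairs in which each position occurs at most once.

```python
def genus(pairs):
    """Minimal genus of the surface on which the pairing diagram embeds."""
    endpoints = sorted(p for pair in pairs for p in pair)
    n = len(endpoints)
    if n == 0:
        return 0  # no base pairs: trivially planar
    # Unpaired bases do not affect the genus, so relabel paired positions 0..n-1.
    rank = {pos: r for r, pos in enumerate(endpoints)}
    partner = {}
    for i, j in pairs:
        partner[rank[i]] = rank[j]
        partner[rank[j]] = rank[i]
    # Count faces: cycles of c -> (partner[c] + 1) mod n.
    seen = set()
    faces = 0
    for start in range(n):
        if start in seen:
            continue
        faces += 1
        cur = start
        while cur not in seen:
            seen.add(cur)
            cur = (partner[cur] + 1) % n
    return (1 + len(pairs) - faces) // 2


nested = genus([(0, 8), (1, 7)])           # two nested pairs
h_type = genus([(0, 5), (3, 9)])           # two crossing pairs (H-type pseudoknot)
kissing = genus([(0, 3), (2, 6), (5, 8)])  # kissing-hairpin arrangement
```

Nested structures give genus 0, while the crossing examples give genus 1.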
Exact Learning of RNA Energy Parameters From Structure
We consider the problem of exact learning of parameters of a linear RNA
energy model from secondary structure data. A necessary and sufficient
condition for learnability of parameters is derived, which is based on
computing the convex hull of the union of translated Newton polytopes of input
sequences. The set of learned energy parameters is characterized as the convex
cone generated by the normal vectors to those facets of the resulting polytope
that are incident to the origin. In practice, the sufficient condition may not
be satisfied by the entire training data set; hence, computing a maximal subset
of training data for which the sufficient condition is satisfied is often
desired. We show that this problem is NP-hard in general for a feature space of
arbitrary dimension. Using a randomized greedy algorithm, we select a
subset of the RNA STRAND v2.0 database that satisfies the sufficient condition
for a model that counts A-U, C-G, and G-U base pairs separately. The set of learned energy
parameters includes experimentally measured energies of A-U, C-G, and G-U
pairs; hence, our parameter set is in agreement with the Turner parameters.
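For a linear energy model, checking whether a given parameter vector is consistent with a training set is direct. The Python sketch below assumes the energy of a structure is the dot product of the parameter vector with its feature vector, and that each training item supplies the known structure's feature vector together with the feature vectors of all competing structures. The function names and the toy data are hypothetical, and ties with the known structure are allowed.

```python
def energy(params, features):
    """Linear energy: dot product of parameters and feature counts."""
    return sum(w * f for w, f in zip(params, features))


def is_consistent(params, training_set):
    """True if every known structure attains the minimum energy.

    training_set: iterable of (known_features, all_features) pairs, where
    all_features holds the feature vectors of every feasible structure.
    """
    for known, candidates in training_set:
        e_known = energy(params, known)
        if any(energy(params, c) < e_known for c in candidates):
            return False  # some competing structure has lower energy
    return True


# Toy item: features are (A-U count, C-G count); the known structure has 3 C-G pairs.
toy = [((0, 3), [(0, 0), (0, 1), (0, 2), (0, 3)])]
stabilizing = is_consistent((-2.0, -3.0), toy)   # pairs lower the energy
destabilizing = is_consistent((1.0, 1.0), toy)   # pairs raise the energy
```

Finding such a parameter vector, rather than checking one, is the problem the abstract characterizes through Newton polytopes and their normal cones.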
Model-guided design of ligand-regulated RNAi for programmable control of gene expression
Progress in constructing biological networks will rely on the development of more advanced components that can be predictably modified to yield optimal system performance. We have engineered an RNA-based platform, which we call an shRNA switch, that provides for integrated ligand control of RNA interference (RNAi) by modular coupling of an aptamer, competing strand, and small hairpin (sh) RNA stem into a single component that links ligand concentration and target gene expression levels. A combined experimental and mathematical modelling approach identified multiple tuning strategies and moves towards a predictable framework for the forward design of shRNA switches. The utility of our platform is highlighted by the demonstration of fine-tuning, multi-input control, and model-guided design of shRNA switches with an optimized dynamic range. Thus, shRNA switches can serve as an advanced component for the construction of complex biological systems and offer a controlled means of activating RNAi in disease therapeutics.