39 research outputs found

    RSRE: RNA structural robustness evaluator

    Get PDF
    Biological robustness, defined as the ability to maintain stable functioning in the face of various perturbations, is an important and fundamental topic in current biology, and has become a focus of numerous studies in recent years. Although structural robustness has been explored in several types of RNA molecules, the origins of robustness are still controversial. Computational analysis results are needed to make up for the lack of evidence of robustness in natural biological systems. The RNA structural robustness evaluator (RSRE) web server presented here provides a freely available online tool to quantitatively evaluate the structural robustness of RNA based on the widely accepted definition of neutrality. Several classical structure comparison methods are employed; five randomization methods are implemented to generate control sequences; sub-optimal predicted structures can be optionally utilized to mitigate the uncertainty of secondary structure prediction. With a user-friendly interface, the web application is easy to use. Intuitive illustrations are provided along with the original computational results to facilitate analysis. The RSRE will be helpful in the wide exploration of RNA structural robustness and will catalyze our understanding of RNA evolution. The RSRE web server is freely available at http://biosrv1.bmi.ac.cn/RSRE/ or http://biotech.bmi.ac.cn/RSRE/

    Efficient chaining of seeds in ordered trees

    Get PDF
    We consider here the problem of chaining seeds in ordered trees. Seeds are mappings between two trees Q and T and a chain is a subset of non overlapping seeds that is consistent with respect to postfix order and ancestrality. This problem is a natural extension of a similar problem for sequences, and has applications in computational biology, such as mining a database of RNA secondary structures. For the chaining problem with a set of m constant size seeds, we describe an algorithm with complexity O(m2 log(m)) in time and O(m2) in space

    A Satisfiability-based Approach for Embedding Generalized Tanglegrams on Level Graphs

    Get PDF
    A tanglegram is a pair of trees on the same set of leaves with matching leaves in the two trees joined by an edge. Tanglegrams are widely used in computational biology to compare evolutionary histories of species. In this paper we present a formulation of two related combinatorial embedding problems concerning tanglegrams in terms of CNF-formulas. The first problem is known as planar embedding and the second as crossing minimization problem. We show that our satisfiability formulation of these problems can handle a much more general case with more than two, not necessarily binary or complete, trees defined on arbitrary sets of leaves and allowed to vary their layouts

    Classifying tree structures using elastic matching of sequence encodings

    Get PDF
    This document is the Accepted Manuscript version of the following article: Angeliki Skoura, Iosif Mporas, Vasileios Megalooikonomou, ‘Classifying tree structures using elastic matching of sequence encodings’, Neurocomputing, Vol. 163, pp. 151-159, February 2015. The Version of Record is available online at: DOI: https://doi.org/10.1016/j.neucom.2014.08.083. This Manuscript version is distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License (http://creativecommons.org/licenses/by-nc-nd/4.0/ ), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way.Structures of tree topology are frequently encountered in nature and in a range of scientific domains. In this paper, a multi-step framework is presented to classify tree topologies introducing the idea of elastic matching of their sequence encodings. Initially, representative sequences of the branching topologies are obtained using node labeling and tree traversal schemes. The similarity between tree topologies is then quantified by applying elastic matching techniques. The resulting sequence alignment reveals corresponding node groups providing a better understanding of matching tree topologies. The new similarity approach is explored using various classification algorithms and is applied to a medical dataset outperforming state-of-the-art techniques by at least 6.6% and 3.5% in terms of absolute specificity and accuracy correspondingly.Peer reviewe

    Accurate classification of RNA structures using topological fingerprints

    Get PDF
    While RNAs are well known to possess complex structures, functionally similar RNAs often have little sequence similarity. While the exact size and spacing of base-paired regions vary, functionally similar RNAs have pronounced similarity in the arrangement, or topology, of base-paired stems. Furthermore, predicted RNA structures often lack pseudoknots (a crucial aspect of biological activity), and are only partially correct, or incomplete. A topological approach addresses all of these difficulties. In this work we describe each RNA structure as a graph that can be converted to a topological spectrum (RNA fingerprint). The set of subgraphs in an RNA structure, its RNA fingerprint, can be compared with the fingerprints of other RNA structures to identify and correctly classify functionally related RNAs. Topologically similar RNAs can be identified even when a large fraction, up to 30%, of the stems are omitted, indicating that highly accurate structures are not necessary. We investigate the performance of the RNA fingerprint approach on a set of eight highly curated RNA families, with diverse sizes and functions, containing pseudoknots, and with little sequence similarity–an especially difficult test set. In spite of the difficult test set, the RNA fingerprint approach is very successful (ROC AUC \u3e 0.95). Due to the inclusion of pseudoknots, the RNA fingerprint approach both covers a wider range of possible structures than methods based only on secondary structure, and its tolerance for incomplete structures suggests that it can be applied even to predicted structures. Source code is freely available at https://github.rcac.purdue.edu/mgribsko/XIOS_RNA_fingerprint

    Accurate and efficient reconstruction of deep phylogenies from structured RNAs

    Get PDF
    Ribosomal RNA (rRNA) genes are probably the most frequently used data source in phylogenetic reconstruction. Individual columns of rRNA alignments are not independent as a consequence of their highly conserved secondary structures. Unless explicitly taken into account, these correlation can distort the phylogenetic signal and/or lead to gross overestimates of tree stability. Maximum likelihood and Bayesian approaches are of course amenable to using RNA-specific substitution models that treat conserved base pairs appropriately, but require accurate secondary structure models as input. So far, however, no accurate and easy-to-use tool has been available for computing structure-aware alignments and consensus structures that can deal with the large rRNAs. The RNAsalsa approach is designed to fill this gap. Capitalizing on the improved accuracy of pairwise consensus structures and informed by a priori knowledge of group-specific structural constraints, the tool provides both alignments and consensus structures that are of sufficient accuracy for routine phylogenetic analysis based on RNA-specific substitution models. The power of the approach is demonstrated using two rRNA data sets: a mitochondrial rRNA set of 26 Mammalia, and a collection of 28S nuclear rRNAs representative of the five major echinoderm groups

    Entropy Bounds for Hierarchical Molecular Networks

    Get PDF
    In this paper we derive entropy bounds for hierarchical networks. More precisely, starting from a recently introduced measure to determine the topological entropy of non-hierarchical networks, we provide bounds for estimating the entropy of hierarchical graphs. Apart from bounds to estimate the entropy of a single hierarchical graph, we see that the derived bounds can also be used for characterizing graph classes. Our contribution is an important extension to previous results about the entropy of non-hierarchical networks because for practical applications hierarchical networks are playing an important role in chemistry and biology. In addition to the derivation of the entropy bounds, we provide a numerical analysis for two special graph classes, rooted trees and generalized trees, and demonstrate hereby not only the computational feasibility of our method but also learn about its characteristics and interpretability with respect to data analysis
    corecore