10,266 research outputs found

    Comparing RNA structures using a full set of biologically relevant edit operations is intractable

    7 pages. Arc-annotated sequences are useful for representing structural information of RNAs and have been extensively used for comparing RNA structures in terms of both sequence and structural similarity. Among the many paradigms for comparing arc-annotated sequences and RNA structures (see \cite{IGMA_BliDenDul08} for more details), the most important one is the general edit distance. The problem of computing an edit distance between two non-crossing arc-annotated sequences was introduced in \cite{Evans99}. The introduced model uses edit operations that involve either single letters or pairs of letters (never considered separately) and is solvable in polynomial time \cite{ZhangShasha:1989}. To account for other possible RNA structural evolutionary events, new edit operations that allow the letters of a pair to be considered either simultaneously or separately were introduced in \cite{jiangli}, unfortunately at the cost of computational tractability. It has been proved that comparing two RNA secondary structures using a full set of biologically relevant edit operations is NP-complete. Nevertheless, in \cite{DBLP:conf/spire/GuignonCH05}, the authors used a strong combinatorial restriction in order to compare two RNA stem-loops with a full set of biologically relevant edit operations, which allowed them to design a polynomial time and space algorithm for comparing general secondary RNA structures. In this paper we prove theoretically that comparing two RNA structures using a full set of biologically relevant edit operations cannot be done without strong combinatorial restrictions.

    Louse (Insecta : Phthiraptera) mitochondrial 12S rRNA secondary structure is highly variable

    Lice are ectoparasitic insects hosted by birds and mammals. Mitochondrial 12S rRNA sequences obtained from lice show considerable length variation and are very difficult to align. We show that the louse 12S rRNA domain III secondary structure displays considerable variation compared to other insects, in both the shape and number of stems and loops. Phylogenetic trees constructed from tree edit distances between louse 12S rRNA structures do not closely resemble trees constructed from sequence data, suggesting that at least some of this structural variation has arisen independently in different louse lineages. Taken together with previous work on mitochondrial gene order and elevated rates of substitution in louse mitochondrial sequences, the structural variation in louse 12S rRNA confirms the highly distinctive nature of molecular evolution in these insects.

    Efficient chaining of seeds in ordered trees

    We consider here the problem of chaining seeds in ordered trees. Seeds are mappings between two trees Q and T, and a chain is a subset of non-overlapping seeds that is consistent with respect to postfix order and ancestrality. This problem is a natural extension of a similar problem for sequences, and has applications in computational biology, such as mining a database of RNA secondary structures. For the chaining problem with a set of m constant-size seeds, we describe an algorithm with complexity O(m^2 log(m)) in time and O(m^2) in space.
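The abstract describes tree chaining as an extension of a similar problem for sequences. As a rough illustration of that simpler sequence case only (not the tree algorithm of the paper), the sketch below chains weighted seeds between two sequences with a quadratic dynamic program. The `Seed` layout, the weights, and the function name `best_chain_score` are assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical seed layout (an assumption for this sketch):
# (q_start, q_end, t_start, t_end, weight), with inclusive coordinates.
Seed = Tuple[int, int, int, int, float]


def best_chain_score(seeds: List[Seed]) -> float:
    """Return the maximum total weight of a chain of seeds.

    A chain here is a set of seeds that do not overlap and appear in the
    same order in both sequences. Runs in O(m^2) time for m seeds.
    """
    # Any valid predecessor ends in Q before the current seed starts,
    # so sorting by q_end places predecessors earlier in the list.
    ordered = sorted(seeds, key=lambda s: s[1])
    best = [0.0] * len(ordered)
    for i, (qs, _qe, ts, _te, w) in enumerate(ordered):
        best[i] = w  # chain consisting of this seed alone
        for j in range(i):
            _pqs, pqe, _pts, pte, _pw = ordered[j]
            # Seed j may precede seed i only if it ends strictly before
            # seed i starts in both sequences.
            if pqe < qs and pte < ts:
                best[i] = max(best[i], best[j] + w)
    return max(best, default=0.0)


example = [(0, 2, 0, 2, 3.0), (3, 5, 3, 5, 2.0), (1, 4, 6, 8, 4.0)]
print(best_chain_score(example))
```

In the tree setting the ordering test would have to account for postfix order and ancestrality, which this sequence sketch does not attempt.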

    Reconstructing phylogeny from RNA secondary structure via simulated evolution

    DNA sequences of genes encoding functional RNA molecules (e.g., ribosomal RNAs) are commonly used in phylogenetics (i.e. to infer evolutionary history). Trees derived from ribosomal RNA (rRNA) sequences, however, are inconsistent with other molecular data in investigations of deep branches in the tree of life. Since many of the functional constraints on the gene products (i.e. RNA molecules) relate to three-dimensional structure, rather than their actual sequences, accumulated mutations in the gene sequences may obscure phylogenetic signal over very large evolutionary time-scales. Variation in structure, however, may be suitable for phylogenetic inference even under extreme sequence divergence. To evaluate qualitatively the manner in which structural evolution relates to sequence change, we simulated the evolution of RNA sequences under various constraints on structural change.

    Tree decomposition and parameterized algorithms for RNA structure-sequence alignment including tertiary interactions and pseudoknots

    We present a general setting for structure-sequence comparison in a large class of RNA structures that unifies and generalizes a number of recent works on specific families of structures. Our approach is based on tree decomposition of structures and gives rise to a general parameterized algorithm, where the exponential part of the complexity depends on the family of structures. For each of the previously studied families, our algorithm has the same complexity as the specific algorithm that had been given before. Comment: (2012)

    Improved Algorithms for Approximate String Matching (Extended Abstract)

    The problem of approximate string matching is important in many different areas such as computational biology, text processing and pattern recognition. A great effort has been made to design efficient algorithms addressing several variants of the problem, including comparison of two strings, approximate pattern identification in a string, or calculation of the longest common subsequence that two strings share. We designed an output-sensitive algorithm solving the edit distance problem between two strings of lengths n and m respectively in time O((s-|n-m|)min(m,n,s)+m+n) and linear space, where s is the edit distance between the two strings. This worst-case time bound makes the quadratic factor of the algorithm independent of the length of the longer string and improves existing theoretical bounds for this problem. The implementation of our algorithm also excels in practice, especially in cases where the two strings compared differ significantly in length. Source code of our algorithm is available at http://www.cs.miami.edu/~dimitris/edit_distance. Comment: 10 pages
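For reference, the quantity s in the bound above is the ordinary edit distance. The sketch below is the textbook dynamic program under assumed unit costs for insertion, deletion and substitution. It is not the output-sensitive algorithm of the abstract: it always takes O(nm) time, although it does use linear space by keeping only two rows.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance via the classic row-by-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row, so space is O(min(n, m))
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ca == cb else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))  # 3
```

An output-sensitive method would avoid filling the whole table when s is small, which is where the bound in the abstract improves on this baseline.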

    Geometric medians in reconciliation spaces

    In evolutionary biology, it is common to study how various entities evolve together, for example, how parasites coevolve with their host, or genes with their species. Coevolution is commonly modelled by considering certain maps or reconciliations from one evolutionary tree $P$ to another $H$, all of which induce the same map $\phi$ between the leaf-sets of $P$ and $H$ (corresponding to present-day associations). Recently, there has been much interest in studying spaces of reconciliations, which arise by defining some metric $d$ on the set $Rec(P,H,\phi)$ of all possible reconciliations between $P$ and $H$. In this paper, we study the following question: How do we compute a geometric median for a given subset $\Psi$ of $Rec(P,H,\phi)$ relative to $d$, i.e. an element $\psi_{med} \in Rec(P,H,\phi)$ such that $\sum_{\psi' \in \Psi} d(\psi_{med},\psi') \le \sum_{\psi' \in \Psi} d(\psi,\psi')$ holds for all $\psi \in Rec(P,H,\phi)$? For a model where so-called host-switches or transfers are not allowed, and for a commonly used metric $d$ called the edit-distance, we show that although the cardinality of $Rec(P,H,\phi)$ can be super-exponential, it is still possible to compute a geometric median for a set $\Psi$ in $Rec(P,H,\phi)$ in polynomial time. We expect that this result could be useful for computing a summary or consensus for a set of reconciliations (e.g. for a set of suboptimal reconciliations). Comment: 12 pages, 1 figure
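To make the definition of a geometric median concrete, the sketch below applies it by brute force to a small, explicitly listed candidate space. This is only an illustration of the definition: the function name `geometric_median` and the integer example are assumptions, and exhaustive search of this kind is not feasible for a reconciliation space of super-exponential size, which is the case the abstract's polynomial-time result addresses.

```python
from typing import Callable, Iterable, Sequence, TypeVar

T = TypeVar("T")


def geometric_median(
    candidates: Iterable[T],
    sample: Sequence[T],
    d: Callable[[T, T], float],
) -> T:
    """Return a candidate minimising the sum of distances to the sample.

    `candidates` plays the role of the whole space, `sample` the role of
    the subset whose median is sought, and `d` the metric.
    """
    return min(candidates, key=lambda psi: sum(d(psi, other) for other in sample))


# Toy stand-in for a metric space: integers 0..10 with absolute difference.
median = geometric_median(range(11), [1, 2, 9], lambda x, y: abs(x - y))
print(median)  # 2
```

The returned element need not belong to the sample itself, matching the definition in which the median ranges over the whole space.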