469 research outputs found

    Pair HMM based gap statistics for re-evaluation of indels in alignments with affine gap penalties: Extended Version

    Although computationally aligning sequences is a crucial step in the vast majority of comparative genomics studies, our understanding of alignment biases still needs to be improved. To infer true structural or homologous regions, computational alignments need further evaluation. It has been shown that the accuracy of aligned positions can drop substantially, in particular around gaps. Here we focus on re-evaluation of score-based alignments with affine gap penalty costs. We exploit their relationships with pair hidden Markov models and develop efficient algorithms by which to identify gaps that are significant in terms of length and multiplicity. We evaluate our statistics with respect to the well-established structural alignments from SABmark and find that indel reliability substantially increases with their significance, in particular in worst-case twilight zone alignments. This indicates that our statistics can reliably complement other methods, which mostly focus on the reliability of match positions. Comment: 17 pages, 7 figures
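The relationship between affine gap penalties and pair hidden Markov models mentioned in this abstract can be illustrated with a minimal sketch. It uses only the textbook correspondence, not the gap statistics developed in the paper: if a gap state is entered with probability `delta`, extended with probability `epsilon`, and left with probability `1 - epsilon`, the negative log-probability of a gap run is affine in its length. Emission probabilities are ignored, and all function and parameter names are assumptions made for illustration.

```python
import math


def pair_hmm_gap_log_prob(length, delta, epsilon):
    """Log-probability of one gap run of `length` (transition terms only).

    Assumed toy model: enter the gap state with probability delta,
    stay in it with probability epsilon, leave with 1 - epsilon.
    """
    if length < 1:
        raise ValueError("gap length must be at least 1")
    return (math.log(delta)
            + (length - 1) * math.log(epsilon)
            + math.log(1.0 - epsilon))


def affine_gap_penalty(length, gap_open, gap_extend):
    """Affine gap cost: gap_open for the first position, gap_extend for each further one."""
    return gap_open + (length - 1) * gap_extend


def penalties_from_hmm(delta, epsilon):
    """Affine penalties that reproduce the negative log-probabilities above."""
    gap_open = -(math.log(delta) + math.log(1.0 - epsilon))
    gap_extend = -math.log(epsilon)
    return gap_open, gap_extend
```

Under these assumptions, `affine_gap_penalty(k, *penalties_from_hmm(delta, epsilon))` equals `-pair_hmm_gap_log_prob(k, delta, epsilon)` for every gap length `k >= 1`, which is the sense in which a score-based alignment with affine gaps can be read probabilistically.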

    An efficient algorithm for sequence comparison with block reversals

    Given two sequences X and Y that are strings over some alphabet set, we consider the distance d(X,Y) between them, defined to be the minimum number of character replacements and block (substring) reversals needed to transform X to Y (or vice versa). The operations are required to be disjoint. This is the “simplest” sequence comparison problem we know of that allows natural block edit operations. Block reversals arise naturally in genomic sequence comparison; they are also of interest in matching music data. We present an algorithm for exactly computing the distance d(X,Y); it takes time O(|X| log^2 |X|), and hence is near-linear. The trivial approach takes quadratic time.
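As an illustration of the distance d(X,Y) defined in this abstract, here is a minimal sketch of a quadratic-time baseline. It is not the near-linear algorithm of the paper. It assumes that, because the operations are disjoint, a reversed block of X must match the corresponding block of Y exactly, and that X and Y have equal length (neither operation changes length). The function name is hypothetical.

```python
def block_reversal_distance(x, y):
    """Minimum number of disjoint replacements and block reversals turning x into y.

    Quadratic-time dynamic program; returns None when the lengths differ,
    since neither operation changes the length of the string.
    """
    if len(x) != len(y):
        return None
    n = len(x)

    # starts_by_end[i] lists every j such that reversing x[j:i] yields y[j:i].
    # Valid blocks are found by expanding around each centre, as in
    # palindrome detection, which takes O(n^2) time overall.
    starts_by_end = [[] for _ in range(n + 1)]
    for centre in range(2 * n - 1):
        lo = centre // 2
        hi = centre - lo
        while lo >= 0 and hi < n and x[lo] == y[hi] and x[hi] == y[lo]:
            if hi > lo:  # blocks of length 1 are never useful
                starts_by_end[hi + 1].append(lo)
            lo -= 1
            hi += 1

    # dist[i] is the distance between the prefixes x[:i] and y[:i].
    dist = [0] * (n + 1)
    for i in range(1, n + 1):
        # Keep or replace the last character.
        dist[i] = dist[i - 1] + (x[i - 1] != y[i - 1])
        # Or end a reversed block at position i.
        for j in starts_by_end[i]:
            dist[i] = min(dist[i], dist[j] + 1)
    return dist[n]
```

For example, `block_reversal_distance("abcde", "acbdx")` uses one reversal (`bc` to `cb`) and one replacement (`e` to `x`).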

    Fast prediction of RNA-RNA interaction

    Background: Regulatory antisense RNAs are a class of ncRNAs that regulate gene expression by prohibiting the translation of an mRNA by establishing stable interactions with a target sequence. There is great demand for efficient computational methods to predict the specific interaction between an ncRNA and its target mRNA(s). There are a number of algorithms in the literature which can predict a variety of such interactions, unfortunately at a very high computational cost. Although some existing target prediction approaches are much faster, they are specialized for interactions with a single binding site. Methods: In this paper we present a novel algorithm to accurately predict the minimum free energy structure of RNA-RNA interaction under the most general type of interactions studied in the literature. Moreover, we introduce a fast heuristic method to predict the specific (multiple) binding sites of two interacting RNAs. Results: We verify the performance of our algorithms for joint structure and binding site prediction on a set of known interacting RNA pairs. Experimental results show our algorithms are highly accurate and outperform all competitive approaches.

    Effect of Composition on the Spontaneous Emission Probabilities, Stimulated Emission Cross Sections and Local Environment of Tm3+ in TeO2-WO3 Glass

    The effect of composition on the structure and on the spontaneous and stimulated emission probabilities of various 1.0 mol% Tm2O3 doped (1 - x)TeO2 + (x)WO3 glasses was investigated using Raman spectroscopy, ultraviolet-visible-near-infrared (UV/VIS/NIR) absorption and luminescence measurements. Absorption measurements in the UV/VIS/NIR region were used to determine spontaneous emission probabilities for the 4f-4f transitions of Tm3+ ions. Six absorption bands corresponding to the absorption of the ¹G₄, ³F₂, ³F₃, ³F₄, ³H₅ and ³H₄ levels from the ³H₆ ground level were observed. The integrated absorption cross-section of each band except that of the ³H₅ level was found to vary with the glass composition. Luminescence spectra of the samples were measured upon 457.9 nm excitation. Three emission bands centered at 476 nm (¹G₄ → ³H₆ transition), 651 nm (¹G₄ → ³H₄ transition) and 800 nm (¹G₄ → ³H₅ transition) were observed. Spontaneous emission cross-sections together with the luminescence spectra measured upon 457.9 nm excitation were used to determine the stimulated emission cross-sections of these emissions. The effect of glass composition on the Judd-Ofelt parameters, and therefore on the spontaneous and the stimulated emission cross-sections for the metastable levels of Tm3+ ions, was discussed in detail. The effect of temperature on the stimulated emission cross-sections for the emissions observed upon 457.9 nm excitation was also discussed. © 2002 Elsevier Science B.V. All rights reserved.

    Task-Oriented Active Sensing via Action Entropy Minimization

    This work is licensed under a Creative Commons Attribution 4.0 International License. In active sensing, sensing actions are typically chosen to minimize the uncertainty of the state according to some information-theoretic measure such as entropy, conditional entropy, or mutual information. This is reasonable for applications where the goal is to obtain information. However, when the information about the state is used to perform a task, minimizing state uncertainty may not lead to sensing actions that provide the information that is most useful to the task. This is because the uncertainty in some subspace of the state space could have more impact on the performance of the task than others, and this dependence can vary at different stages of the task. One way to combine task, uncertainty, and sensing is to model the problem as a sequential decision-making problem under uncertainty. Unfortunately, the solutions to these problems are computationally expensive. This paper presents a new task-oriented active sensing scheme, where the task is taken into account in sensing action selection by choosing sensing actions that minimize the uncertainty in future task-related actions instead of state uncertainty. The proposed method is validated via simulations.
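The contrast drawn in this abstract between minimizing state uncertainty and minimizing uncertainty in task-related actions can be sketched on a toy discrete problem. This is not the scheme from the paper, whose formulation is not given here. It assumes a finite state space, a discrete belief, a sensor described by a likelihood table `likelihood[state][observation]`, and a hypothetical fixed `policy` mapping each state to the task action that would be taken in it. All names are illustrative.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def action_distribution(belief, policy, n_actions):
    """Distribution over task actions induced by a belief and a state-to-action policy."""
    dist = [0.0] * n_actions
    for state, prob in enumerate(belief):
        dist[policy[state]] += prob
    return dist


def expected_entropy(belief, likelihood, summarize):
    """Expected entropy of summarize(posterior) after one reading of the sensor."""
    n_obs = len(likelihood[0])
    total = 0.0
    for obs in range(n_obs):
        joint = [b * likelihood[s][obs] for s, b in enumerate(belief)]
        p_obs = sum(joint)
        if p_obs == 0:
            continue  # this observation cannot occur under the belief
        posterior = [j / p_obs for j in joint]
        total += p_obs * entropy(summarize(posterior))
    return total


# Toy setup: four equally likely states; states 0, 1 call for task action 0,
# states 2, 3 call for task action 1.
belief = [0.25, 0.25, 0.25, 0.25]
policy = [0, 0, 1, 1]

# Sensor A reports only which pair {0,1} or {2,3} the state lies in.
sensor_a = [[1, 0], [1, 0], [0, 1], [0, 1]]
# Sensor B pinpoints state 0 or state 2, and otherwise reports "1 or 3".
sensor_b = [[1, 0, 0], [0, 0, 1], [0, 1, 0], [0, 0, 1]]


def state_view(posterior):
    return posterior


def action_view(posterior):
    return action_distribution(posterior, policy, 2)


state_scores = {name: expected_entropy(belief, sensor, state_view)
                for name, sensor in (("A", sensor_a), ("B", sensor_b))}
action_scores = {name: expected_entropy(belief, sensor, action_view)
                 for name, sensor in (("A", sensor_a), ("B", sensor_b))}
```

In this toy setup, sensor B leaves less expected uncertainty about the state, while sensor A leaves less expected uncertainty about which task action to take, so the two criteria select different sensing actions.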

    A Multi-labeled Tree Edit Distance for Comparing "Clonal Trees" of Tumor Progression

    We introduce a new edit distance measure between a pair of "clonal trees", each representing the progression and mutational heterogeneity of a tumor sample, constructed by the use of single cell or bulk high throughput sequencing data. In a clonal tree, each vertex represents a specific tumor clone, and is labeled with one or more mutations in a way that each mutation is assigned to the oldest clone that harbors it. Given two clonal trees, our multi-labeled tree edit distance (MLTED) measure is defined as the minimum number of mutation/label deletions, (empty) leaf deletions, and vertex (clonal) expansions, applied in any order, to convert each of the two trees to the maximal common tree. We show that the MLTED measure can be computed efficiently in polynomial time and it captures the similarity between trees of different clonal granularity well. We have implemented our algorithm to compute MLTED exactly and applied it to a variety of data sets successfully. The source code of our method can be found in: https://github.com/khaled-rahman/leafDelTED

    Mirroring co-evolving trees in the light of their topologies

    Determining the interaction partners among protein/domain families poses hard computational problems, in particular in the presence of paralogous proteins. Available approaches aim to identify interaction partners among protein/domain families through maximizing the similarity between trimmed versions of their phylogenetic trees. Since maximization of any natural similarity score is computationally difficult, many approaches employ heuristics to maximize the distance matrices corresponding to the tree topologies in question. In this paper we devise an efficient deterministic algorithm which directly maximizes the similarity between two leaf-labeled trees with edge lengths, obtaining a score-optimal alignment of the two trees in question. Our algorithm is significantly faster than those methods based on distance matrix comparison: 1 minute on a single processor vs. 730 hours on a supercomputer. Furthermore, we have advantages over the current state-of-the-art heuristic search approach in terms of precision as well as a recently suggested overall performance measure for mirrortree approaches, while incurring only acceptable losses in recall. A C implementation of the method demonstrated in this paper is available at http://compbio.cs.sfu.ca/mirrort.htm Comment: 13 pages, 2 figures, Iman Hajirasouliha and Alexander Schönhuth are joint first authors

    Sparsification of RNA structure prediction including pseudoknots

    Background: Although many RNA molecules contain pseudoknots, computational prediction of pseudoknotted RNA structure is still in its infancy due to the high running time and space consumption implied by the dynamic programming formulations of the problem. Results: In this paper, we introduce sparsification to significantly speed up the dynamic programming approaches for pseudoknotted RNA structure prediction, which also lowers the space requirements. Although sparsification has been applied to a number of RNA-related structure prediction problems in the past few years, we provide the first application of sparsification to pseudoknotted RNA structure prediction specifically and to handling gapped fragments more generally, which has a much more complex recursive structure than other problems to which sparsification has been applied. We analyse how to sparsify four pseudoknot structure prediction algorithms, among them the most general method available (the Rivas-Eddy algorithm) and the fastest one (the Reeder-Giegerich algorithm). In all algorithms the number of "candidate" substructures to be considered is reduced. Conclusions: Our experimental results on the sparsified Reeder-Giegerich algorithm suggest a linear speedup over the unsparsified implementation.