128 research outputs found

    Foldalign 2.5: multithreaded implementation for pairwise structural RNA alignment

    Motivation: Structured RNAs can be hard to search for, as they are often not well conserved in their primary structure and are local in their genomic or transcriptomic context. Thus, the need for tools that can make local structural alignments of RNAs is only increasing. Results: To meet the demand for both large-scale screens and hands-on analysis through web servers, we present a new multithreaded version of Foldalign. We substantially improve execution time while maintaining all previous functionalities, including carrying out local structural alignments of sequences with low similarity. Furthermore, the improvements allow longer RNAs to be compared. For example, for sequence lengths in the range 2000–6000 nucleotides, execution time improves by up to a factor of five. Availability and implementation: The Foldalign software and the web server are available at http://rth.dk/resources/foldalign Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

    Constrained Secondary Structure Prediction Using Stem Detection

    RNA sequence analysis and structure prediction are classical topics of computational biology and a powerful tool to examine complex genomic data. Over the decades, various tools have been developed to predict RNA secondary structures and sequence alignments, a majority of which use one of two characteristic approaches: (a) thermodynamic minimum free energy or (b) probabilistic maximum likelihood prediction. However, despite numerous takes on modeling these approaches, the computational complexity of the developed algorithms has not seen significant improvements. Most algorithms still operate with a polynomial time complexity of O(N^3). This cost becomes significant when processing long RNA sequences with hundreds of bases. In this thesis, a constrained structure prediction algorithm is presented that aims to reduce the computational overhead of traditional RNA structure prediction methods to O(N^2). The proposed algorithm employs pattern recognition methods to devise rules for constructing a confined space of possible secondary structures. This confined structure space is then searched to find a secondary structure that satisfies the optimality criterion. In this document, we present the design details of the proposed algorithm implemented using the minimum free energy (MFE) model. We then compare its performance to Zuker’s algorithm, which is the conventional dynamic programming equivalent of the MFE model. The proposed algorithm provides a significant reduction in CPU time when processing longer sequences, which can be attributed to its lower computational complexity.
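
To make the O(N^3) cost mentioned in this abstract concrete, the sketch below implements a Nussinov-style base-pair maximization. This is an assumption made for illustration: it is neither Zuker's MFE algorithm nor the thesis's constrained method, only a simpler classical dynamic program with the same cubic loop structure. The function name `nussinov_max_pairs` and the `min_loop` parameter are hypothetical.

```python
def nussinov_max_pairs(seq, min_loop=3):
    """Return the maximum number of nested base pairs in an RNA sequence.

    Illustrative Nussinov-style DP: O(N^2) table cells, each filled with an
    O(N) scan over split points, giving O(N^3) time overall.
    """
    # Watson-Crick pairs plus the G-U wobble pair.
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"),
             ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the maximum pair count for seq[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: position j stays unpaired.
            score = best[i][j - 1]
            # Case 2: j pairs with some k, leaving at least min_loop
            # unpaired bases between them.
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]
```

The innermost loop over `k` is what a constrained search space would aim to shrink: if only a bounded number of candidate partners is considered per position, the total work drops toward O(N^2).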

    New Computational Approaches for Multiple RNA Alignment and RNA Search

    In this thesis we explore the theory and history behind RNA alignment. Ordinary sequence alignments, as studied by computer scientists, can be completed in O(n^2) time in the naive case. The process involves taking two input sequences and finding the list of edits that transforms one sequence into the other. This process is applied to biology in many forms, such as the creation of multiple alignments and the search of genomic sequences. When the RNA sequence structure is taken into account, the problem becomes even harder. Multiple RNA structure alignment is particularly challenging because covarying mutations make sequence information alone insufficient. Existing tools for multiple RNA alignments first generate pairwise RNA structure alignments and then build the multiple alignment using only the sequence information. Here we present PMFastR, an algorithm which iteratively uses a sequence-structure alignment procedure to build a multiple RNA structure alignment. PMFastR also has low memory consumption, allowing for the alignment of large sequences such as 16S and 23S rRNA. Specifically, we reduce the memory consumption to ~O(band^2 * m), where band is the banding size. Other solutions are ~O(n^2 * m), where n and m are the lengths of the target and query respectively. The algorithm also provides a method to utilize a multi-core environment. We present results on benchmark data sets from BRAliBase, which show that PMFastR outperforms other state-of-the-art programs. Furthermore, we regenerate 607 Rfam seed alignments and show that our automated process creates multiple alignments similar to the manually curated Rfam seed alignments. While these methods can also be applied directly to genome sequence search, the abundance of new multiple-species genome alignments presents a new area for exploration. Many multiple alignments of whole genomes are available, and these alignments keep growing in size. These alignments can provide more information to the searcher than a single sequence. Using the methodology from sequence-structure alignment, we developed AlnAlign, which searches an entire genome alignment using RNA sequence structure. While programs have been readily available to align alignments, this is the first, to our knowledge, that is specifically designed for RNA sequences. This algorithm is presented only in theory and is yet to be tested.
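
The quadratic-time sequence alignment and the banding idea described in this abstract can be sketched on plain sequences as follows. This is a minimal illustration under stated assumptions, not PMFastR's sequence-structure procedure: it computes unit-cost edit distance, and the function name `edit_distance` and its `band` parameter are hypothetical.

```python
def edit_distance(a, b, band=None):
    """Unit-cost edit distance between sequences a and b.

    With band=None every cell of the (len(a)+1) x (len(b)+1) table is
    filled, i.e. O(n*m) time. With an integer band, only cells with
    |i - j| <= band are filled; the result is float('inf') if no
    alignment fits inside the band.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # Row 0: turning the empty prefix of a into b[:j] takes j insertions.
    prev = [j if band is None or j <= band else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        lo = 0 if band is None else max(0, i - band)
        hi = m if band is None else min(m, i + band)
        for j in range(lo, hi + 1):
            if j == 0:
                cur[j] = i  # delete all of a[:i]
                continue
            substitute = prev[j - 1] + (a[i - 1] != b[j - 1])
            delete = prev[j] + 1
            insert = cur[j - 1] + 1
            cur[j] = min(substitute, delete, insert)
        prev = cur
    return prev[m]
```

Only two rows are kept at a time, so this sketch already uses linear memory; the band matters for running time here, whereas in a structure-aware alignment, where the table gains extra dimensions, a band limits memory as well.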

    Revising the evolutionary imprint of RNA structure in mammalian genomes

    RNA inverse folding and synthetic design

    Thesis advisors: Welkin E. Johnson and Peter G. Clote. Synthetic biology is currently a rapidly emerging discipline, where innovative and interdisciplinary work has led to promising results. Synthetic design of RNA requires novel methods to study and analyze known functional molecules, as well as to generate design candidates that have a high likelihood of being functional. This thesis is primarily focused on the development of novel algorithms for the design of synthetic RNAs. Previous strategies, such as RNAinverse and NUPACK-DESIGN, use heuristic methods such as adaptive walk, ensemble defect optimization (a form of simulated annealing), and genetic algorithms to generate sequences that optimize specific measures (probability of the target structure, ensemble defect). In contrast, our approach is to generate a large number of sequences whose minimum free energy structure is identical to the target design structure, and subsequently filter with respect to different criteria in order to select the most promising candidates for biochemical validation. In addition, our software must be accessible and user-friendly, allowing researchers from different backgrounds to use it in their work. Therefore, the work presented in this thesis concerns three areas: creating a potent, versatile and user-friendly RNA inverse folding algorithm suitable for the specific requirements of each project; implementing tools to analyze the properties that differentiate known functional RNA structures; and using these methods for synthetic design of de novo functional RNA molecules. Thesis (PhD), Boston College, 2016. Submitted to: Boston College, Graduate School of Arts and Sciences. Discipline: Biology.

    Programmiersprachen und Rechenkonzepte

    The GI special interest group 2.1.4 "Programmiersprachen und Rechenkonzepte" (Programming Languages and Computing Concepts) held its annual workshop from 3 to 5 May 2004 at the Physikzentrum Bad Honnef. This report contains a compilation of the contributions. As every year, the meeting served for getting to know one another, deepening mutual contacts, presenting new work and results and, above all, intensive discussion. A broad spectrum of contributions, ranging from theoretical foundations through program development, language design, software engineering and object orientation to the surprisingly long history of calculating machines since antiquity, made for an interesting and varied programme. Topics included imperative, functional and functional-logic languages, software/hardware co-design, semantics, web programming and software engineering, generative programming, aspects, and formal testing support. Interesting contributions on these and other topics prompted exchanges of experience and technical discussions, including with the participants of the "Reengineering" workshop held at the same time at the Physikzentrum Bad Honnef. I would like to thank all participants for contributing to the success of the workshop with their talks and constructive discussion. Thanks for the variety and quality of the contributions are due to the authors. A word of thanks is likewise due to the staff and management of the Physikzentrum Bad Honnef for the customary pleasant and stimulating atmosphere and comprehensive support.

    Graph clustering as a method to investigate riboswitch variation

    Thesis advisor: Michelle M. Meyer. Non-coding RNAs (ncRNAs) perform vital functions in cells, but the impact of diversity across structure and function of homologous motifs has yet to be fully investigated. One reason for this is that the standard phylogenetic analysis used to address these questions in proteins cannot easily be applied to ncRNAs due to their inherent characteristics. Compared to proteins, ncRNAs have shorter sequence lengths, lower sequence conservation, and secondary structures that need to be incorporated into the analysis. This has necessitated an effort to develop methodology for investigating the evolutionary and functional relationships between sets of ncRNAs. In this pursuit, I studied closely related riboswitches. Riboswitches are structured ncRNAs found in bacterial mRNA that regulate gene expression using their two major components: the aptamer and the expression platform. The aptamer of a riboswitch is able to bind a specific small molecule (ligand), and the bound or unbound state of the aptamer influences conformational changes in the expression platform that can lead to increased or decreased downstream gene expression. Using sequence and structural similarity metrics combined with graph clustering and de novo community detection algorithms, I have developed a methodology for investigating the functional and evolutionary relationships between closely related riboswitches, and other ncRNAs by extension, that are found across a range of diverse phyla. Thesis (PhD), Boston College, 2021. Submitted to: Boston College, Graduate School of Arts and Sciences. Discipline: Biology.