1,185 research outputs found

Geometric combinatorics and computational molecular biology: branching polytopes for RNA sequences

Author: Drellich Elizabeth
Gainer-Dewar Andrew
Harrington Heather A.
He Qijun
Heitsch Christine
Poznanović Svetlana
Publication date: 16/06/2016

Questions in computational molecular biology generate various discrete optimization problems, such as DNA sequence alignment and RNA secondary structure prediction. However, the optimal solutions are fundamentally dependent on the parameters used in the objective functions. The goal of a parametric analysis is to elucidate such dependencies, especially as they pertain to the accuracy and robustness of the optimal solutions. Techniques from geometric combinatorics, including polytopes and their normal fans, have been used previously to give parametric analyses of simple models for DNA sequence alignment and RNA branching configurations. Here, we present a new computational framework, and proof-of-principle results, which give the first complete parametric analysis of the branching portion of the nearest neighbor thermodynamic model for secondary structure prediction for real RNA sequences. Comment: 17 pages, 8 figures

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

Louse (Insecta : Phthiraptera) mitochondrial 12S rRNA secondary structure is highly variable

Author: Billoud B.
Collins L.J.
Corpet F.
Critchlow D.E.
Day W.H.E.
Fontana W.
Gutell R.R.
Hafner M.S.
Hickson R.E.
Hofacker I.L.
Houde P.
Johnson K.P.
K. P. Johnson
Konings D.A.M.
Lenhof H.-P.
Lockhart P.J.
Mindell D.P.
Moran N.A.
Page R.D.M.
R. Cruickshank
R. D. M. Page
Shao R.
Simon C.
Springer M.S.
Stoye J.
Swofford D.L.
Wheeler W.C.
Publication venue: 'Wiley'
Publication date: 01/01/2002

Lice are ectoparasitic insects hosted by birds and mammals. Mitochondrial 12S rRNA sequences obtained from lice show considerable length variation and are very difficult to align. We show that the louse 12S rRNA domain III secondary structure displays considerable variation compared to other insects, in both the shape and number of stems and loops. Phylogenetic trees constructed from tree edit distances between louse 12S rRNA structures do not closely resemble trees constructed from sequence data, suggesting that at least some of this structural variation has arisen independently in different louse lineages. Taken together with previous work on mitochondrial gene order and elevated rates of substitution in louse mitochondrial sequences, the structural variation in louse 12S rRNA confirms the highly distinctive nature of molecular evolution in these insects.

CiteSeerX

Crossref

Enlighten

Parallelization of dynamic programming recurrences in computational biology

Author: Jacob Arpith
Publication venue: Washington University Open Scholarship
Publication date: 01/01/2010

The rapid growth of biosequence databases over the last decade has led to a performance bottleneck in the applications analyzing them. In particular, over the last five years DNA sequencing capacity of next-generation sequencers has been doubling every six months as costs have plummeted. The data produced by these sequencers is overwhelming traditional compute systems. We believe that in the future compute performance, not sequencing, will become the bottleneck in advancing genome science. In this work, we investigate novel computing platforms to accelerate dynamic programming algorithms, which are popular in bioinformatics workloads. We study algorithm-specific hardware architectures that exploit fine-grained parallelism in dynamic programming kernels using field-programmable gate arrays (FPGAs). We advocate a high-level synthesis approach, using the recurrence equation abstraction to represent dynamic programming and polyhedral analysis to exploit parallelism. We suggest a novel technique within the polyhedral model to optimize for throughput by pipelining independent computations on an array. This design technique improves on the state of the art, which builds latency-optimal arrays. We also suggest a method to dynamically switch between a family of designs using FPGA reconfiguration to achieve a significant performance boost. We have used polyhedral methods to parallelize the Nussinov RNA folding algorithm to build a family of accelerators that can trade resources for parallelism and are between 15x and 130x faster than a modern dual core CPU implementation. A Zuker RNA folding accelerator we built on a single workstation with four Xilinx Virtex 4 FPGAs outperforms 198 3 GHz Intel Core 2 Duo processors. Furthermore, our design running on a single FPGA is an order of magnitude faster than competing implementations on similar-generation FPGAs and graphics processors.
Our work is a step toward the goal of automated synthesis of hardware accelerators for dynamic programming algorithms.

Washington University St. Louis: Open Scholarship

The Mathematics of Phylogenomics

Author: Pachter Lior
Sturmfels Bernd
Publication date: 01/01/2004

The grand challenges in biology today are being shaped by powerful high-throughput technologies that have revealed the genomes of many organisms, global expression patterns of genes and detailed information about variation within populations. We are therefore able to ask, for the first time, fundamental questions about the evolution of genomes, the structure of genes and their regulation, and the connections between genotypes and phenotypes of individuals. The answers to these questions are all predicated on progress in a variety of computational, statistical, and mathematical fields. The rapid growth in the characterization of genomes has led to the advancement of a new discipline called Phylogenomics. This discipline results from the combination of two major fields in the life sciences: Genomics, i.e., the study of the function and structure of genes and genomes; and Molecular Phylogenetics, i.e., the study of the hierarchical evolutionary relationships among organisms and their genomes. The objective of this article is to offer mathematicians a first introduction to this emerging field, and to discuss specific mathematical problems and developments arising from phylogenomics. Comment: 41 pages, 4 figures

arXiv.org e-Print Archive

CiteSeerX

Caltech Authors

An exact mathematical programming approach to multiple RNA sequence-structure alignment

Author: Bauer Markus
Klau Gunnar W.
Reinert Knut
Publication date: 01/01/2007

One of the main tasks in computational biology is the computation of alignments of genomic sequences to reveal their commonalities. In the case of DNA or protein sequences, sequence information alone is usually sufficient to compute reliable alignments. RNA molecules, however, build spatial conformations (the secondary structure) that are more conserved than the actual sequence. Hence, computing reliable alignments of RNA molecules has to take into account the secondary structure. We present a novel framework for the computation of exact multiple sequence-structure alignments: we give a graph-theoretic representation of the sequence-structure alignment problem and phrase it as an integer linear program. We identify a class of constraints that make the problem easier to solve and relax the original integer linear program in a Lagrangian manner. Experiments on a recently published benchmark show that our algorithm has performance comparable to that of more costly dynamic programming algorithms, and outperforms all other approaches in terms of solution quality with an increasing number of input sequences.

University of New Brunswick: Centre for Digital Scholarship Journals

Institutional Repository of the Freie Universität Berlin

CiteSeerX

CWI's Institutional Repository

Repository: Freie Universität Berlin (FU), Math Department (fu_mi_publications)

Algorithm engineering for optimal alignment of protein structure distance matrices

Author: A. Andreeva
A. Caprara
A. Marin
A. Schrijver
C. Berbalk
D. Wu
D.A. Pelta
E. Althaus
G. Mayr
Gunnar W. Klau
H. Hasegawa
H.P. Lenhof
I. Wohlers
Inken Wohlers
L. Holm
N. Malod-Dognin
P. Di Lena
R. Andonov
R. Kolodny
R.H. Lathrop
Rumen Andonov
T. Havel
T. Kawabata
W. Xie
W.R. Taylor
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Protein structural alignment is an important problem in computational biology. In this paper, we present first successes on provably optimal pairwise alignment of protein inter-residue distance matrices, using the popular Dali scoring function. We introduce the structural alignment problem formally, which enables us to express a variety of scoring functions used in previous work as special cases in a unified framework. Further, we propose the first mathematical model for computing optimal structural alignments based on dense inter-residue distance matrices. We therefore reformulate the problem as a special graph problem and give a tight integer linear programming model. We then present algorithm engineering techniques to handle the huge integer linear programs of real-life distance matrix alignment problems. Applying these techniques, we can compute provably optimal Dali alignments for the very first time.

arXiv.org e-Print Archive

HAL-CentraleSupelec

CiteSeerX

Crossref

CWI's Institutional Repository

INRIA a CCSD electronic archive server

HAL-Rennes 1

Asymmetric Genome Organization in an RNA Virus Revealed via Graph-Theoretical Analysis of Tomographic Data

Author: A Borodavka
A Pickl-Herk
AJ Fisher
B Böttcher
Claus O. Wilke
DH Bunka
DLD Caspar
DY Kim
EC Dykeman
EF Pettersen
Eric C. Dykeman
EV Orlova
F Golmohammadi
F Qu
FHC Crick
GA Bentley
GD Pintilie
HC Levy
J Ren
J Seitsonen
JA Speir
James A. Geraets
JE Johnson
JM Fox
K Toropova
K Valegård
KC Dent
L Tang
M Bostina
MC Morais
NA Ranson
Neil A. Ranson
P Ni
Peter G. Stockley
PG Stockley
R Koning
Reidun Twarock
RJ Ford
S Hafenstein
SB Larson
SE Bakker
SHE van den Worm
SJ Schroeder
SW Lane
T Lin
T Shiba
TJ Tuthill
Y Zeng
Z Chen
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/03/2015

Cryo-electron microscopy permits 3-D structures of viral pathogens to be determined in remarkable detail. In particular, the protein containers encapsulating viral genomes have been determined to high resolution using symmetry averaging techniques that exploit the icosahedral architecture seen in many viruses. By contrast, structure determination of asymmetric components remains a challenge, and novel analysis methods are required to reveal such features and characterize their functional roles during infection. Motivated by the important, cooperative roles of viral genomes in the assembly of single-stranded RNA viruses, we have developed a new analysis method that reveals the asymmetric structural organization of viral genomes in proximity to the capsid in such viruses. The method uses geometric constraints on genome organization, formulated based on knowledge of icosahedrally-averaged reconstructions and the roles of the RNA-capsid protein contacts, to analyse cryo-electron tomographic data. We apply this method to the low-resolution tomographic data of a model virus and infer the unique asymmetric organization of its genome in contact with the protein shell of the capsid. This opens unprecedented opportunities to analyse viral genomes, revealing conserved structural features and mechanisms that can be targeted in antiviral drug design.

Crossref

Directory of Open Access Journals

PubMed Central

White Rose Research Online

FigShare

Parametric Analysis of RNA Branching Configurations

Author: Christine E. Heitsch
Valerie Hower
Publication venue: Springer Nature
Publication date: 01/01/2011

Springer - Publisher Connector