Search CORE

2,387 research outputs found

Towards high performance computing for molecular structure prediction using IBM Cell Broadband Engine - an implementation perspective

Author: Krishnan SPT
Liang Sim Sze
Veeravalli Bharadwaj
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The RNA structure prediction problem is a computationally complex task, especially with pseudo-knots. The problem is well studied in the existing literature and predominantly uses highly coupled Dynamic Programming (DP) solutions. The problem scale and complexity become very difficult to handle as sequence size increases. This makes the case for parallelization. Parallelization can be achieved by way of networked platforms (clusters, grids, etc.) as well as by using modern multi-core chips. Methods: In this paper, we exploit the parallelism capabilities of the IBM Cell Broadband Engine to parallelize an existing DP algorithm for RNA secondary structure prediction. We design three different implementation strategies that exploit the inherent data, code and/or hybrid parallelism, referred to as C-Par, D-Par and H-Par, and analyze their performance. Our approach attempts to introduce parallelism in critical sections of the algorithm. We ran our experiments on the SONY PlayStation 3 (PS3), which is based on the IBM Cell chip. Results: Our results suggest that introducing parallelism in the DP algorithm allows it to easily handle longer sequences, which would otherwise consume a large amount of time on single-core computers. The results further demonstrate the speed-up gained by exploiting the inherent parallelism in the problem and also show the advantages of using multi-core platforms for designing more sophisticated methodologies for handling fairly long RNA sequences. Conclusion: The speed-up performance reported here is promising, especially when the sequence length is long. To the best of our literature survey, the work reported in this paper is probably the first of its kind to utilize the IBM Cell Broadband Engine (a heterogeneous multi-core chip) to implement a DP. The results also encourage using multi-core platforms for designing more sophisticated methodologies for handling fairly long RNA sequences to predict their secondary structure.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Online Research Database In Technology

ScholarBank@NUS

Towards high performance computing for molecular structure prediction using IBM Cell Broadband Engine - an implementation perspective

Author: Krishnan SPT
Liang Sim Sze
Veeravalli Bharadwaj
Publication venue: BioMed Central
Publication date: 01/01/2010

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

ScholarBank@NUS

Polyhedral optimizations of RNA-RNA interaction computations

Author: Varadarajan Swetha
Publication venue: Colorado State University. Libraries
Publication date: 01/01/2017

2017 Fall. Includes bibliographical references. Studying RNA-RNA interaction has led to major successes in the treatment of some cancers, including colon, breast and pancreatic cancer, by suppressing the gene expression involved in the development of these diseases. The problem with such programs is that they are computationally and memory intensive: O(N^4) space and O(N^6) time complexity. Moreover, the entire application is complicated, and involves many mutually recursive data variables. We address the problem of speeding up a surrogate kernel (named OSPSQ) that captures the main dependence pattern found in two widely used RNA-RNA interaction applications, IRIS and piRNA. The structure of the OSPSQ kernel perfectly fits the constraints of the polyhedral model, a well-developed technology for optimizing codes that belong to many specialized domains. However, the current state-of-the-art automatic polyhedral tools do not significantly improve the performance of the baseline implementation of OSPSQ. With simple techniques like loop permutation and skewing, we achieve an average of 17x sequential and 31x parallel speedup on a standard modern multi-core platform (Intel Broadwell, E5-1650v4). This performance represents 75% and 88% of attainable single-core and multi-core L1 bandwidth. For further performance improvement, we describe how to tile all six dimensions and also formulate the associated memory trade-off. In the future, we plan to implement these tiling strategies, explore the performance of the code for various tile sizes and optimize the whole piRNA application.

Mountain Scholar (Digital Collections of Colorado and Wyoming)

A multiple layer model to compare RNA secondary structures

Author: Allali Julien
Sagot Marie-France
Publication venue: 'Royal College of Obstetricians & Gynaecologists (RCOG)'
Publication date: 01/01/2008

We formally introduce a new data structure, called MiGaL for "Multiple Graph Layers", that is composed of various graphs linked together by relations of abstraction/refinement. The new structure is useful for representing information that can be described at different levels of abstraction, each level corresponding to a graph. We then propose an algorithm for comparing two MiGaLs. The algorithm performs a step-by-step comparison starting with the most "abstract" level. The result of the comparison at a given step is communicated to the next step using a special colouring scheme. MiGaLs represent a very natural model for comparing RNA secondary structures that may be seen at different levels of detail, going from the sequence of nucleotides, single or paired with another to participate in a helix, to the network of multiple loops that is believed to represent the most conserved part of RNAs having similar function. We therefore show how to use MiGaLs to very efficiently compare two RNAs of any size at different levels of detail.

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Elastic properties of proteins: insight on the folding process and evolutionary selection of native structures

Author: Cristian Micheletti
Gianluca Lattanzi
Amos Maritan
Publication date: 01/01/2002

We carry out a theoretical study of the vibrational and relaxation properties of naturally occurring proteins with the purpose of characterizing both the folding and equilibrium thermodynamics. By means of a suitable model we provide a full characterization of the spectrum and eigenmodes of vibration at various temperatures by merely exploiting the knowledge of the protein native structure. It is shown that the rate at which perturbations decay at the folding transition correlates well with experimental folding rates. This validation is carried out on a list of about 30 two-state folders. Furthermore, the qualitative analysis of residue mean square displacements (shown to accurately reproduce crystallographic data) provides a reliable and statistically accurate method to identify crucial folding sites/contacts. This novel strategy is validated against clinical data for HIV-1 Protease. Finally, we compare the spectra and eigenmodes of vibration of natural proteins against randomly generated compact structures and regular random graphs. The comparison reveals a distinctive enhanced flexibility of natural structures accompanied by slow relaxation times at the folding temperature. The fact that these properties are intimately connected to the presence and assembly of secondary motifs hints at the special criteria adopted by evolution in the selection of viable folds. Comment: Revtex, 17 pages, 13 eps figures.

arXiv.org e-Print Archive

Crossref

HZB Repository

Archivio istituzionale della ricerca - Università di Bari

Sissa Digital Library

Archivio istituzionale della ricerca - Università di Padova

CPU-GPU hybrid accelerating the Zuker algorithm for RNA secondary structure prediction applications

Author: Dou Yong
Lei Guoqing
Li Rongchun
Ma Meng
Wan Wen
Xia Fei
Zou Dan
Publication venue: BioMed Central
Publication date: 01/01/2012

Springer - Publisher Connector

PubMed Central

Parallelization of dynamic programming recurrences in computational biology

Author: Jacob Arpith
Publication venue: Washington University Open Scholarship
Publication date: 01/01/2010

The rapid growth of biosequence databases over the last decade has led to a performance bottleneck in the applications analyzing them. In particular, over the last five years DNA sequencing capacity of next-generation sequencers has been doubling every six months as costs have plummeted. The data produced by these sequencers is overwhelming traditional compute systems. We believe that in the future compute performance, not sequencing, will become the bottleneck in advancing genome science. In this work, we investigate novel computing platforms to accelerate dynamic programming algorithms, which are popular in bioinformatics workloads. We study algorithm-specific hardware architectures that exploit fine-grained parallelism in dynamic programming kernels using field-programmable gate arrays (FPGAs). We advocate a high-level synthesis approach, using the recurrence equation abstraction to represent dynamic programming and polyhedral analysis to exploit parallelism. We suggest a novel technique within the polyhedral model to optimize for throughput by pipelining independent computations on an array. This design technique improves on the state of the art, which builds latency-optimal arrays. We also suggest a method to dynamically switch between a family of designs using FPGA reconfiguration to achieve a significant performance boost. We have used polyhedral methods to parallelize the Nussinov RNA folding algorithm to build a family of accelerators that can trade resources for parallelism and are between 15-130x faster than a modern dual core CPU implementation. A Zuker RNA folding accelerator we built on a single workstation with four Xilinx Virtex 4 FPGAs outperforms 198 3 GHz Intel Core 2 Duo processors. Furthermore, our design running on a single FPGA is an order of magnitude faster than competing implementations on similar-generation FPGAs and graphics processors. Our work is a step toward the goal of automated synthesis of hardware accelerators for dynamic programming algorithms.
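For readers unfamiliar with the Nussinov recurrence named in this abstract, the following is a minimal Python sketch of the textbook base-pair maximization form. It is not the FPGA design described in the thesis. The names `can_pair`, `nussinov` and `min_loop`, and the choice of a minimum hairpin loop of three bases, are assumptions made for illustration. The table is filled by increasing span, which is the dependence pattern that makes the cells of one anti-diagonal independent of each other.

```python
def can_pair(a, b):
    """Return True for Watson-Crick (A-U, G-C) and G-U wobble pairs."""
    return {a, b} in ({"A", "U"}, {"G", "C"}, {"G", "U"})


def nussinov(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (illustrative sketch).

    min_loop is the assumed minimum number of unpaired bases enclosed
    by a hairpin; it is a common convention, not taken from the thesis.
    """
    n = len(seq)
    if n == 0:
        return 0
    table = [[0] * n for _ in range(n)]
    # Fill by increasing span d = j - i. Every cell with span d reads
    # only cells with a smaller span, so all cells on one anti-diagonal
    # could be computed in parallel.
    for d in range(min_loop + 1, n):
        for i in range(n - d):
            j = i + d
            best = max(table[i + 1][j], table[i][j - 1])  # i or j unpaired
            if can_pair(seq[i], seq[j]):
                best = max(best, table[i + 1][j - 1] + 1)  # i pairs with j
            for k in range(i + 1, j):  # bifurcation into two substructures
                best = max(best, table[i][k] + table[k + 1][j])
            table[i][j] = best
    return table[0][n - 1]
```

With these assumptions, `nussinov("GGGAAAUCC")` returns 3: two G-C pairs and one G-U pair enclosing the AAA loop.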

Washington University St. Louis: Open Scholarship

RNAscClust: Clustering RNA sequences using structure conservation and graph based motifs

Author: Backofen Rolf
Costa Fabrizio
Gorodkin Jan
Havgaard Jakob Hull
Junge Alexander
Miladi Milad
Seemann Stefan E.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2017

Copenhagen University Research Information System

New Advances in NGS Technologies

Author: D’Agaro Edo
Publication venue: 'IntechOpen'
Publication date: 01/01/2017

In next-generation sequencing (NGS) methods, a DNA molecule of an individual is broken down into many small fragments to make up the so-called sequencing library. These small fragments serve as a template for the synthesis of numerous complementary fragments (called reads). Every small piece of the original DNA is copied many times in a variable number of reads. Depending on the desired accuracy level, it is possible to set the system to achieve a certain level of coverage, i.e., a number of reads per fragment. A level of 30X coverage is already sufficient for the routine diagnosis of most of the Mendelian diseases. All the sequences are then transferred into a computer and aligned with a reference sequence available in the international databases. In this way, all read sequences can be reassembled like a fine puzzle to obtain the sequence of a single gene or a whole genome. The NGS machines available today are very flexible devices. In fact, an NGS sequencer can be used for different types of applications: (1) whole-genome sequencing (WGS): analysis of the entire genome of an individual; (2) whole exome sequencing (WES): analysis of the entire coding genes of an individual; (3) targeted sequencing: analysis of a set of genes or a single gene; (4) transcriptome analysis: analysis of all the RNA produced by specific cells.
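The coverage figure mentioned in this abstract can be made concrete with the usual mean-depth estimate: number of reads times read length, divided by the length of the sequenced target. The sketch below is a minimal illustration only. The function names, the 150-base read length and the 3-gigabase target are assumptions chosen for the example, not values from the chapter.

```python
import math


def mean_coverage(num_reads, read_length, target_length):
    """Average sequencing depth: total sequenced bases / target size."""
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    return num_reads * read_length / target_length


def reads_needed(coverage, read_length, target_length):
    """Smallest number of reads that reaches the requested mean depth."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return math.ceil(coverage * target_length / read_length)


# Assumed example values: 150-base reads over a 3-gigabase target.
depth = mean_coverage(600_000_000, 150, 3_000_000_000)
reads = reads_needed(30, 150, 3_000_000_000)
```

Under these assumed values, 600 million reads give a mean depth of 30X. This is an average, so individual regions may be covered more or less deeply.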

IntechOpen

Archivio istituzionale della ricerca - Università degli Studi di Udine

Looking at the nudibranch family Myrrhinidae (Gastropoda, Heterobranchia) from a mitochondrial ‘2D folding structure’ point of view

Author: Furfaro G.
Mariottini P.
Publication venue: 'MDPI AG'
Publication date: 01/01/2021

Integrative taxonomy is an evolving field of multidisciplinary studies often utilised to elucidate phylogenetic reconstructions that were poorly understood in the past. The systematics of many taxa have been resolved by combining data from different research approaches, i.e., molecular, ecological, behavioural, morphological and chemical. Regarding molecular analysis, there is currently a search for new genetic markers that could be diagnostic at different taxonomic levels and that can be added to the canonical ones. In marine Heterobranchia, the most widely used mitochondrial markers, COI and 16S, are usually analysed by comparing the primary sequence. The 16S rRNA molecule can be folded into a 2D secondary structure that has been poorly exploited in past studies of heterobranchs, despite 2D molecular analyses being sources of possible diagnostic characters. Comparison of the results from the phylogenetic analyses of a concatenated dataset (the nuclear H3 and the mitochondrial COI and 16S markers, including 30 species belonging to eight accepted genera) and from the 2D folding structure analyses of the 16S rRNA from the type species of the genera investigated demonstrated the diagnostic power of this RNA molecule to reveal the systematics of four genera belonging to the family Myrrhinidae (Gastropoda, Heterobranchia). The “molecular morphological” approach to the 16S rRNA proved to be a powerful tool for delimiting taxa at both species and genus levels and a useful way of recovering information that is usually lost in phylogenetic analyses. While the validity of the genera Godiva, Hermissenda and Phyllodesmium is confirmed, a new genus is necessary and introduced for Dondice banyulensis, Nemesis gen. nov., and the monospecific genus Nanuca is here synonymised with Dondice, with Nanuca sebastiani transferred into Dondice as Dondice sebastiani comb. nov.

Directory of Open Access Journals

PubMed Central

Archivio Istituzionale della Ricerca- Università del Salento