536 research outputs found
High Performance Biological Pairwise Sequence Alignment: FPGA versus GPU versus Cell BE versus GPP
This paper explores the pros and cons of reconfigurable computing in the form of FPGAs for high performance efficient computing. In particular, the paper presents the results of a comparative study between three different acceleration technologies, namely, Field Programmable Gate Arrays (FPGAs), Graphics Processor Units (GPUs), and IBMâs Cell Broadband Engine (Cell BE), in the design and implementation of the widely-used Smith-Waterman pairwise sequence alignment algorithm, with general purpose processors as a base reference implementation. Comparison criteria include speed, energy consumption, and purchase and development costs. The study shows that FPGAs largely outperform all other implementation platforms on performance per watt criterion and perform better than all other platforms on performance per dollar criterion, although by a much smaller margin. Cell BE and GPU come second and third, respectively, on both performance per watt and performance per dollar criteria. In general, in order to outperform other technologies on performance per dollar criterion (using currently available hardware and development tools), FPGAs need to achieve at least two orders of magnitude speed-up compared to general-purpose processors and one order of magnitude speed-up compared to domain-specific technologies such as GPUs
Protein alignment HW/SW optimizations
Biosequence alignment recently received an amazing support from both commodity and dedicated hardware platforms. The limitless requirements of this application motivate the search for improved implementations to boost processing time and capabilities. We propose an unprecedented hardware improvement to the classic Smith-Waterman (S-W) algorithm based on a twofold approach: i) an on-the-fly gap-open/gap-extension selection that reduces the hardware implementation complexity; ii) a pre-selection filter that uses reduced amino-acid alphabets to screen out not-significant sequences and to shorten the S-Witerations on huge reference databases.We demonstrated the improvements w.r.t. a classic approach both from the point of view of algorithm efficiency and of HW performance (FPGA and ASIC post-synthesis analysis)
Multiple Biolgical Sequence Alignment: Scoring Functions, Algorithms, and Evaluations
Aligning multiple biological sequences such as protein sequences or DNA/RNA sequences is a fundamental task in bioinformatics and sequence analysis. These alignments may contain invaluable information that scientists need to predict the sequences\u27 structures, determine the evolutionary relationships between them, or discover drug-like compounds that can bind to the sequences. Unfortunately, multiple sequence alignment (MSA) is NP-Complete. In addition, the lack of a reliable scoring method makes it very hard to align the sequences reliably and to evaluate the alignment outcomes.
In this dissertation, we have designed a new scoring method for use in multiple sequence alignment. Our scoring method encapsulates stereo-chemical properties of sequence residues and their substitution probabilities into a tree-structure scoring scheme. This new technique provides a reliable scoring scheme with low computational complexity.
In addition to the new scoring scheme, we have designed an overlapping sequence clustering algorithm to use in our new three multiple sequence alignment algorithms. One of our alignment algorithms uses a dynamic weighted guidance tree to perform multiple sequence alignment in progressive fashion. The use of dynamic weighted tree allows errors in the early alignment stages to be corrected in the subsequence stages. Other two algorithms utilize sequence knowledge-bases and sequence consistency to produce biological meaningful sequence alignments. To improve the speed of the multiple sequence alignment, we have developed a parallel algorithm that can be deployed on reconfigurable computer models. Analytically, our parallel algorithm is the fastest progressive multiple sequence alignment algorithm
Parallel progressive multiple sequence alignment on reconfigurable meshes
<p>Abstract</p> <p>Background</p> <p>One of the most fundamental and challenging tasks in bio-informatics is to identify related sequences and their hidden biological significance. The most popular and proven best practice method to accomplish this task is aligning multiple sequences together. However, multiple sequence alignment is a computing extensive task. In addition, the advancement in DNA/RNA and Protein sequencing techniques has created a vast amount of sequences to be analyzed that exceeding the capability of traditional computing models. Therefore, an effective parallel multiple sequence alignment model capable of resolving these issues is in a great demand.</p> <p>Results</p> <p>We design <it>O</it>(1) run-time solutions for both local and global dynamic programming pair-wise alignment algorithms on reconfigurable mesh computing model. To align <it>m </it>sequences with max length <it>n</it>, we combining the parallel pair-wise dynamic programming solutions with newly designed parallel components. We successfully reduce the progressive multiple sequence alignment algorithm's run-time complexity from <it>O</it>(<it>m </it>Ă <it>n</it><sup>4</sup>) to <it>O</it>(<it>m</it>) using <it>O</it>(<it>m </it>Ă <it>n</it><sup>3</sup>) processing units for scoring schemes that use three distinct values for match/mismatch/gap-extension. The general solution to multiple sequence alignment algorithm takes <it>O</it>(<it>m </it>Ă <it>n</it><sup>4</sup>) processing units and completes in <it>O</it>(<it>m</it>) time.</p> <p>Conclusions</p> <p>To our knowledge, this is the first time the progressive multiple sequence alignment algorithm is completely parallelized with <it>O</it>(<it>m</it>) run-time. We also provide a new parallel algorithm for the Longest Common Subsequence (LCS) with <it>O</it>(1) run-time using <it>O</it>(<it>n</it><sup>3</sup>) processing units. This is a big improvement over the current best constant-time algorithm that uses <it>O</it>(<it>n</it><sup>4</sup>) processing units.</p
- âŠ