Comparison of Data Partitioning Schema of Parallel Pairwise Alignment on Shared Memory System
The pairwise alignment (PA) algorithm is widely used in bioinformatics to analyze biological sequences. With advances in sequencing technology, massive numbers of DNA fragments are sequenced much more quickly and cheaply. The alignment algorithm needs to be parallelized to align them in a shorter time. Many previous studies have parallelized the PA algorithm using various data partitioning schemas, but it is unclear which one is best. The data partitioning schema is important for parallel PA performance because the algorithm uses a dynamic programming technique that requires intense inter-thread communication. In this paper, we compared four partitioning schemas to find the best-performing one on a shared memory system. The schemas are: blocked columnwise, rowwise, antidiagonal, and blocked columnwise with manual scheduling and loop unrolling. The last schema gave the best performance, with 89% efficiency on 4 threads. This result provides fine-grained parallelism that can be used to further develop parallel multiple sequence alignment (MSA).
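To illustrate why the partitioning schema matters, the following is a minimal sketch of the antidiagonal traversal order, assuming a Needleman-Wunsch style global alignment with linear gap penalties. The abstract does not say which PA variant or scoring the paper uses, so the function names and the scoring values (`match`, `mismatch`, `gap`) are illustrative assumptions. Each cell depends only on its left, upper, and upper-left neighbours, so all cells with the same `i + j` are independent of one another and could be handed to different threads. The sketch runs serially and only demonstrates that the antidiagonal order yields the same score as the rowwise order.

```python
def init_matrix(rows, cols, gap):
    """Create the DP matrix with the first row and column set to gap penalties."""
    H = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        H[i][0] = i * gap
    for j in range(cols):
        H[0][j] = j * gap
    return H


def cell(H, a, b, i, j, match, mismatch, gap):
    """Score of cell (i, j) from its upper-left, upper, and left neighbours."""
    diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
    return max(diag, H[i - 1][j] + gap, H[i][j - 1] + gap)


def score_rowwise(a, b, match=1, mismatch=-1, gap=-1):
    """Reference traversal: fill the matrix row by row."""
    rows, cols = len(a) + 1, len(b) + 1
    H = init_matrix(rows, cols, gap)
    for i in range(1, rows):
        for j in range(1, cols):
            H[i][j] = cell(H, a, b, i, j, match, mismatch, gap)
    return H[-1][-1]


def score_antidiagonal(a, b, match=1, mismatch=-1, gap=-1):
    """Fill the matrix one antidiagonal (constant i + j) at a time."""
    rows, cols = len(a) + 1, len(b) + 1
    H = init_matrix(rows, cols, gap)
    for d in range(2, rows + cols - 1):       # d = i + j for interior cells
        i_lo = max(1, d - (cols - 1))         # keep j = d - i within the matrix
        i_hi = min(rows - 1, d - 1)           # keep j >= 1
        # Cells in this inner loop do not depend on each other,
        # so a parallel version could split them across threads.
        for i in range(i_lo, i_hi + 1):
            j = d - i
            H[i][j] = cell(H, a, b, i, j, match, mismatch, gap)
    return H[-1][-1]
```

The antidiagonals near the corners are short, so threads would have little work at the start and end of the traversal; this kind of load imbalance is one reason a comparison against blocked schemas is worthwhile.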
Dynamic Multigrain Parallelization on the Cell Broadband Engine
This paper addresses the problem of orchestrating and scheduling
parallelism at multiple levels of granularity on heterogeneous
multicore processors. We present policies and mechanisms for adaptive
exploitation and scheduling of multiple layers of parallelism on the
Cell Broadband Engine. Our policies combine event-driven task
scheduling with malleable loop-level parallelism, which is exposed
from the runtime system whenever task-level parallelism leaves cores
idle. We present a runtime system for scheduling applications with
layered parallelism on Cell and investigate its potential with RAxML,
a computational biology application which infers large phylogenetic
trees, using the Maximum Likelihood (ML) method. Our experiments show
that the Cell benefits significantly from dynamic parallelization
methods that selectively exploit the layers of parallelism in the
system, in response to workload characteristics. Our runtime
environment outperforms naive parallelization and scheduling based on
MPI and Linux by up to a factor of 2.6. We are able to execute RAxML
on one Cell four times faster than on a dual-processor system with
Hyperthreaded Xeon processors, and 5-10% faster than on a
single-processor system with a dual-core, quad-thread IBM Power5
processor.
Exploring New Search Algorithms and Hardware for Phylogenetics: RAxML Meets the IBM Cell
Phylogenetic inference is considered to be one of the grand challenges in Bioinformatics due to its immense computational requirements. RAxML is currently among the fastest and most accurate programs for phylogenetic tree inference under the Maximum Likelihood (ML) criterion. First, we introduce new tree search heuristics that accelerate RAxML by a factor of 2.43 while returning equally good trees. The performance of the new search algorithm has been assessed on 18 real-world datasets comprising 148 to 4,843 DNA sequences. We then present the implementation, optimization, and evaluation of RAxML on the IBM Cell Broadband Engine. We address the problems and provide solutions pertaining to the optimization of floating point code, control flow, communication, and scheduling of multi-level parallelism on the Cell
RAxML-Cell: Parallel Phylogenetic Tree Inference on the Cell Broadband Engine
Phylogenetic tree reconstruction is one of the grand challenge
problems in Bioinformatics. The search for a best-scoring tree with 50
organisms, under a reasonable optimality criterion, creates a
topological search space which is as large as the number of atoms in
the universe. Computational phylogeny is challenging even for the most
powerful supercomputers. It is also an ideal candidate for
benchmarking emerging multiprocessor architectures, because it
exhibits various levels of fine and coarse-grain parallelism. In this
paper, we present the porting, optimization, and evaluation of RAxML
on the Cell Broadband Engine. RAxML is a provably efficient, hill
climbing algorithm for computing phylogenetic trees based on the
Maximum Likelihood (ML) method. The algorithm uses an embarrassingly
parallel search method, which also exhibits data-level parallelism and
control parallelism in the computation of the likelihood functions.
We present the optimization of one of the currently fastest tree
search algorithms, on a real Cell blade prototype. We also
investigate problems and present solutions pertaining to the
optimization of floating point code, control flow, communication,
scheduling, and multi-level parallelization on the Cell.
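The size of the topological search space mentioned above can be made concrete with a short calculation. This is a sketch under an assumption the abstract does not state: that the candidate trees are unrooted, fully binary trees with labeled leaves, for which the standard count is the double factorial (2n - 5)!! for n >= 3 organisms. The function name is illustrative.

```python
def count_unrooted_binary_trees(n_taxa):
    """Number of distinct unrooted binary tree topologies on n labeled leaves.

    Uses the standard formula (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5),
    valid for n >= 3.
    """
    if n_taxa < 3:
        raise ValueError("formula applies to 3 or more taxa")
    total = 1
    for factor in range(3, 2 * n_taxa - 4, 2):  # odd factors 3 .. 2n - 5
        total *= factor
    return total
```

Under that assumption, `count_unrooted_binary_trees(50)` is roughly 2.8 x 10^74, which is why exhaustive search is ruled out and heuristic searches such as hill climbing are used instead.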
Concurrent and Accurate RNA Sequencing on Multicore Platforms
In this paper we introduce a novel parallel pipeline for fast and accurate
mapping of RNA sequences on servers equipped with multicore processors. Our
software, named HPG-Aligner, leverages the speed of the Burrows-Wheeler
Transform to map a large number of RNA fragments (reads) rapidly, as well as
the accuracy of the Smith-Waterman algorithm, which is employed to deal with
conflictive reads. The aligner is complemented with a careful strategy to
detect splice junctions based on the division of RNA reads into short segments
(or seeds), which are then mapped onto a number of candidate alignment
locations, providing useful information for the successful alignment of the
complete reads.
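For readers unfamiliar with the Burrows-Wheeler Transform mentioned above, here is a minimal sketch of the transform and its inverse. It uses the naive sorted-rotations construction and is not how HPG-Aligner or any production aligner builds its index; those rely on far more efficient index structures. The function names and the `$` end-of-text sentinel are assumptions made for illustration.

```python
def bwt(text, sentinel="$"):
    """Burrows-Wheeler Transform via sorted rotations (naive, for illustration)."""
    if sentinel in text:
        raise ValueError("text must not contain the sentinel character")
    s = text + sentinel
    # Sort all cyclic rotations and take the last character of each.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed, sentinel="$"):
    """Recover the original text by repeatedly prepending and re-sorting."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        table = sorted(c + row for c, row in zip(transformed, table))
    # The original string is the row that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]
```

The transform groups characters that share the same following context, which is the property that compressed indexes built on it exploit to look up short reads quickly.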
Experimental results on platforms with AMD and Intel multicore processors
show the remarkable parallel performance of HPG-Aligner on short and long
RNA reads, which surpasses, in both execution time and sensitivity, a
state-of-the-art aligner such as TopHat 2 built on top of Bowtie and Bowtie 2
- …