Grid and high performance computing applied to bioinformatics
Recent advances in genome sequencing technologies and modern biological data
analysis technologies used in bioinformatics have led to a fast and continuous increase
in biological data. The difficulty of managing the huge amounts of data currently
available to researchers and the need to have results within a reasonable time have
led to the use of distributed and parallel computing infrastructures for their analysis.
In this context, Grid computing has been used successfully. Grid computing is based
on a distributed system that interconnects several computers and/or clusters to
access global-scale resources. This infrastructure is flexible, highly scalable, and can
achieve high performance on data- and compute-intensive algorithms.
Bioinformatics has recently begun exploring new approaches based on hardware
accelerators such as Graphics Processing Units (GPUs). Initially developed as
graphics cards, GPUs have recently been adopted for scientific purposes because of
their performance per watt and their better cost/performance ratio, in terms of
throughput and response time, compared to other high-performance computing
solutions.
Although developers must have an in-depth knowledge of GPU programming and
hardware to be effective, GPU accelerators have produced many impressive results.
The use of high-performance computing infrastructures raises the question of how
to parallelize the algorithms while limiting data dependency issues, in order
to accelerate computations on massively parallel hardware.
In this context, the research activity in this dissertation focused on assessing
and testing the impact of these innovative high-performance computing technologies
on computational biology. To achieve high levels of parallelism and, ultimately,
high performance, some bioinformatics algorithms applicable to genome data analysis
were selected, analyzed and implemented. These algorithms were highly parallelized
and optimized to make full use of the GPU hardware resources. The overall results
show that the proposed parallel algorithms perform well, thus justifying the use
of such technology.
Furthermore, a software infrastructure for workflow management has been devised to
support CPU and GPU computation on a distributed GPU-based infrastructure. This
software infrastructure also allows further coarse-grained data-parallel
parallelization across multiple GPUs. Results show that the speed-up of the proposed
application increases with the number of GPUs.
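
As a rough illustration of the coarse-grained data-parallel scheme mentioned above,
the following minimal Python sketch splits a set of sequences into one chunk per
worker, processes the chunks independently, and combines the partial results. It is
not the dissertation's implementation: the function names (split_into_chunks,
gc_count, run_data_parallel, speedup) are hypothetical, a thread pool stands in for
the GPUs, and GC counting stands in for a real per-device kernel. Speed-up is taken
in the usual sense, single-device time divided by multi-device time.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_workers):
    """Split items into n_workers contiguous chunks of near-equal size."""
    base, extra = divmod(len(items), n_workers)
    chunks, start = [], 0
    for i in range(n_workers):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def gc_count(sequences):
    """Stand-in for a per-device kernel: count G and C bases in one chunk."""
    return sum(seq.count("G") + seq.count("C") for seq in sequences)


def run_data_parallel(sequences, n_workers):
    """Process each chunk independently, then combine the partial results."""
    chunks = split_into_chunks(sequences, n_workers)
    # Assumption: each worker thread plays the role of one GPU.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(gc_count, chunks))
    return sum(partials)


def speedup(t_single, t_parallel):
    """Speed-up: single-device run time divided by multi-device run time."""
    return t_single / t_parallel
```

Because the chunks share no data, no communication is needed until the final
reduction, which is why this kind of partitioning can scale with the number of
devices as long as the chunks are large enough to keep each device busy.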