Computing Platforms for Big Biological Data Analytics: Perspectives and Challenges.
The last decade has witnessed an explosion in the amount of available biological sequence data, due to the rapid progress of high-throughput sequencing projects. However, the volume of biological data has grown so large that traditional data analysis platforms and methods can no longer meet the need to perform analysis tasks rapidly in the life sciences. As a result, both biologists and computer scientists face the challenge of gaining profound insight into the deepest biological functions from big biological data, which in turn requires massive computational resources. Therefore, high performance computing (HPC) platforms are needed, together with efficient and scalable algorithms that can take advantage of these platforms. In this paper, we survey the state-of-the-art HPC platforms for big biological data analytics. We first list the characteristics of big biological data and popular computing platforms. Then we provide a taxonomy of different biological data analysis applications and a survey of the ways they have been mapped onto various computing platforms. After that, we present a case study comparing the efficiency of different computing platforms for handling the classical biological sequence alignment problem. Finally, we discuss the open issues in big biological data analytics.
Reconfigurable acceleration of genetic sequence alignment: A survey of two decades of efforts
Genetic sequence alignment has always been a computational challenge in bioinformatics. Depending on the problem size, software-based aligners can take multiple CPU-days to process the sequence data, creating a bottleneck in the bioinformatic analysis flow. Reconfigurable accelerators can achieve high performance for such computation by providing massive parallelism, but at the expense of programming flexibility, and have therefore not been commensurately adopted by practitioners. This paper aims to provide a thorough survey of the proposed accelerators, giving a qualitative categorization based on their algorithms and speedup. A comprehensive comparison between works is also presented, so as to guide selection for biologists and to provide insight on future research directions for FPGA scientists.
FPGA acceleration of DNA sequence alignment: design analysis and optimization
Existing FPGA accelerators for short read mapping often fail to utilize the complete biological information in sequencing data, in favor of simple hardware design, leading to missed or incorrect alignments. In this work, we propose a runtime reconfigurable alignment pipeline that considers all information in sequencing data for the biologically accurate acceleration of short read mapping. We focus our efforts on accelerating two string matching techniques commonly used in short read mapping: the FM-index and the Smith-Waterman algorithm with the affine-gap model. We further optimize the FPGA hardware using a design analyzer and merger to improve alignment performance. The contributions of this work are as follows.
1. We accelerate the exact-match and mismatch alignment by leveraging the FM-index technique. We optimize memory access by compressing the data structure and interleaving the access with multiple short reads. The FM-index hardware also considers complete information in the read data to maximize accuracy.
2. We propose a seed-and-extend model to accelerate alignment with indels. The FM-index hardware is extended to support the seeding stage while a Smith-Waterman implementation with the affine-gap model is developed on FPGA for the extension stage. This model can improve the efficiency of indel alignment with comparable accuracy versus state-of-the-art software.
3. We present an approach for merging multiple FPGA designs into a single hardware design, so that multiple place-and-route tasks can be replaced by a single task to speed up the functional evaluation of designs. We first experiment with this approach to demonstrate its feasibility for different designs. Then we apply this approach to optimize one of the proposed FPGA aligners for better alignment performance.
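Contribution 1 above builds on FM-index backward search. As a rough illustration of that technique (not the paper's hardware design; the function names and the naive, uncompressed index layout below are my own), the following sketch builds a toy FM-index and counts exact occurrences of a pattern:

```python
def build_fm_index(text):
    """Build a toy FM-index: first-column counts C and occurrence table occ.

    A real aligner would compress these structures; here they are kept plain
    for clarity.
    """
    text += "$"                      # unique end-of-text sentinel
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    bwt = "".join(rot[-1] for rot in rotations)
    alphabet = sorted(set(bwt))
    # C[c]: number of characters in the text strictly smaller than c
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += bwt.count(c)
    # occ[c][i]: occurrences of c in bwt[:i]
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return C, occ, len(bwt)

def count_occurrences(C, occ, n, pattern):
    """Backward search: number of exact occurrences of pattern in the text."""
    lo, hi = 0, n                    # current suffix-array interval [lo, hi)
    for c in reversed(pattern):      # process the pattern right to left
        if c not in C:
            return 0
        lo = C[c] + occ[c][lo]
        hi = C[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo
```

The memory-access interleaving described in contribution 1 targets exactly the `occ` lookups in this loop, which dominate the cost of backward search.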
Algorithm-Hardware Co-Design for Performance-driven Embedded Genomics
PhD Thesis
Genomics includes the development of techniques for the diagnosis, prognosis and therapy of over 6000 known genetic disorders. It is a major driver in the transformation of medicine from its reactive form to the personalized, predictive, preventive and participatory (P4) form. The availability of the genome is an essential prerequisite to genomics and is obtained from the sequencing and analysis pipelines of whole genome sequencing (WGS). The advent of second generation sequencing (SGS) significantly reduced sequencing costs, leading to voluminous research in genomics. SGS technologies, however, generate massive volumes of data in the form of reads, which are fragments of the real genome. The performance requirements associated with mapping reads to the reference genome (RG), in order to reassemble the original genome, now stand disproportionate to the available computational capabilities. Conventionally, the hardware resources used are made of homogeneous many-core architectures employing complex general-purpose CPU cores. Although these cores provide high performance, a data-centric approach is required to identify alternate hardware systems more suitable for affordable and sustainable genome analysis.
Most state-of-the-art genomic tools are performance oriented and do not address the crucial aspect of energy consumption. Although algorithmic innovations have reduced runtime on conventional hardware, energy consumption has scaled poorly. The associated monetary and environmental costs have made it a major bottleneck to translational genomics. This thesis is concerned with the development and validation of read mappers for the embedded genomics paradigm, aiming to provide a portable and energy-efficient hardware solution to the reassembly pipeline. It applies the algorithm-hardware co-design approach to bridge the saturation point reached in algorithmic innovations with emerging low-power/energy heterogeneous embedded platforms.
Essential to the embedded paradigm is the ability to use heterogeneous hardware resources. Graphics processing units (GPUs) are often available in modern devices alongside the CPU but, conventionally, state-of-the-art read mappers are not tuned to use both together. The first part of the thesis develops a Cross-platfOrm Read mApper using opencL (CORAL) that can distribute the workload across all available devices for high performance. The OpenCL framework mitigates the need to design separate kernels for CPU and GPU. CORAL implements a verification-aware filtration algorithm for rapid pruning and identification of candidate locations for mapping reads to the RG.
Mapping reads on embedded platforms suffers reduced performance due to architectural differences such as limited on-chip/off-chip memory, smaller bandwidths and simpler cores. To mitigate this performance degradation, in the second part of the thesis we propose a REad maPper for heterogeneoUs sysTEms (REPUTE), which uses an efficient dynamic programming (DP) based filtration methodology. Using algorithm-hardware co-design and kernel-level optimizations to reduce its memory footprint, REPUTE demonstrates significant energy savings on the HiKey970 embedded platform with acceptable performance.
The third part of the thesis concentrates on mapping the whole genome on an embedded platform. We propose a Pyopencl based tooL for gEnomic workloaDs tarGeting Embedded platfoRms (PLEDGER), which includes two novel contributions. The first is a novel preprocessing strategy that generates a low-memory-footprint (LMF) data structure able to fit all human chromosomes, at some cost in performance. The second is an LMF DP-based filtration method designed to work in conjunction with the proposed data structures. To mitigate performance degradation, the kernel employs several optimisations, including extensive use of bit-vector operations. Extensive experiments using real human reads were carried out with state-of-the-art read mappers on 5 different platforms for CORAL, REPUTE and PLEDGER. The results show that embedded genomics provides significant energy savings with similar performance compared to conventional CPU-based platforms.
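As a loose illustration of the bit-vector style of optimization the thesis credits for its filtration kernels (this is my own toy sketch, not PLEDGER's actual kernel), mismatches between two equal-length DNA strings can be counted with a 2-bit encoding, XOR and masking instead of a character-by-character loop:

```python
# 2-bit encoding of the DNA alphabet; one base per 2-bit slot of an integer.
ENCODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}

def pack(seq):
    """Pack a DNA string into an integer, 2 bits per base."""
    word = 0
    for base in seq:
        word = (word << 2) | ENCODE[base]
    return word

def mismatches(seq_a, seq_b):
    """Count positions where the two sequences differ, via XOR + masking."""
    assert len(seq_a) == len(seq_b)
    diff = pack(seq_a) ^ pack(seq_b)       # nonzero 2-bit slot == mismatch
    lo_mask = int("01" * len(seq_a), 2)    # low bit of every 2-bit slot
    hits = (diff | (diff >> 1)) & lo_mask  # fold each slot down to one bit
    return bin(hits).count("1")
```

On hardware with wide registers the same fold runs over an entire read per instruction, which is why this style of kernel maps well to the embedded GPUs the thesis targets.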
Comparative Analysis of Computationally Accelerated NGS Alignment
The Smith-Waterman (S-W) algorithm is the basis of most current sequence alignment technology; it can be used to identify similarities between sequences for cancer detection and treatment because it provides researchers with potential targets for early diagnosis and personalized treatment. The growing number of DNA and RNA sequences available for analysis necessitates faster alignment than is possible with current implementations of the S-W algorithm. This project aimed to identify the most effective and efficient methods for accelerating the S-W algorithm by investigating recent advances in sequence alignment. Of the 22 articles considered in this project, 17 had to be excluded from the study due to a lack of standardization in data reporting. Only one study, by Chen et al., contained enough information to compare accuracy and alignment speed. When accuracy was excluded from the criteria, five studies contained enough information to rank their efficiency. The study by Rucci et al. was the fastest at 268.83 Giga Cell Updates Per Second (GCUPS), and the method by Pérez-Serrano et al. came close at 229.93 GCUPS while testing larger sequences. It was determined that reporting standards in this field are not sufficient, and the study by Chen et al. should set a benchmark for future reporting.
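The GCUPS figures quoted above normalize alignment speed by DP-matrix size: one "cell update" per entry of the m x n Smith-Waterman table. A minimal sketch of the metric (the function name is mine, not from any of the cited studies):

```python
def gcups(query_len, target_len, seconds):
    """Giga Cell Updates Per Second for one pairwise alignment.

    The S-W DP matrix has query_len * target_len cells, each filled once.
    """
    return (query_len * target_len) / seconds / 1e9
```

Because the metric depends only on matrix size and wall time, comparisons are meaningful only when studies also report sequence lengths and what the timing includes (e.g. memory transfers), which is exactly the standardization gap this review identifies.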
Genomic co-processor for long read assembly
Genomics data is transforming medicine and our understanding of life in fundamental ways; however, its growth is far outpacing Moore's Law. Third-generation sequencing technologies produce reads 100X longer than second-generation technologies and reveal a much broader mutation spectrum of disease and evolution. However, these technologies incur prohibitively high computational costs. To enable the vast potential of exponentially growing genomics data, domain-specific acceleration provides one of the few remaining approaches to continue scaling compute performance and efficiency, since general-purpose architectures are struggling to handle the huge amount of data needed for genome alignment. The aim of this project is to implement a genomic co-processor targeting HPC FPGAs, starting from the Darwin FPGA co-processor. In this scenario, the final objective is the simulation and implementation of the algorithms described by Darwin on Alveo boards, exploiting High Bandwidth Memory (HBM) to increase performance.
FPGA Acceleration of Pre-Alignment Filters for Short Read Mapping With HLS
Pre-alignment filters are useful for reducing the computational requirements of genomic sequence mappers. Most of them are based on estimating or computing the edit distance between sequences and their candidate locations in a reference genome using a subset of the dynamic programming table used to compute the Levenshtein distance. Some of their FPGA implementations use classic HDL toolchains, thus limiting their portability. Currently, most FPGA accelerators offered by heterogeneous cloud providers support C/C++ HLS. In this work, we implement and optimize several state-of-the-art pre-alignment filters using C/C++-based HLS to expand their portability to the wide range of systems supporting the OpenCL runtime. Moreover, we perform a complete analysis of the performance and accuracy of the filters and analyze the implications of the results. The maximum throughput obtained by an exact filter is 95.1 MPairs/s including memory transfers using 100 bp sequences, which is the highest ever reported for a comparable system and more than two times faster than previous HDL-based results. The best energy efficiency obtained from the accelerator (not considering the host CPU) is 2.1 MPairs/J, more than one order of magnitude higher than other comparable accelerator-based approaches from the state of the art.
10.13039/501100008530-European Union Regional Development Fund (ERDF), within the framework of the ERDF Operational Program of Catalonia 2014-2020, with a grant of 50% of the total cost eligible, under the Designing RISC-V based Accelerators for next generation computers (DRAC) project (Grant Number: 001-P-001723)
10.13039/501100002809-Catalan Government (Grant Numbers: 2017-SGR-313 and 2017-SGR-1624)
10.13039/501100004837-Spanish Ministry of Science, Innovation and Universities (Grant Numbers: PID2020-113614RB-C21 and RTI2018-095209-B-C22)
Peer Reviewed. Postprint (published version)
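The filters surveyed in this entry share one idea: compute only part of the Levenshtein DP table. As a hedged sketch of that idea (my own simplification, not any of the paper's specific filters), a banded edit-distance check can accept or reject a read/candidate pair against an error budget e while filling only a diagonal band of width 2e+1:

```python
def passes_filter(read, candidate, e):
    """True if edit_distance(read, candidate) <= e, computed over a band.

    Cells outside the diagonal band |i - j| <= e cannot lie on any alignment
    path with at most e errors, so they are pinned to the sentinel e + 1.
    """
    m, n = len(read), len(candidate)
    if abs(m - n) > e:
        return False                       # length gap alone exceeds the budget
    INF = e + 1                            # "more errors than we can afford"
    prev = [j if j <= e else INF for j in range(n + 1)]
    for i in range(1, m + 1):
        curr = [INF] * (n + 1)
        if i <= e:
            curr[0] = i
        lo, hi = max(1, i - e), min(n, i + e)
        for j in range(lo, hi + 1):
            cost = 0 if read[i - 1] == candidate[j - 1] else 1
            curr[j] = min(prev[j - 1] + cost,   # match / mismatch
                          prev[j] + 1,          # deletion from the read
                          curr[j - 1] + 1)      # insertion into the read
            curr[j] = min(curr[j], INF)         # clamp at the sentinel
        prev = curr
    return prev[n] <= e
```

For short reads and small e this touches O(e * m) cells instead of O(m * n), which is the saving the hardware filters exploit; the exact filters in the paper additionally decide in O(1) per cell with bit-level tricks.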