Motivation: The ability to generate massive amounts of sequencing data
continues to overwhelm the processing capability of existing algorithms and
compute infrastructures. In this work, we explore the use of hardware/software
co-design and hardware acceleration to significantly reduce the execution time
of short sequence alignment, a crucial step in analyzing sequenced genomes. We
introduce Shouji, a highly-parallel and accurate pre-alignment filter that
remarkably reduces the need for computationally-costly dynamic programming
algorithms. The first key idea of our proposed pre-alignment filter is to
provide high filtering accuracy by correctly detecting all common subsequences
shared between two given sequences. The second key idea is to design a hardware
accelerator that adopts modern FPGA (Field-Programmable Gate Array)
architectures to further boost the performance of our algorithm.
Results: Shouji significantly improves the accuracy of pre-alignment
filtering by up to two orders of magnitude compared to the state-of-the-art
pre-alignment filters, GateKeeper and SHD. Our FPGA-based accelerator is up to
three orders of magnitude faster than the equivalent CPU implementation of
Shouji. Using a single FPGA chip, we benchmark the benefits of integrating
Shouji with five state-of-the-art sequence aligners, designed for different
computing platforms. The addition of Shouji as a pre-alignment step reduces the
execution time of the five state-of-the-art sequence aligners by up to 18.8x.
Shouji can be adapted for any bioinformatics pipeline that performs sequence
alignment for verification. Unlike most existing methods that aim to accelerate
sequence alignment, Shouji does not sacrifice any of the aligner capabilities,
as it does not modify or replace the alignment step.
Availability: https://github.com/CMU-SAFARI/ShoujiComment: https://academic.oup.com/bioinformatics/advance-article-abstract/doi/10.1093/bioinformatics/btz234/5421509,
Bioinformatics Journal 201