Dataflow acceleration of Smith-Waterman with Traceback for high throughput Next Generation Sequencing

Abstract

Smith-Waterman algorithm is widely adopted bymost popular DNA sequence aligners. The inherent algorithmcomputational intensity and the vast amount of NGS input datait operates on, create a bottleneck in genomic analysis flows forshort-read alignment. FPGA architectures have been extensivelyleveraged to alleviate the problem, each one adopting a differentapproach. In existing solutions, effective co-design of the NGSshort-read alignment still remains an open issue, mainly due tonarrow view on real integration aspects, such as system widecommunication and accelerator call overheads. In this paper, wepropose a dataflow architecture for Smith-Waterman Matrix-filland Traceback alignment stages, to perform short-read alignmenton NGS data. The architectural decision of moving both stages onchip extinguishes the communication overhead, and coupled withradical software restructuring, allows for efficient integration intowidely-used Bowtie2 aligner. This approach delivers×18 speedupover the respective Bowtie2 standalone components, while our co-designed Bowtie2 demonstrates a 35% boost in performance

    Similar works