    ApHMM: Accelerating Profile Hidden Markov Models for Fast and Energy-Efficient Genome Analysis

    Profile hidden Markov models (pHMMs) are widely used in bioinformatics applications to identify similarities between biological sequences, such as DNA or protein sequences. In pHMMs, sequences are represented as graph structures whose states and edges are assigned probabilities. These probabilities are subsequently used to compute the similarity score between a sequence and a pHMM graph. The Baum-Welch algorithm, a prevalent and highly accurate method, uses these probabilities to optimize and compute similarity scores. However, the Baum-Welch algorithm is computationally intensive, and existing solutions offer either software-only or hardware-only approaches with fixed pHMM designs. We identify an urgent need for a flexible, high-performance, and energy-efficient HW/SW co-design to address the major inefficiencies in the Baum-Welch algorithm for pHMMs. We introduce ApHMM, the first flexible acceleration framework designed to significantly reduce both the computational and energy overheads of the Baum-Welch algorithm for pHMMs. ApHMM tackles these inefficiencies by 1) designing flexible hardware to accommodate various pHMM designs, 2) exploiting predictable data dependency patterns through on-chip memory with memoization techniques, 3) rapidly filtering out negligible computations using a hardware-based filter, and 4) minimizing redundant computations. ApHMM achieves speedups of 15.55x–260.03x, 1.83x–5.34x, and 27.97x over CPU, GPU, and FPGA implementations of the Baum-Welch algorithm, respectively. ApHMM outperforms state-of-the-art CPU implementations in three key bioinformatics applications, 1) error correction, 2) protein family search, and 3) multiple sequence alignment, by 1.29x–59.94x, 1.03x–1.75x, and 1.03x–1.95x, respectively, while improving their energy efficiency by 64.24x–115.46x, 1.75x, and 1.96x, respectively.
    Comment: Accepted to ACM TAC
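As a rough illustration of how state and edge probabilities yield a similarity score, the sketch below runs the forward pass that Baum-Welch relies on in its expectation step. It is a minimal sketch over a generic HMM, not a full pHMM (it has no silent deletion states), and it is not ApHMM's implementation. The function name `forward_likelihood`, the dictionary layout, and the `min_prob` cutoff (a software stand-in for the idea of filtering negligible computations) are all assumptions made for this example.

```python
def forward_likelihood(obs, init, trans, emit, min_prob=0.0):
    """Return P(obs | model) computed with the forward algorithm.

    init[s]      start probability of state s
    trans[s][t]  transition probability from state s to state t
    emit[s][c]   probability that state s emits symbol c
    min_prob     hypothetical cutoff: forward values at or below it are skipped
    """
    if not obs:
        raise ValueError("obs must contain at least one symbol")
    # Initialise the forward values with the first observed symbol.
    alpha = {s: init[s] * emit[s].get(obs[0], 0.0) for s in init}
    for symbol in obs[1:]:
        # Keep only states whose forward value is large enough to matter.
        active = {s: a for s, a in alpha.items() if a > min_prob}
        nxt = {}
        for t in init:
            total = sum(a * trans[s].get(t, 0.0) for s, a in active.items())
            nxt[t] = total * emit[t].get(symbol, 0.0)
        alpha = nxt
    # The likelihood is the sum of forward values over all final states.
    return sum(alpha.values())


# Toy two-state model (match "M", insert "I"); all numbers are made up.
init = {"M": 1.0, "I": 0.0}
trans = {"M": {"M": 0.9, "I": 0.1}, "I": {"M": 0.5, "I": 0.5}}
emit = {"M": {"A": 0.8, "C": 0.2}, "I": {"A": 0.5, "C": 0.5}}
score = forward_likelihood("AC", init, trans, emit)  # about 0.184
```

Raising `min_prob` trades accuracy for less work, since states with small forward values stop contributing to later steps.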

    Acceleration of the Pair-HMM forward algorithm on FPGA with cloud integration for GATK

    The Pair-HMM forward algorithm is an essential algorithm found in many genomics-related analyses. The high number of floating-point operations in the algorithm makes it one of the main contributors to the compute time of analysis pipelines. To speed up computations, we propose an FPGA-based hardware accelerator for the Amazon AWS F1 cloud platform. The accelerator is open source and has been tested within the popular Genome Analysis Toolkit (GATK) pipeline. The accelerator achieved up to a 15× speed-up over the software implementation when used in-pipeline. The accelerator has also been tested in the experimental Spark (distributed) version of the GATK HaplotypeCaller tool. An in-depth analysis of the compute time contributions made it possible to identify the main bottlenecks for accelerators in the GATK pipeline, resulting in a hybrid CPU-FPGA solution that best exploits both resources.
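To show where the floating-point work comes from, here is a minimal sketch of a three-state (match, insertion, deletion) Pair-HMM forward pass. It follows the commonly described recurrences for scoring a read against a haplotype, but it is not the accelerator's or GATK's code: the function name `pair_hmm_forward` and the constant `gap_open`, `gap_ext`, and `base_err` values are illustrative assumptions, whereas real pipelines derive such parameters from per-base quality scores.

```python
def pair_hmm_forward(read, hap, gap_open=0.001, gap_ext=0.1, base_err=0.01):
    """Return the forward likelihood of `read` given haplotype `hap`.

    All transition and error parameters are illustrative constants.
    """
    if not read or not hap:
        raise ValueError("read and hap must be non-empty")
    t_mm = 1.0 - 2.0 * gap_open    # match -> match
    t_mi = t_md = gap_open         # match -> insertion / deletion
    t_im = t_dm = 1.0 - gap_ext    # insertion / deletion -> match
    t_ii = t_dd = gap_ext          # gap extension
    rows, cols = len(read) + 1, len(hap) + 1
    M = [[0.0] * cols for _ in range(rows)]
    I = [[0.0] * cols for _ in range(rows)]
    D = [[0.0] * cols for _ in range(rows)]
    # Let the read start at any haplotype position with equal probability.
    for j in range(cols):
        D[0][j] = 1.0 / len(hap)
    for i in range(1, rows):
        for j in range(1, cols):
            # Emission prior: a base either matches or is one of three errors.
            match = read[i - 1] == hap[j - 1]
            prior = 1.0 - base_err if match else base_err / 3.0
            M[i][j] = prior * (t_mm * M[i - 1][j - 1]
                               + t_im * I[i - 1][j - 1]
                               + t_dm * D[i - 1][j - 1])
            I[i][j] = t_mi * M[i - 1][j] + t_ii * I[i - 1][j]
            D[i][j] = t_md * M[i][j - 1] + t_dd * D[i][j - 1]
    # Sum over every haplotype position at which the read can end.
    return sum(M[-1][j] + I[-1][j] for j in range(1, cols))
```

Each cell costs about a dozen multiplications and additions, and the table has len(read) × len(hap) cells per read-haplotype pair. The dependencies point only up, left, and up-left, so cells on the same anti-diagonal are independent of one another, which is what makes this kind of recurrence amenable to parallel hardware.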