A Recent (2020) Comparative Analysis of Genome Aligners Shows HISAT2 and BWA are Among the Best Tools

Abstract

Genome aligners are an important tool in bioinformatics research: they can be used to detect gene variants that lead to higher crop yields, detect abnormal gene production in cancer cell lines, or identify weaknesses in a newly discovered pathogen. Aligners work by taking sequenced DNA or RNA reads and mapping them to their corresponding locations in a reference genome. Choosing an aligner for a project is often difficult, however, because a large number of tools are available and each claims to be the best at what it does. The goal of this project is to determine which of six of the most widely used genome aligners performs best in a controlled environment using default settings: Bowtie2 (in both end-to-end and local alignment modes), Burrows-Wheeler Aligner (BWA), Hierarchical Indexing for Spliced Alignment of Transcripts (HISAT2), MUMmer4, Spliced Transcripts Alignment to a Reference (STAR), and TopHat2. Each aligner was run on 48 geographically distinct samples of Erysiphe necator, more commonly known as powdery mildew. Alignment results were assessed on three major criteria: 1) the number of reads successfully mapped to the reference genome, 2) runtime across a varying number of cores, and 3) the percentage of the full transcriptome covered. Aligners were further analyzed for potential biases in the types of genes that could not be mapped. The results for each aligner were compared against one another to determine which aligner performed best on the provided dataset. The two best-performing aligners were BWA, which achieved the highest alignment rate, and HISAT2, which achieved the fastest runtime. Overall, HISAT2 was determined to be the better of the two, because both aligners achieved similar transcriptome coverage regardless of alignment rate.
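
As an illustration of the first and third assessment criteria, the following is a minimal Python sketch of how an alignment rate and a transcriptome coverage percentage might be computed from read counts. The function names, the min_reads threshold, and the definition of a "covered" transcript (at least one mapped read) are assumptions made for this example and are not taken from the study; the counts in the usage note are invented placeholders, not results.

```python
def alignment_rate(mapped_reads, total_reads):
    """Return the percentage of reads that mapped to the reference genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * mapped_reads / total_reads


def transcriptome_coverage(reads_per_transcript, min_reads=1):
    """Return the percentage of transcripts with at least min_reads mapped reads.

    reads_per_transcript maps a transcript ID to its mapped read count.
    Treating a transcript as covered once it has min_reads reads is an
    assumption of this sketch, not the study's stated definition.
    """
    if not reads_per_transcript:
        raise ValueError("reads_per_transcript must not be empty")
    covered = sum(1 for count in reads_per_transcript.values() if count >= min_reads)
    return 100.0 * covered / len(reads_per_transcript)
```

For example, alignment_rate(90, 100) gives 90.0, and a hypothetical sample in which two of four transcripts received reads gives a coverage of 50.0. Two aligners with different alignment rates can therefore still reach the same coverage if the extra mapped reads fall on transcripts that were already covered.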
