Search CORE

52 research outputs found

Reconfigurable computing for large-scale graph traversal algorithms

Author: Betkaoui Brahim
Publication venue: Computing, Imperial College London
Publication date: 01/09/2014
Field of study

This thesis proposes a reconfigurable computing approach for supporting parallel processing in large-scale graph traversal algorithms. Our approach is based on a reconfigurable hardware architecture which exploits the capabilities of both FPGAs (Field-Programmable Gate Arrays) and a multi-bank parallel memory subsystem. The proposed methodology to accelerate graph traversal algorithms has been applied to three case studies, revealing that application-specific hardware customisations can benefit performance. A summary of our four contributions is as follows. First, a reconfigurable computing approach to accelerate large-scale graph traversal algorithms. We propose a reconfigurable hardware architecture which decouples computation and communication while keeping multiple memory requests in flight at any given time, taking advantage of the high bandwidth of multi-bank memory subsystems. Second, a demonstration of the effectiveness of our approach through two case studies: the breadth-first search algorithm, and a graphlet counting algorithm from bioinformatics. Both case studies involve graph traversal, but each of them adopts a different graph data representation. Third, a method for using on-chip memory resources in FPGAs to reduce off-chip memory accesses for accelerating graph traversal algorithms, through a case-study of the All-Pairs Shortest-Paths algorithm. This case study has been applied to process human brain network data. Fourth, an evaluation of an approach based on instruction-set extension for FPGA design against many-core GPUs (Graphics Processing Units), based on a set of benchmarks with different memory access characteristics. It is shown that while GPUs excel at streaming applications, the proposed approach can outperform GPUs in applications with poor locality characteristics, such as graph traversal problems.Open Acces

Spiral - Imperial College Digital Repository

Hardware acceleration of genomics data analysis: challenges and opportunities

Author: Abdallah
Al Kawam
Al-Absi
Alser
Alser
Altschul
Angerer
Antipov
Arram
Arram
Audano
Ayling
Bahrebar
Banerjee
Bao
Bao
Barron
Behjati
Bohannan
Brittain
Broad Institute
Broad Institute
Cardon
Carrillo
Carrillo
Challis
Chen
Chen
Ciccolella
Cingolani
Clark
Croville
Das
Denti
Doan
Dobin
Du
Fei
Fleckhaus
Fonseca
Genome Research Ltd
Ghurye
Golosova
Goodwin
Goyal
Gök
Hackl
Hasnain
Houtgast
Hu
Illumina Inc
Jackson
Javed
Joardar
Joshi
Jourdren
Kaplan
Kent
Kim
Kim
Kosuri
Langmead
Langmead
Langmead
Lesk
Li
Li
Li
Li
Li
Li
Li
Lightbody
Lightbody
Liu
Liu
Liu
Lv
Margulies
Maruyama
Mcvicar
Milward
Muir
NCBI
Niedringhaus
Nsame
Orth
Oxford Nanopore Technologies
Park
Patel
Payne
Peddie
Rizzo
Robinson
Sarkar
Sboner
Schatz
Shang
Shang
Sharifi
Subbulakshmi
Sundfeld
Tian
Tsai
Turakhia
Turakhia
Wang
Wang
Ward
xilinx
Yano
Zaharia
Zokaee
Publication venue: 'Oxford University Press (OUP)'
Publication date: 25/05/2021
Field of study

Crossref

Ulster University's Research Portal

Using reconfigurable computing technology to accelerate matrix decomposition and applications

Author: Wang Xinying
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2016
Field of study

Matrix decomposition plays an increasingly significant role in many scientific and engineering applications. Among numerous techniques, Singular Value Decomposition (SVD) and Eigenvalue Decomposition (EVD) are widely used as factorization tools to perform Principal Component Analysis for dimensionality reduction and pattern recognition in image processing, text mining and wireless communications, while QR Decomposition (QRD) and sparse LU Decomposition (LUD) are employed to solve the dense or sparse linear system of equations in bioinformatics, power system and computer vision. Matrix decompositions are computationally expensive and their sequential implementations often fail to meet the requirements of many time-sensitive applications. The emergence of reconfigurable computing has provided a flexible and low-cost opportunity to pursue high-performance parallel designs, and the use of FPGAs has shown promise in accelerating this class of computation. In this research, we have proposed and implemented several highly parallel FPGA-based architectures to accelerate matrix decompositions and their applications in data mining and signal processing. Specifically, in this dissertation we describe the following contributions: • We propose an efficient FPGA-based double-precision floating-point architecture for EVD, which can efficiently analyze large-scale matrices. • We implement a floating-point Hestenes-Jacobi architecture for SVD, which is capable of analyzing arbitrary sized matrices. • We introduce a novel deeply pipelined reconfigurable architecture for QRD, which can be dynamically configured to perform either Householder transformation or Givens rotation in a manner that takes advantage of the strengths of each. • We design a configurable architecture for sparse LUD that supports both symmetric and asymmetric sparse matrices with arbitrary sparsity patterns. • By further extending the proposed hardware solution for SVD, we parallelize a popular text mining tool-Latent Semantic Indexing with an FPGA-based architecture. • We present a configurable architecture to accelerate Homotopy l1-minimization, in which the modification of the proposed FPGA architecture for sparse LUD is used at its core to parallelize both Cholesky decomposition and rank-1 update. Our experimental results using an FPGA-based acceleration system indicate the efficiency of our proposed novel architectures, with application and dimension-dependent speedups over an optimized software implementation that range from 1.5ÃÂ to 43.6ÃÂ in terms of computation time

Digital Repository @ Iowa State University (ISU)

Accelerating molecular dynamics simulations with configurable circuits

Author: Allen
Athanassoula
Azizi
Bargiel
Beck
Brunner
Butts
Chang
DeMarco
DeMarco
Friesner
Fukushige
Hamada
Hamada
Hansson
Kawai
Kuberka
M.C. Herbordt
Phillips
Sagui
Spurzem
T. VanCourt
Taiji
Taufer
Tredennick
VanCourt
VanCourt
Y. Gu
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 01/01/2006
Field of study

Crossref

FPGA acceleration of sequence analysis tools in bioinformatics

Author: Mahram Atabak
Publication venue: Boston University
Publication date: 01/01/2013
Field of study

Thesis (Ph.D.)--Boston UniversityWith advances in biotechnology and computing power, biological data are being produced at an exceptional rate. The purpose of this study is to analyze the application of FPGAs to accelerate high impact production biosequence analysis tools. Compared with other alternatives, FPGAs offer huge compute power, lower power consumption, and reasonable flexibility. BLAST has become the de facto standard in bioinformatic approximate string matching and so its acceleration is of fundamental importance. It is a complex highly-optimized system, consisting of tens of thousands of lines of code and a large number of heuristics. Our idea is to emulate the main phases of its algorithm on FPGA. Utilizing our FPGA engine, we quickly reduce the size of the database to a small fraction, and then use the original code to process the query. Using a standard FPGA-based system, we achieved 12x speedup over a highly optimized multithread reference code. Multiple Sequence Alignment (MSA)--the extension of pairwise Sequence Alignment to multiple Sequences--is critical to solve many biological problems. Previous attempts to accelerate Clustal-W, the most commonly used MSA code, have directly mapped a portion of the code to the FPGA. We use a new approach: we apply prefiltering of the kind commonly used in BLAST to perform the initial all-pairs alignments. This results in a speedup of from 8Ox to 190x over the CPU code (8 cores). The quality is comparable to the original according to a commonly used benchmark suite evaluated with respect to multiple distance metrics. The challenge in FPGA-based acceleration is finding a suitable application mapping. Unfortunately many software heuristics do not fall into this category and so other methods must be applied. One is restructuring: an entirely new algorithm is applied. Another is to analyze application utilization and develop accuracy/performance tradeoffs. Using our prefiltering approach and novel FPGA programming models we have achieved significant speedup over reference programs. We have applied approximation, seeding, and filtering to this end. The bulk of this study is to introduce the pros and cons of these acceleration models for biosequence analysis tools

Boston University Institutional Repository (OpenBU)

FPGA acceleration of DNA sequence alignment: design analysis and optimization

Author: Ng Ho-Cheung
Publication venue: Computing, Imperial College London
Publication date: 01/01/2022
Field of study

Existing FPGA accelerators for short read mapping often fail to utilize the complete biological information in sequencing data for simple hardware design, leading to missed or incorrect alignment. In this work, we propose a runtime reconfigurable alignment pipeline that considers all information in sequencing data for the biologically accurate acceleration of short read mapping. We focus our efforts on accelerating two string matching techniques: FM-index and the Smith-Waterman algorithm with the affine-gap model which are commonly used in short read mapping. We further optimize the FPGA hardware using a design analyzer and merger to improve alignment performance. The contributions of this work are as follows. 1. We accelerate the exact-match and mismatch alignment by leveraging the FM-index technique. We optimize memory access by compressing the data structure and interleaving the access with multiple short reads. The FM-index hardware also considers complete information in the read data to maximize accuracy. 2. We propose a seed-and-extend model to accelerate alignment with indels. The FM-index hardware is extended to support the seeding stage while a Smith-Waterman implementation with the affine-gap model is developed on FPGA for the extension stage. This model can improve the efficiency of indel alignment with comparable accuracy versus state-of-the-art software. 3. We present an approach for merging multiple FPGA designs into a single hardware design, so that multiple place-and-route tasks can be replaced by a single task to speed up functional evaluation of designs. We first experiment with this approach to demonstrate its feasibility for different designs. Then we apply this approach to optimize one of the proposed FPGA aligners for better alignment performance.Open Acces

Spiral - Imperial College Digital Repository

Energy Efficiency of Sequence Alignment Tools - Software and Hardware Perspectives

Author: Griessl René
Hagemeyer Jens
Kierzynka Michal
Kosmann Lars
Krupop Stefan
Oleksiak Ariel
Peykanu Meysam
vor dem Berge Micha
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016
Field of study

Kierzynka M, Kosmann L, vor dem Berge M, et al. Energy Efficiency of Sequence Alignment Tools - Software and Hardware Perspectives. Future Generation Computer Systems. 2016;67:455-465

Publications at Bielefeld University