65 research outputs found

FPGA acceleration of DNA sequencing analysis and storage

Author: Arram James
Publication venue: Computing, Imperial College London
Publication date: 01/01/2018

In this work we explore how Field-Programmable Gate Arrays (FPGAs) can be used to alleviate the data processing bottlenecks in DNA sequencing. We focus our efforts on accelerating the FM-index, a data structure used to solve the computationally intensive string matching problems found in DNA sequencing analysis such as short read alignment. The main contributions of this work are: 1) We accelerate the FM-index using FPGAs and develop several novel methods for reducing the memory bottleneck of the search algorithm. These methods include customising the FM-index structure according to the memory architecture of the FPGA platform and minimising the number of memory accesses through both architectural and algorithmic optimisations. 2) We present a new approach for accelerating approximate string matching using the backtracking FM-index. This approach makes use of specialised approximate string matching modules and a run-time reconfigurable architecture in order to achieve both high sensitivity and high performance. 3) We extend the FM-index search algorithm for reference-based compression and accelerate it using FPGAs. This accelerated design is integrated into fastqZip and fastaZip, two new tools that we have developed for the fast and effective compression of sequence data stored in the FASTQ and FASTA formats respectively. We implement our designs on the Maxeler Max4 Platform and show that they are able to outperform state-of-the-art DNA sequencing analysis software. For instance, our hardware-accelerated compression tool for FASTQ data is able to achieve a higher compression ratio than the best performing tool, fastqz, whilst the average compression and decompression speeds are 25 and 43 times faster respectively.
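For readers unfamiliar with the FM-index mentioned in this abstract, the following is a minimal software sketch of exact matching by backward search. It is a textbook illustration only, not the thesis's FPGA design: the function names `build_fm_index` and `backward_search` are hypothetical, the suffix array is built naively, and the occurrence table is stored uncompressed, which would not scale to a real genome.

```python
def build_fm_index(text):
    """Build a toy FM-index for `text`; a terminating '$' is appended."""
    text += "$"
    # Naive suffix array construction; acceptable only for short strings.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c]: number of characters in the text that sort strictly before c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return sa, C, occ


def backward_search(pattern, sa, C, occ):
    """Return the sorted start positions of exact matches of `pattern`."""
    lo, hi = 0, len(sa)  # half-open suffix-array interval [lo, hi)
    for ch in reversed(pattern):
        if ch not in C:
            return []
        # Each step costs two occurrence-table lookups.
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


sa, C, occ = build_fm_index("ACGTACGT")
print(backward_search("ACG", sa, C, occ))
```

Each pattern character triggers two table lookups at data-dependent positions, which illustrates why memory access is the bottleneck the abstract refers to.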

Spiral - Imperial College Digital Repository

FPGA acceleration of reference-based compression for genomic data

Author: Arram J
Kaplan T
Luk W
Pflanzer M
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/12/2015

One of the key challenges facing genomics today is efficiently storing the massive amounts of data generated by next-generation sequencing platforms. Reference-based compression is a popular strategy for reducing the size of genomic data, whereby sequence information is encoded as a mapping to a known reference sequence. Determining the mapping is a computationally intensive problem, and is the bottleneck of most reference-based compression tools currently available. This paper presents the first FPGA acceleration of reference-based compression for genomic data. We develop a new mapping algorithm based on the FM-index search operation which includes optimisations targeting the compression ratio and speed. Our hardware design is implemented on a Maxeler MPC-X2000 node comprising 8 Altera Stratix V FPGAs. When evaluated against compression tools currently available, our tool achieves a superior compression ratio, compression time, and energy consumption for both FASTA and FASTQ formats. For example, our tool achieves a 30% higher compression ratio and is 71.9 times faster than the fastqz tool.
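The idea of encoding sequence information as a mapping to a reference can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the paper's mapping algorithm: the names `encode_reads` and `decode_reads` and the record layout are hypothetical, only exact matches are mapped, and Python's built-in `str.find` stands in for the FM-index search.

```python
def encode_reads(reference, reads):
    """Encode each read as (position, length) in the reference when possible."""
    encoded = []
    for read in reads:
        pos = reference.find(read)  # stand-in for an FM-index search
        if pos >= 0:
            # The read is fully described by where it maps and how long it is.
            encoded.append(("match", pos, len(read)))
        else:
            # No exact match: fall back to storing the read verbatim.
            encoded.append(("literal", read))
    return encoded


def decode_reads(reference, encoded):
    """Reconstruct the original reads from the encoded records."""
    reads = []
    for record in encoded:
        if record[0] == "match":
            _, pos, length = record
            reads.append(reference[pos:pos + length])
        else:
            reads.append(record[1])
    return reads


reference = "ACGTACGTTTGA"
reads = ["GTAC", "TTGA", "NNNN"]
encoded = encode_reads(reference, reads)
assert decode_reads(reference, encoded) == reads
```

A practical tool would also need to handle mismatches, quality scores (for FASTQ), and entropy coding of the records, none of which is shown here.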

Crossref

Spiral - Imperial College Digital Repository

Molecular epidemiology of Candida spp. isolates obtained from pediatric patients with candidemia [Epidemiologia molecular de isolados de Candida spp. obtidos de pacientes pediátricos com candidemia]

Author: Arram Sohaila Boehm Ibrahim
Publication date: 01/04/2010

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositório Digital Institucional da UFPR

Universidade Federal do Paraná

Spam detection using hybrid of artificial neural network and genetic algorithm

Author: Arram Anas W. A.
Publication date: 01/06/2013

Spam detection is a significant problem that many researchers have addressed with a variety of strategies. In this study, performance is measured by classification accuracy, taking into account false positives and false negatives. These metrics were used to evaluate and compare two supervised learning approaches for classifying e-mail spam content: a hybrid of an Artificial Neural Network (ANN) and a Genetic Algorithm (GA), and a Support Vector Machine (SVM). The study proposes the hybrid ANN-GA approach as an effective way to detect spam. Comparisons were made between the classical ANN and the improved ANN-GA, and between ANN-GA and SVM, to show which algorithm performs best in spam detection. The algorithms were trained and tested on a set of 4601 e-mails, of which 1813 were spam and 2788 were non-spam. The proposed ANN-GA technique gave better results than the classical ANN and SVM techniques: ANN-GA achieved 93.71% accuracy, the classical ANN 92.08%, and SVM the lowest at 79.82%. The experimental results suggest that the proposed ANN-GA model is promising, and the study provides a new method for efficiently training an ANN for spam detection.

Universiti Teknologi Malaysia Institutional Repository

The question of competition in Physical Education classes from 1st to 4th grade [A questão da competição nas aulas de Educação Física de 1ª à 4ª série]

Author: Arram Siham Boehm Ibrahim
Publication date: 01/01/1994

Advisor: Paulo Air Micoski. Monograph (licentiate degree) - Universidade Federal do Paraná, Biological Sciences Sector, Physical Education program.

Repositório Digital Institucional da UFPR

Universidade Federal do Paraná

Inherited causes of combined vision and hearing loss: clinical features and molecular genetics

Author: Arram Elizabeth
Georgiou Michalis
Guimaraes Thales Antonio Cabral de
Michaelides Michel
Shakarchi Ahmed F
Publication venue: BMJ Publishing Group
Publication date: 26/09/2022

Combined vision and hearing loss, also known as dual sensory impairment, can occur in several genetic conditions, including ciliopathies such as Usher and Bardet-Biedl syndrome, mitochondrial DNA disorders and systemic diseases, such as CHARGE, Stickler, Waardenburg, Alport and Alström syndrome. The retinal phenotype may point to the diagnosis of such disorders. Herein, we aim to provide a comprehensive review of the molecular genetics and clinical features of the most common non-chromosomal inherited disorders to cause dual sensory impairment.

UCL Discovery

Reconfigurable acceleration of genetic sequence alignment: A survey of two decades of efforts

Author: abelsson
arram
buhler
burrows
court
cret
draghicescu
ferragina
jacobi
li
li
lin
preußer
preußer
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 04/09/2017

Genetic sequence alignment has always been a computational challenge in bioinformatics. Depending on the problem size, software-based aligners can take multiple CPU-days to process the sequence data, creating a bottleneck in the bioinformatic analysis flow. Reconfigurable accelerators can achieve high performance for such computation by providing massive parallelism, but at the expense of programming flexibility, and thus have not been commensurately used by practitioners. Therefore, this paper aims to provide a thorough survey of the proposed accelerators by giving a qualitative categorization based on their algorithms and speedup. A comprehensive comparison between works is also presented so as to guide selection for biologists, and to provide insight into future research directions for FPGA scientists.

Crossref

Spiral - Imperial College Digital Repository

FPGA Acceleration of Reference-Based Compression for Genomic Data

Author: James Arram
Moritz Pflanzer
Thomas Kaplan
Wayne Luk
Publication date: 23/04/2020

One of the key challenges facing genomics today is efficiently storing the massive amounts of data generated by next-generation sequencing platforms. Reference-based compression is a popular strategy for reducing the size of genomic data, whereby sequence information is encoded as a mapping to a known reference sequence. Determining the mapping is a computationally intensive problem, and is the bottleneck of most reference-based compression tools currently available. This paper presents the first FPGA acceleration of reference-based compression for genomic data. We develop a new mapping algorithm based on the FM-index search operation which includes optimisations targeting the compression ratio and speed. Our hardware design is implemented on a Maxeler MPC-X2000 node comprising 8 Altera Stratix V FPGAs. When evaluated against compression tools currently available, our tool achieves a superior compression ratio, compression time, and energy consumption for both FASTA and FASTQ formats. For example, our tool achieves a 30% higher compression ratio and is 71.9 times faster than the fastqz tool.

CiteSeerX

Breast cancer diagnosis using the fast learning network algorithm

Author: Anas Arram
Fahad Taha AL-Dhief
Masri Ayob
Musatafa Abbas Abbood Albadr
Sabrina Tiun
Sura Khalaf
Publication venue: 'Frontiers Media SA'
Publication date: 01/04/2023

The use of machine learning (ML) and data mining algorithms in the diagnosis of breast cancer (BC) has recently received a lot of attention. The majority of these efforts, however, still require improvement since either they were not statistically evaluated or they were evaluated using insufficient assessment metrics, or both. One of the most recent and effective ML algorithms, the fast learning network (FLN), may be seen as a reputable and efficient approach for classifying data; however, it has not been applied to the problem of BC diagnosis. Therefore, this study proposes the FLN algorithm in order to improve the accuracy of BC diagnosis. The FLN algorithm has the capability to a) eliminate overfitting, b) solve the issues of both binary and multiclass classification, and c) perform like a kernel-based support vector machine with the structure of a neural network. In this study, two BC databases (Wisconsin Breast Cancer Database (WBCD) and Wisconsin Diagnostic Breast Cancer (WDBC)) were used to assess the performance of the FLN algorithm. The results of the experiment demonstrated the strong performance of the suggested FLN method, which achieved an average accuracy of 98.37%, precision of 95.94%, recall of 99.40%, F-measure of 97.64%, G-mean of 97.65%, MCC of 96.44%, and specificity of 97.85% using the WBCD, and an average accuracy of 96.88%, precision of 94.84%, recall of 96.81%, F-measure of 95.80%, G-mean of 95.81%, MCC of 93.35%, and specificity of 96.96% using the WDBC database. This suggests that the FLN algorithm is a reliable classifier for diagnosing BC and may be useful for resolving other application-related problems in the healthcare sector.
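The evaluation metrics named in this abstract can all be derived from the four cells of a binary confusion matrix. The sketch below uses the standard definitions and is not taken from the paper; the function name `classification_metrics` is hypothetical. The G-mean is computed here as the geometric mean of precision and recall, which is an assumption: it is consistent with the WBCD figures quoted above (the square root of 0.9594 × 0.9940 is about 0.9765), but some authors define it from recall and specificity instead. The code assumes every denominator is non-zero.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    # Assumed definition: geometric mean of precision and recall.
    g_mean = math.sqrt(precision * recall)
    # Matthews correlation coefficient.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_measure": f_measure,
        "g_mean": g_mean,
        "mcc": mcc,
    }


# Hypothetical counts chosen only for illustration.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```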

Directory of Open Access Journals

Hardware acceleration of genomics data analysis: challenges and opportunities

Author: Abdallah
Al Kawam
Al-Absi
Alser
Alser
Altschul
Angerer
Antipov
Arram
Arram
Audano
Ayling
Bahrebar
Banerjee
Bao
Bao
Barron
Behjati
Bohannan
Brittain
Broad Institute
Broad Institute
Cardon
Carrillo
Carrillo
Challis
Chen
Chen
Ciccolella
Cingolani
Clark
Croville
Das
Denti
Doan
Dobin
Du
Fei
Fleckhaus
Fonseca
Genome Research Ltd
Ghurye
Golosova
Goodwin
Goyal
Gök
Hackl
Hasnain
Houtgast
Hu
Illumina Inc
Jackson
Javed
Joardar
Joshi
Jourdren
Kaplan
Kent
Kim
Kim
Kosuri
Langmead
Langmead
Langmead
Lesk
Li
Li
Li
Li
Li
Li
Li
Lightbody
Lightbody
Liu
Liu
Liu
Lv
Margulies
Maruyama
Mcvicar
Milward
Muir
NCBI
Niedringhaus
Nsame
Orth
Oxford Nanopore Technologies
Park
Patel
Payne
Peddie
Rizzo
Robinson
Sarkar
Sboner
Schatz
Shang
Shang
Sharifi
Subbulakshmi
Sundfeld
Tian
Tsai
Turakhia
Turakhia
Wang
Wang
Ward
Xilinx
Yano
Zaharia
Zokaee
Publication venue: 'Oxford University Press (OUP)'
Publication date: 25/05/2021

Crossref

Ulster University's Research Portal