95 research outputs found

QuASeR -- Quantum Accelerated De Novo DNA Sequence Reconstruction

Authors: Zaid Al-Ars, Koen Bertels, Aritra Sarkar
Publication venue: Public Library of Science (PLoS)
Publication date: 10/04/2020

In this article, we present QuASeR, a reference-free DNA sequence reconstruction implementation via de novo assembly on both gate-based and quantum annealing platforms. Each of the four steps of the implementation (TSP, QUBO, Hamiltonians and QAOA) is explained with simple proof-of-concept examples to target both the genomics research community and quantum application developers in a self-contained manner. The details of the implementation are discussed for the various layers of the quantum full-stack accelerator design. We also highlight the limitations of current classical simulation and available quantum hardware systems. The implementation is open-source and can be found at https://github.com/prince-ph0en1x/QuASeR.

Comment: 24 pages
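The TSP-to-QUBO step named in the QuASeR abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the reads, the `penalty` value, the open-path (rather than closed-tour) formulation, and all function names are assumptions chosen for illustration, and an exhaustive search stands in for the annealer or QAOA solver.

```python
from itertools import product

def overlap(a, b):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def build_qubo(reads, penalty):
    """QUBO over binary variables x[i, p] = 1 if read i sits at position p."""
    n = len(reads)
    Q = {}

    def add(u, v, w):
        key = (min(u, v), max(u, v))
        Q[key] = Q.get(key, 0) + w

    var = lambda i, p: i * n + p
    for i in range(n):
        for p in range(n):
            # Linear terms from expanding the two one-hot constraints.
            add(var(i, p), var(i, p), -2 * penalty)
            for q in range(p + 1, n):
                add(var(i, p), var(i, q), 2 * penalty)  # read i placed twice
                add(var(p, i), var(q, i), 2 * penalty)  # position i used twice
    for i in range(n):
        for j in range(n):
            if i != j:
                w = overlap(reads[i], reads[j])
                for p in range(n - 1):
                    add(var(i, p), var(j, p + 1), -w)  # reward adjacent overlap
    return Q

def solve_brute_force(Q, n):
    """Exhaustive minimization; only feasible for toy sizes (2**(n*n) states)."""
    best = min(product((0, 1), repeat=n * n),
               key=lambda x: sum(w * x[u] * x[v] for (u, v), w in Q.items()))
    return [next(i for i in range(n) if best[i * n + p]) for p in range(n)]

def assemble(reads, order):
    """Merge reads in the given order, dropping the overlapping prefix."""
    seq = reads[order[0]]
    for prev, cur in zip(order, order[1:]):
        seq += reads[cur][overlap(reads[prev], reads[cur]):]
    return seq

# Hypothetical reads cut from the toy sequence ATGGCGTGCA.
reads = ["ATGGC", "GGCGT", "GTGCA"]
order = solve_brute_force(build_qubo(reads, penalty=10), len(reads))
assembled = assemble(reads, order)
```

The penalty is set larger than any possible overlap so that violating a one-read-per-position constraint always costs more than the overlap it could gain.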

arXiv.org e-Print Archive

Directory of Open Access Journals

An Overview of Hardware-Based Acceleration of Biological Sequence Alignment

Authors: Laiq Hasan, Zaid Al-Ars
Publication venue: IntechOpen
Publication date: 02/09/2011

IntechOpen

DOPA: GPU-based protein alignment using database and memory access optimizations

Authors: Zaid Al-Ars, Laiq Hasan, Marijn Kentie
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The Smith-Waterman (S-W) algorithm is an optimal sequence alignment method for biological databases, but its computational complexity makes it too slow for practical purposes. Heuristic-based approximate methods like FASTA and BLAST provide faster solutions, but at the cost of reduced accuracy. Also, the expanding volume and varying lengths of sequences necessitate performance-efficient restructuring of these databases. Thus, to arrive at an accurate and fast solution, it is highly desirable to speed up the S-W algorithm.

Findings: This paper presents a high-performance protein sequence alignment implementation for Graphics Processing Units (GPUs). The new implementation improves performance by optimizing the database organization and reducing the number of memory accesses to eliminate bandwidth bottlenecks. The implementation is called Database Optimized Protein Alignment (DOPA) and it achieves a performance of 21.4 Giga Cell Updates Per Second (GCUPS), which is 1.13 times better than the fastest GPU implementation to date.

Conclusions: In the new GPU-based implementation for protein sequence alignment (DOPA), the database is organized in equal-length sequence sets. This distributes the workload equally among all the threads on the GPU's multiprocessors. The result is improved performance, better than that of the fastest available GPU implementation.

Microelectronics; Electrical Engineering, Mathematics and Computer Science
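For readers unfamiliar with the S-W recurrence and the GCUPS metric used in the DOPA abstract, the following is a minimal CPU sketch. It is not the DOPA GPU kernel: the simple match/mismatch scores and linear gap penalty are assumptions made for brevity (protein alignment would normally use a substitution matrix), and the function names are hypothetical.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-1):
    """Best local alignment score with a linear gap penalty."""
    prev = [0] * (len(target) + 1)  # previous row of the scoring matrix
    best = 0
    for q in query:
        cur = [0]
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best

def gcups(query_len, database_residues, seconds):
    """Giga cell updates per second: matrix cells filled per second / 1e9."""
    return query_len * database_residues / seconds / 1e9
```

Each query residue is compared against every database residue, so the number of cell updates is the product of the two lengths; this is why aligning against a large database is expensive and why throughput is reported in GCUPS.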

TU Delft Repository

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Benchmarking Apache Arrow Flight -- A wire-speed protocol for data transfer, querying and microservices

Authors: Tanveer Ahmad, Zaid Al-Ars, H. Peter Hofstee
Publication date: 01/01/2022

Moving structured data between different big data frameworks and/or data warehouses/storage systems often causes significant overhead. Most of the time, more than 80% of the total time spent accessing data goes to the serialization/deserialization step. Columnar data formats are gaining popularity in both analytics and transactional databases. Apache Arrow, a unified columnar in-memory data format, promises to provide efficient data storage, access, manipulation and transport. In addition, with the introduction of the Arrow Flight communication capabilities, which are built on top of gRPC, Arrow enables high-performance data transfer over TCP networks. Arrow Flight allows parallel Arrow RecordBatch transfer over networks in a platform- and language-independent way, and offers high performance, parallelism and security based on open-source standards. In this paper, we bring together some recently implemented use cases of Arrow Flight with their benchmarking results. These use cases include bulk Arrow data transfer, querying subsystems and Flight as a microservice integration into different frameworks to show the throughput and scalability results of this protocol. We show that Flight is able to achieve up to 6000 MB/s and 4800 MB/s throughput for DoGet() and DoPut() operations respectively. On Mellanox ConnectX-3 or Connect-IB interconnect nodes, Flight can utilize up to 95% of the total available bandwidth. Flight is scalable and can use up to half of the available system cores efficiently for bidirectional communication. For query systems like Dremio, Flight is an order of magnitude faster than the ODBC and turbodbc protocols. The Arrow Flight based implementation on Dremio performs 20x and 30x better than turbodbc and ODBC connections respectively.

arXiv.org e-Print Archive

TU Delft Repository

DFL: High-Performance Blockchain-Based Federated Learning

Authors: Zaid Al-Ars, Zhuoran Guo, Yongding Tian, Jiaxuan Zhang
Publication date: 28/10/2021

Many researchers are trying to replace the aggregation server in federated learning with a blockchain system to achieve better privacy, robustness and scalability. In this case, clients upload their updated models to the blockchain ledger and use a smart contract on the blockchain system to perform model averaging. However, running machine learning applications on the blockchain is almost impossible because a blockchain system, which usually takes over half a minute to generate a block, is extremely slow and unable to support machine learning applications. This paper proposes a completely new public blockchain architecture called DFL, which is specially optimized for distributed federated machine learning. This architecture inherits most traditional blockchain merits and achieves extremely high performance with low resource consumption by waiving global consensus. To characterize the performance and robustness of our architecture, we implement the architecture as a prototype and test it on a physical four-node network. To test more nodes and more complex situations, we build a simulator to simulate the network. The LeNet results indicate our system can reach over 90% accuracy for non-I.I.D. datasets even while facing model poisoning attacks, with the blockchain consuming less than 5% of hardware resources.

Comment: 11 pages, 17 figures
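The model averaging step that the abstract describes as being delegated to a smart contract can be sketched generically as follows. This is the common sample-weighted federated averaging rule, shown as an assumption; the abstract does not specify DFL's own aggregation scheme, and `federated_average` and its parameters are hypothetical names.

```python
def federated_average(client_weights, sample_counts):
    """Average client model parameters, weighted by local dataset size.

    client_weights: one flat list of floats per client (same length each).
    sample_counts:  number of training samples each client used.
    """
    total = sum(sample_counts)
    dim = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, sample_counts)) / total
        for k in range(dim)
    ]

# Two hypothetical clients; the second trained on three times as much data.
averaged = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Weighting by sample count lets clients with more data pull the aggregate further, which is one reason non-I.I.D. data and poisoned updates are harder to handle than in centralized training.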

arXiv.org e-Print Archive