95 research outputs found

    QuASeR -- Quantum Accelerated De Novo DNA Sequence Reconstruction

    In this article, we present QuASeR, a reference-free DNA sequence reconstruction implementation via de novo assembly on both gate-based and quantum annealing platforms. Each of the four steps of the implementation (TSP, QUBO, Hamiltonians and QAOA) is explained with simple proof-of-concept examples, so that the article is self-contained for both the genomics research community and quantum application developers. The details of the implementation are discussed for the various layers of the quantum full-stack accelerator design. We also highlight the limitations of current classical simulation and available quantum hardware systems. The implementation is open-source and is available at https://github.com/prince-ph0en1x/QuASeR. Comment: 24 pages
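The TSP and QUBO steps named in the abstract can be illustrated with a small sketch. It is not taken from the QuASeR repository: the three reads, the open-path (rather than closed-tour) formulation, the penalty choice, and all function names are assumptions made for illustration, and an exhaustive search stands in for the annealer or QAOA.

```python
from itertools import permutations, product

def overlap(a, b):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def build_qubo(reads):
    """Build a QUBO whose minimum encodes the read order with maximal overlap.

    Binary variable index i*n + t means "read i sits at position t".
    """
    n = len(reads)
    ov = {(i, j): overlap(reads[i], reads[j])
          for i, j in permutations(range(n), 2)}
    # Assumed penalty: larger than any total overlap reward, so every
    # constraint violation costs more than it could possibly gain.
    penalty = 1 + (n - 1) * sum(ov.values())
    Q = {}

    def add(u, v, w):
        key = (min(u, v), max(u, v))
        Q[key] = Q.get(key, 0) + w

    # Reward: read i at position t directly followed by read j at t + 1.
    for (i, j), w in ov.items():
        for t in range(n - 1):
            add(i * n + t, j * n + t + 1, -w)
    # One-hot constraints, expanded from penalty * (sum(x) - 1)**2 with the
    # constant dropped: each read once, each position once.
    for a in range(n):
        for b in range(n):
            add(a * n + b, a * n + b, -2 * penalty)
            for c in range(b + 1, n):
                add(a * n + b, a * n + c, 2 * penalty)  # read a in two positions
                add(b * n + a, c * n + a, 2 * penalty)  # position a holds two reads
    return Q

def solve_brute_force(Q, n):
    """Exhaustively minimise the QUBO (classical stand-in for a quantum solver)."""
    best_x, best_e = None, None
    for x in product((0, 1), repeat=n * n):
        e = sum(w * x[u] * x[v] for (u, v), w in Q.items())
        if best_e is None or e < best_e:
            best_x, best_e = x, e
    return [next(i for i in range(n) if best_x[i * n + t]) for t in range(n)]

def assemble(reads, order):
    """Merge reads along `order`, collapsing each suffix-prefix overlap."""
    seq = reads[order[0]]
    for prev, nxt in zip(order, order[1:]):
        seq += reads[nxt][overlap(reads[prev], reads[nxt]):]
    return seq

reads = ["ATGGC", "GGCGT", "CGTGC"]  # hypothetical toy reads
order = solve_brute_force(build_qubo(reads), len(reads))
print(order, assemble(reads, order))
```

The search visits 2^(n^2) bit strings, so it is only usable for a handful of reads; that blow-up is the kind of classical simulation limit the abstract mentions.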

    DOPA: GPU-based protein alignment using database and memory access optimizations

    Background: The Smith-Waterman (S-W) algorithm is an optimal sequence alignment method for biological databases, but its computational complexity makes it too slow for practical purposes. Heuristic-based approximate methods like FASTA and BLAST provide faster solutions, but at the cost of reduced accuracy. In addition, the expanding volume and varying lengths of sequences necessitate a performance-efficient restructuring of these databases. An accurate and fast solution therefore requires speeding up the S-W algorithm itself. Findings: This paper presents a high-performance protein sequence alignment implementation for Graphics Processing Units (GPUs). The new implementation improves performance by optimizing the database organization and reducing the number of memory accesses to eliminate bandwidth bottlenecks. The implementation is called Database Optimized Protein Alignment (DOPA) and it achieves a performance of 21.4 Giga Cell Updates Per Second (GCUPS), which is 1.13 times better than the fastest GPU implementation to date. Conclusions: In the new GPU-based implementation for protein sequence alignment (DOPA), the database is organized in sets of equal-length sequences. This distributes the workload equally among all the threads on the GPU's multiprocessors. The result is better performance than the fastest available GPU implementation.
    Microelectronics; Electrical Engineering, Mathematics and Computer Science
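For readers unfamiliar with the S-W recurrence and the cell-update count behind the GCUPS metric, a minimal CPU sketch follows. It is not DOPA's GPU kernel: the match/mismatch scores and the linear gap penalty are simplifying assumptions (protein aligners normally use a substitution matrix such as BLOSUM and affine gaps), and the function names are hypothetical.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Best local-alignment score under an assumed linear gap penalty."""
    prev = [0] * (len(target) + 1)  # previous row of the DP matrix
    best = 0
    for i in range(1, len(query) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            sub = match if query[i - 1] == target[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment
                prev[j - 1] + sub,   # align query[i-1] with target[j-1]
                prev[j] + gap,       # gap in the target
                curr[j - 1] + gap,   # gap in the query
            )
            best = max(best, curr[j])
        prev = curr
    return best

def cell_updates(query, database):
    """Number of DP cells filled when aligning `query` against every sequence."""
    return sum(len(query) * len(seq) for seq in database)

print(smith_waterman_score("ACGT", "TTACGTTT"))  # 8: four matches at +2 each
print(cell_updates("ACGT", ["AC", "ACGTAC"]))    # 32 cells
```

Dividing the cell-update count by the runtime in seconds, then by 10^9, gives GCUPS. The count also shows why grouping sequences of equal length matters: threads working on sequences of the same length fill the same number of cells and finish together.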

    Benchmarking Apache Arrow Flight -- A wire-speed protocol for data transfer, querying and microservices

    Moving structured data between different big data frameworks and/or data warehouses/storage systems often causes significant overhead. Typically, more than 80% of the total data-access time is spent in the serialization/deserialization step. Columnar data formats are gaining popularity in both analytics and transactional databases. Apache Arrow, a unified columnar in-memory data format, promises to provide efficient data storage, access, manipulation and transport. In addition, with the introduction of the Arrow Flight communication capabilities, which are built on top of gRPC, Arrow enables high-performance data transfer over TCP networks. Arrow Flight allows parallel Arrow RecordBatch transfer over networks in a platform- and language-independent way, and offers high performance, parallelism and security based on open-source standards. In this paper, we bring together some recently implemented use cases of Arrow Flight with their benchmarking results. These use cases include bulk Arrow data transfer, querying subsystems and Flight as a microservice integrated into different frameworks, and they show the throughput and scalability of this protocol. We show that Flight is able to achieve up to 6000 MB/s and 4800 MB/s throughput for DoGet() and DoPut() operations respectively. On Mellanox ConnectX-3 or Connect-IB interconnect nodes, Flight can utilize up to 95% of the total available bandwidth. Flight is scalable and can use up to half of the available system cores efficiently for bidirectional communication. For query systems like Dremio, Flight is an order of magnitude faster than the ODBC and turbodbc protocols. An Arrow Flight based implementation on Dremio performs 20x and 30x better than turbodbc and ODBC connections respectively.

    DFL: High-Performance Blockchain-Based Federated Learning

    Many researchers are trying to replace the aggregation server in federated learning with a blockchain system to achieve better privacy, robustness and scalability. In this case, clients upload their updated models to the blockchain ledger and use a smart contract on the blockchain system to perform model averaging. However, running machine learning applications on the blockchain is almost impossible, because a blockchain system, which usually takes over half a minute to generate a block, is too slow to support them. This paper proposes a completely new public blockchain architecture called DFL, which is specially optimized for distributed federated machine learning. This architecture inherits most traditional blockchain merits and achieves extremely high performance with low resource consumption by waiving global consensus. To characterize the performance and robustness of our architecture, we implement it as a prototype and test it on a physical four-node network. To test more nodes and more complex situations, we build a simulator to simulate the network. The LeNet results indicate that our system can reach over 90% accuracy for non-I.I.D. datasets even while facing model poisoning attacks, with the blockchain consuming less than 5% of hardware resources. Comment: 11 pages, 17 figures
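The model-averaging step that the abstract assigns to the aggregation server (or to a smart contract) can be sketched as follows. This is a generic sample-weighted average in the style of FedAvg, assumed here for illustration; the abstract does not specify the averaging rule DFL uses, and the function name and flat-list model representation are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Average client models, weighting each by its local sample count.

    client_weights: one flat list of parameters per client (assumed layout).
    client_sizes:   number of local training samples per client.
    """
    if len(client_weights) != len(client_sizes):
        raise ValueError("one sample count is required per client model")
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(n_params)
    ]

# Two hypothetical clients: the second holds three times as much data,
# so the average is pulled toward its parameters.
global_model = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_model)  # [2.5, 3.5]
```

A plain average like this has no defence against a poisoned update, which is why robustness against model poisoning is evaluated separately in the paper.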