
    Comparison of CPU and Parabricks GPU Enabled Bioinformatics Software for High Throughput Clinical Genomic Applications

    In recent years, high performance computing (HPC) has begun to revolutionize the architecture of software and servers to meet the ever-increasing demand for speed and efficiency. One of the ways this change is manifesting is the adoption of graphics processing units (GPUs). Used correctly, GPUs can increase throughput and decrease compute time for certain computational problems. Bioinformatics, an HPC-dependent discipline, is no exception. As bioinformatics continues to advance clinical care by sequencing patients’ DNA and RNA for the diagnosis of disease, there is an ever-increasing demand for faster data processing to improve clinical sequencing turnaround time. Parabricks, a GPU-enabled bioinformatics software suite, is one of the leaders in ‘lifting over’ common CPU bioinformatics tools to GPU architectures. In the present study, bioinformatics pipelines built with Parabricks GPU-enabled software are compared with standard CPU bioinformatics software. Pipeline results and run performance comparisons are performed to show the impact this technology change can have for a medium-sized computational cluster. The present study finds that Parabricks’ GPU workflows show a massive increase in overall efficiency, cutting overall run time by roughly 21x and the overall computational hours needed by 650x. Parabricks GPU workflows show a 99.5% variant call concordance rate when compared to clinically validated CPU workflows. Substituting Parabricks GPU alignment into a clinically validated CPU-based pipeline reduces the number of compute hours from 836 to 727 and returns the same results, showing CPUs and GPUs can be used together to reduce pipeline turnaround time and compute resource burden. Overall, integration of GPUs into bioinformatic pipelines leads to a massive reduction in turnaround time, reduced computation times, and increased throughput, with little to no sacrifice in overall output quality. The findings of this study show that GPU-based bioinformatic workflows, like Parabricks, could greatly improve whole genome sequencing accessibility for clinical use by reducing testing turnaround time.
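
    The reported 99.5% concordance invites a concrete illustration. Below is a minimal sketch, not taken from the paper, of one plausible way to score variant-call concordance between two pipelines' call sets; production comparisons typically use dedicated tools such as hap.py or vcfeval, and every record in this example is hypothetical.

```python
# A minimal sketch (not from the paper) of a variant-call concordance
# computation. Each call is reduced to a (chrom, pos, ref, alt) identity,
# and concordance is the fraction of the validated CPU pipeline's calls
# that the GPU pipeline reproduces. All records here are hypothetical.

def call_key(record):
    """Reduce a variant record to a hashable identity."""
    return (record["chrom"], record["pos"], record["ref"], record["alt"])

def concordance(cpu_calls, gpu_calls):
    """Fraction of CPU (reference) calls also emitted by the GPU pipeline."""
    cpu = {call_key(r) for r in cpu_calls}
    gpu = {call_key(r) for r in gpu_calls}
    return len(cpu & gpu) / len(cpu) if cpu else 1.0

# Toy usage with two hypothetical call sets:
cpu_calls = [{"chrom": "chr1", "pos": 1000, "ref": "A", "alt": "G"},
             {"chrom": "chr1", "pos": 2000, "ref": "C", "alt": "T"}]
gpu_calls = [{"chrom": "chr1", "pos": 1000, "ref": "A", "alt": "G"}]
print(f"concordance: {concordance(cpu_calls, gpu_calls):.1%}")  # 50.0%
```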

    Dynamic Hardware Resource Management for Efficient Throughput Processing.

    High performance computing is evolving at a rapid pace, with throughput-oriented processors, such as graphics processing units (GPUs), substituting for traditional processors as the computational workhorse. Their adoption has seen a tremendous increase as they provide high peak performance and energy efficiency while maintaining a friendly programming interface. Furthermore, many existing desktop, laptop, tablet, and smartphone systems support accelerating non-graphics, data parallel workloads on their GPUs. However, the multitude of systems that use GPUs as an accelerator run different genres of data parallel applications, which have significantly contrasting runtime characteristics. GPUs use thousands of identical threads to efficiently exploit the on-chip hardware resources. Therefore, if one thread uses a resource (compute, bandwidth, data cache) more heavily, there will be significant contention for that resource. This contention will eventually saturate the performance of the GPU due to contention for the bottleneck resource, leaving other resources underutilized at the same time. Traditional policies for managing the massive hardware resources work adequately on well-designed, traditional scientific-style applications. However, these static policies, which are oblivious to the application’s resource requirements, are not efficient for the large spectrum of data parallel workloads with varying resource requirements. Therefore, several standard hardware policies, such as using maximum concurrency, fixed operational frequency, and round-robin style scheduling, are not efficient for modern GPU applications. This thesis defines dynamic hardware resource management mechanisms which improve the efficiency of the GPU by regulating the hardware resources at runtime. The first step in successfully achieving this goal is to make the hardware aware of the application’s characteristics at runtime through novel counters and indicators. After this detection, dynamic hardware modulation provides opportunities for increased performance, improved energy consumption, or both, leading to efficient execution. The key mechanisms for modulating the hardware at runtime are dynamic frequency regulation, managing the amount of concurrency, managing the order of execution among different threads, and increasing cache utilization. The resultant increased efficiency will lead to improved energy consumption of the systems that utilize GPUs while maintaining or improving their performance.
    PhD, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/113356/1/asethia_1.pd
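
    To make the feedback idea concrete, the following is a minimal sketch under an assumed toy utilization model (the thesis's actual counters and mechanisms are hardware-level): a controller samples a simulated bandwidth counter each epoch and throttles or ramps the number of active threads.

```python
# A minimal sketch, assuming a toy utilization model rather than real
# hardware counters: one feedback loop in the spirit of dynamic concurrency
# management. When the (simulated) memory system saturates, extra threads
# only add contention, so the controller backs off; with headroom it ramps
# concurrency back up.

def read_bandwidth_utilization(active_threads, saturation_point=512):
    """Stand-in for a hardware counter: utilization grows with thread
    count until the memory system saturates (toy model, not real data)."""
    return min(1.0, active_threads / saturation_point)

def regulate(active_threads, max_threads=2048, step=64,
             high_water=0.95, low_water=0.80):
    """One control epoch: throttle on saturation, ramp up on headroom."""
    util = read_bandwidth_utilization(active_threads)
    if util >= high_water:                     # contention: back off
        return max(step, active_threads - step)
    if util <= low_water:                      # headroom: add threads
        return min(max_threads, active_threads + step)
    return active_threads                      # steady state

threads = 2048
for _ in range(30):                            # iterate until it settles
    threads = regulate(threads)
print("settled concurrency:", threads)         # 448 under this toy model
```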

    GPUs as Storage System Accelerators

    Massively multicore processors, such as Graphics Processing Units (GPUs), provide, at a comparable price, peak performance one order of magnitude higher than traditional CPUs. This drop in the cost of computation, as with any order-of-magnitude drop in the cost per unit of performance for a class of system components, creates the opportunity to redesign systems and to explore new ways to engineer them to recalibrate the cost-to-performance relation. This project explores the feasibility of harnessing GPUs' computational power to improve the performance, reliability, or security of distributed storage systems. In this context, we present the design of a storage system prototype that uses GPU offloading to accelerate a number of computationally intensive primitives based on hashing, and introduce techniques to efficiently leverage the processing power of GPUs. We evaluate the performance of this prototype under two configurations: as a content addressable storage system that facilitates online similarity detection between successive versions of the same file, and as a traditional system that uses hashing to preserve data integrity. Further, we evaluate the impact of offloading to the GPU on competing applications' performance. Our results show that this technique can bring tangible performance gains without negatively impacting the performance of concurrently running applications.
    Comment: IEEE Transactions on Parallel and Distributed Systems, 201
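
    To illustrate the primitives in play, the sketch below shows, in plain Python, the kind of chunk-hashing computation such a prototype would offload to the GPU; the chunk size and hash function are assumptions for illustration, not details taken from the paper.

```python
# A minimal sketch (assumed detail, not the paper's implementation) of the
# chunk-hashing primitive such a system offloads to the GPU, shown here on
# the CPU with hashlib. Fixed-size chunks are hashed individually; two
# versions of a file reveal their similarity wherever chunk hashes match.

import hashlib

CHUNK_SIZE = 4096  # bytes; fixed-size chunking kept deliberately simple

def chunk_hashes(data, chunk_size=CHUNK_SIZE):
    """Hash every fixed-size chunk of a file's contents."""
    return [hashlib.sha1(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)]

def similarity(old, new):
    """Fraction of the new version's chunks already present in the old."""
    old_hashes = set(chunk_hashes(old))
    new_hashes = chunk_hashes(new)
    if not new_hashes:
        return 1.0
    return sum(h in old_hashes for h in new_hashes) / len(new_hashes)

v1 = b"A" * 16384                 # toy "file": four identical 4 KiB chunks
v2 = b"A" * 12288 + b"B" * 4096   # new version with the last chunk changed
print(f"similarity: {similarity(v1, v2):.0%}")  # 75%
```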

    Performance Analysis of a Novel GPU Computation-to-core Mapping Scheme for Robust Facet Image Modeling

    Though the GPGPU concept is well known in image processing, much work remains to fully exploit GPUs as an alternative computation engine. This paper investigates computation-to-core mapping strategies to probe the efficiency and scalability of the robust facet image modeling algorithm on GPUs. Our fine-grained computation-to-core mapping scheme shows a significant performance gain over the standard pixel-wise mapping scheme. With in-depth performance comparisons across the two mapping schemes, we analyze the impact of the level of parallelism on GPU computation and suggest two principles for optimizing future image processing applications on the GPU platform.
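
    The difference between the two schemes is easiest to see in code. The sketch below is a stand-in, not the paper's implementation: it substitutes a simple window mean for the facet model fit and shows how the two mappings carve up the same work at different granularities.

```python
# A toy stand-in (not the paper's kernels): the facet computation is
# replaced by a K x K window mean so the two decompositions are easy to
# see. On a GPU the loops below become thread indices; only the mapping
# structure is the point here.

import numpy as np

K = 5                                  # facet window size
img = np.random.rand(64, 64)
padded = np.pad(img, K // 2, mode="edge")

def pixel_wise(y, x):
    """Pixel-wise mapping: one logical thread per pixel, and that thread
    performs all K*K window operations itself (coarse granularity)."""
    return padded[y:y + K, x:x + K].mean()

def fine_grained(y, x):
    """Fine-grained mapping: K*K logical threads cooperate on one pixel,
    one per window element, followed by a reduction (on a GPU, a parallel
    reduction); this exposes K*K-fold more parallelism per pixel."""
    partials = [padded[y + dy, x + dx]          # one "thread" per (dy, dx)
                for dy in range(K) for dx in range(K)]
    return sum(partials) / (K * K)

assert np.isclose(pixel_wise(10, 20), fine_grained(10, 20))
```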

    Study of combining GPU/FPGA accelerators for high-performance computing

    This contribution presents the performance modeling of a super desktop with GPU and FPGA accelerators, using OpenCL for the GPU and a high-level synthesis compiler for the FPGAs. The performance model is used to evaluate the different high-level synthesis optimizations, taking into account resource usage, and to compare the compute power of the FPGA with that of the GPU.
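
    The abstract does not state the model's form, but a common starting point for this kind of GPU-versus-FPGA comparison is a roofline-style bound, sketched below with invented peak figures.

```python
# A minimal sketch, assuming a roofline-style model (the abstract does not
# give the model's actual form): predicted kernel time is bounded by
# compute throughput or memory bandwidth, whichever saturates first. All
# peak numbers below are made up for illustration.

def predicted_time(flops, bytes_moved, peak_gflops, peak_gbps):
    """Time (s) = max(compute-bound time, bandwidth-bound time)."""
    compute_time = flops / (peak_gflops * 1e9)
    memory_time = bytes_moved / (peak_gbps * 1e9)
    return max(compute_time, memory_time)

# Hypothetical kernel: 2 GFLOP of work, 8 GB of memory traffic.
flops, bytes_moved = 2e9, 8e9
gpu_t = predicted_time(flops, bytes_moved, peak_gflops=4000, peak_gbps=320)
fpga_t = predicted_time(flops, bytes_moved, peak_gflops=500, peak_gbps=34)
print(f"GPU: {gpu_t * 1e3:.1f} ms   FPGA: {fpga_t * 1e3:.1f} ms")
```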