22,212 research outputs found

    Automatic Parallelization of Database Queries

    Get PDF
    Although automatic parallelization of conventional language programs is now widely accepted, relatively little emphasis has been placed on automatic parallelization of database query programs (sometimes referred to as “multiple queries” ). In this paper, we discuss the unique problems associated with automatic parallelization of database programs. From this discussion, we derive a complete approach to automatic parallelization of database programs. Beside integrating a number of existing techniques, our approach relies heavily on several new concepts, including the concepts of “algorithm-level” analysis and hybrid static/dynamic scheduling

    Adding Automatic Parallelization to Faust

    Get PDF
    International audienceFaust 0.9.9.5 introduces new compilation options to do automatic parallelization of code using OpenMP. This paper explains how the automatic parallelization is done and presents some benchmarks

    Dynamic Trace-Based Data Dependency Analysis for Parallelization of C Programs

    Get PDF
    Writing parallel code is traditionally considered a difficult task, even when it is tackled from the beginning of a project. In this paper, we demonstrate an innovative toolset that faces this challenge directly. It provides the software developers with profile data and directs them to possible top-level, pipeline-style parallelization opportunities for an arbitrary sequential C program. This approach is complementary to the methods based on static code analysis and automatic code rewriting and does not impose restrictions on the structure of the sequential code or the parallelization style, even though it is mostly aimed at coarse-grained task-level parallelization. The proposed toolset has been utilized to define parallel code organizations for a number of real-world representative applications and is based on and is provided as free source

    Workload-aware Automatic Parallelization for Multi-GPU DNN Training

    Full text link
    Deep neural networks (DNNs) have emerged as successful solutions for variety of artificial intelligence applications, but their very large and deep models impose high computational requirements during training. Multi-GPU parallelization is a popular option to accelerate demanding computations in DNN training, but most state-of-the-art multi-GPU deep learning frameworks not only require users to have an in-depth understanding of the implementation of the frameworks themselves, but also apply parallelization in a straight-forward way without optimizing GPU utilization. In this work, we propose a workload-aware auto-parallelization framework (WAP) for DNN training, where the work is automatically distributed to multiple GPUs based on the workload characteristics. We evaluate WAP using TensorFlow with popular DNN benchmarks (AlexNet and VGG-16), and show competitive training throughput compared with the state-of-the-art frameworks, and also demonstrate that WAP automatically optimizes GPU assignment based on the workload's compute requirements, thereby improving energy efficiency.Comment: This paper is accepted in ICASSP201
    • 

    corecore