2,582 research outputs found

    Interstellar: Using Halide's Scheduling Language to Analyze DNN Accelerators

    Full text link
    We show that DNN accelerator micro-architectures and their program mappings represent specific choices of loop order and hardware parallelism for computing the seven nested loops of DNNs, which enables us to create a formal taxonomy of all existing dense DNN accelerators. Surprisingly, the loop transformations needed to create these hardware variants can be precisely and concisely represented by Halide's scheduling language. By modifying the Halide compiler to generate hardware, we create a system that can fairly compare these prior accelerators. As long as proper loop blocking schemes are used, and the hardware can support mapping replicated loops, many different hardware dataflows yield similar energy efficiency with good performance. This is because the loop blocking can ensure that most data references stay on-chip with good locality and the processing units have high resource utilization. How resources are allocated, especially in the memory system, has a large impact on energy and performance. By optimizing hardware resource allocation while keeping throughput constant, we achieve up to 4.2X energy improvement for Convolutional Neural Networks (CNNs), 1.6X and 1.8X improvement for Long Short-Term Memories (LSTMs) and multi-layer perceptrons (MLPs), respectively.Comment: Published as a conference paper at ASPLOS 202

    A Survey of FPGA Optimization Methods for Data Center Energy Efficiency

    Get PDF
    This article provides a survey of academic literature about field programmable gate array (FPGA) and their utilization for energy efficiency acceleration in data centers. The goal is to critically present the existing FPGA energy optimization techniques and discuss how they can be applied to such systems. To do so, the article explores current energy trends and their projection to the future with particular attention to the requirements set out by the European Code of Conduct for Data Center Energy Efficiency. The article then proposes a complete analysis of over ten years of research in energy optimization techniques, classifying them by purpose, method of application, and impacts on the sources of consumption. Finally, we conclude with the challenges and possible innovations we expect for this sector.Comment: Accepted for publication in IEEE Transactions on Sustainable Computin

    Real-Time Big Data: the JUNIPER Approach

    Get PDF
    REACTION 2014. 3rd International Workshop on Real-time and Distributed Computing in Emerging Applications. Rome, Italy. December 2nd, 2014.Cloud computing offers the possibility for Cyber-Physical Systems (CPS) to offload computation and utilise large stored data sets in order to increase the overall system utility. However, for cloud platforms and applications to be effective for CPS, they need to exhibit real-time behaviour so that some level of performance can be guaranteed to the CPS. This paper considers the infrastructure developed by the EU JUNIPER project for enabling real-time big data systems to be built so that appropriate guarantees can be given to the CPS components. The technologies developed include a real-time Java programming approach, hardware acceleration to provide performance, and operating system resource manage-ment (time and disk) based upon resource reservation in order to enhance timeliness.This work is partially funded by the European Union’s Seventh Framework Programme under grant agreement FP7-ICT-611731Publicad
    • …
    corecore