Interstellar: Using Halide's Scheduling Language to Analyze DNN Accelerators
We show that DNN accelerator micro-architectures and their program mappings
represent specific choices of loop order and hardware parallelism for computing
the seven nested loops of DNNs, which enables us to create a formal taxonomy of
all existing dense DNN accelerators. Surprisingly, the loop transformations
needed to create these hardware variants can be precisely and concisely
represented by Halide's scheduling language. By modifying the Halide compiler
to generate hardware, we create a system that can fairly compare these prior
accelerators. As long as proper loop blocking schemes are used, and the
hardware can support mapping replicated loops, many different hardware
dataflows yield similar energy efficiency with good performance. This is
because the loop blocking can ensure that most data references stay on-chip
with good locality and the processing units have high resource utilization. How
resources are allocated, especially in the memory system, has a large impact on
energy and performance. By optimizing hardware resource allocation while
keeping throughput constant, we achieve up to 4.2X energy improvement for
Convolutional Neural Networks (CNNs), 1.6X and 1.8X improvement for Long
Short-Term Memories (LSTMs) and multi-layer perceptrons (MLPs), respectively.
Comment: Published as a conference paper at ASPLOS 202
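The abstract above refers to the seven nested loops of a DNN layer and to loop blocking. As a rough illustration only, the sketch below writes a dense convolution as seven loops and then blocks two of them. The loop names (batch N, output channels K, input channels C, output rows Y, output columns X, filter rows R, filter columns S), the function names, and the block sizes `kb` and `cb` are assumptions made for this example. It is plain Python, not Halide, and not the authors' implementation.

```python
def zeros(N, K, Y, X):
    """Allocate an N x K x Y x X output tensor of integer zeros."""
    return [[[[0] * X for _ in range(Y)] for _ in range(K)] for _ in range(N)]


def conv_naive(inp, wgt, N, K, C, Y, X, R, S):
    """Dense convolution written as seven nested loops (stride 1, no padding).

    inp has shape N x C x (Y+R-1) x (X+S-1); wgt has shape K x C x R x S.
    """
    out = zeros(N, K, Y, X)
    for n in range(N):                          # batch
        for k in range(K):                      # output channels
            for c in range(C):                  # input channels
                for y in range(Y):              # output rows
                    for x in range(X):          # output columns
                        for r in range(R):      # filter rows
                            for s in range(S):  # filter columns
                                out[n][k][y][x] += (
                                    inp[n][c][y + r][x + s] * wgt[k][c][r][s]
                                )
    return out


def conv_blocked(inp, wgt, N, K, C, Y, X, R, S, kb, cb):
    """Same computation with the K and C loops split into blocks of kb and cb.

    The outer k0/c0 loops walk over blocks; the inner k/c loops stay inside
    one block, so a small slice of the weights is reused across all y and x
    before the next block is touched.
    """
    out = zeros(N, K, Y, X)
    for n in range(N):
        for k0 in range(0, K, kb):              # block of output channels
            for c0 in range(0, C, cb):          # block of input channels
                for k in range(k0, min(k0 + kb, K)):
                    for c in range(c0, min(c0 + cb, C)):
                        for y in range(Y):
                            for x in range(X):
                                for r in range(R):
                                    for s in range(S):
                                        out[n][k][y][x] += (
                                            inp[n][c][y + r][x + s]
                                            * wgt[k][c][r][s]
                                        )
    return out
```

Both functions perform the same multiply-accumulate operations; only the order in which they visit the iteration space differs. That is the sense in which a choice of loop order and blocking changes data locality without changing the result.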
A Survey of FPGA Optimization Methods for Data Center Energy Efficiency
This article surveys the academic literature on field-programmable gate
arrays (FPGAs) and their use for energy-efficient acceleration in data
centers. The goal is to critically present the existing
FPGA energy optimization techniques and discuss how they can be applied to such
systems. To do so, the article explores current energy trends and their
projection to the future with particular attention to the requirements set out
by the European Code of Conduct for Data Center Energy Efficiency. The article
then proposes a complete analysis of over ten years of research in energy
optimization techniques, classifying them by purpose, method of application,
and impacts on the sources of consumption. Finally, we conclude with the
challenges and possible innovations we expect for this sector.
Comment: Accepted for publication in IEEE Transactions on Sustainable
Computin
Real-Time Big Data: the JUNIPER Approach
REACTION 2014. 3rd International Workshop on Real-time and Distributed Computing in Emerging Applications. Rome, Italy. December 2nd, 2014.
Cloud computing offers the possibility for Cyber-Physical Systems (CPS) to offload computation and utilise large stored data sets in order to increase the overall system utility. However, for cloud platforms and applications to be effective for CPS, they need to exhibit real-time behaviour so that some level of performance can be guaranteed to the CPS. This paper considers the infrastructure developed by the EU JUNIPER project for enabling real-time big data systems to be built so that appropriate guarantees can be given to the CPS components. The technologies developed include a real-time Java programming approach, hardware acceleration to provide performance, and operating system resource management (time and disk) based upon resource reservation in order to enhance timeliness.
This work is partially funded by the European Union’s Seventh Framework Programme under grant agreement FP7-ICT-611731.